Without attracting much attention, OpenAI, the company behind ChatGPT, withdrew last July an experimental tool it had created to detect whether a text had been written by an artificial intelligence. The software was called AI Classifier. The same company that unleashed the popularization of large language models capable of producing text like a human being warned that the resource was no longer available “due to its low accuracy rate.” It still has a perhaps more serious task pending: detecting AI-generated photos and videos.
Some of the steps the company has taken since introducing ChatGPT last November reveal a serious lack of control over parts of its product. AI Classifier responded to legitimate concerns. The educational sector is one where the need to know whether a text was produced by a student or a machine arose, but there are more cases than meet the eye. Book publishers around the world, for example, face the challenge of detecting machine-written texts, even complete works, if authors are willing to let the machine do the work for them.
Although there are companies that sell AI text detectors, they are not very reliable. To begin with, even the creators of models like GPT-4, OpenAI itself, cannot tell whether a text comes from a person or a machine. The company wants to create a kind of watermark, an invisible signal embedded in the text, that would allow it to detect what comes from the AI, but language models are known as “black boxes”: those who create them do not know exactly how they make their decisions. OpenAI researchers published a paper last May in which they admitted that “language models are becoming more capable and widespread, but we don’t understand how they work.”
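By way of illustration, and since OpenAI has not published how such a watermark would work, here is a minimal sketch in Python of the general idea behind academic “green list” proposals: the generator is nudged toward a secret, pseudo-random subset of words, and a detector later checks whether a text over-uses that subset. The seed, function names and threshold below are invented for the example and do not reflect any real product.

```python
# Illustrative sketch only: this is NOT OpenAI's method, which remains undisclosed.
# It follows the "green list" idea from academic watermarking proposals.
import hashlib

def is_green(word: str, seed: str = "demo-seed") -> bool:
    # Hash each word together with a secret seed; call it "green" if the hash is even.
    digest = hashlib.sha256((seed + word.lower()).encode()).hexdigest()
    return int(digest, 16) % 2 == 0

def watermark_score(text: str) -> float:
    # Fraction of words that fall in the green list: roughly 0.5 for ordinary text,
    # noticeably higher for text generated with a bias toward green words.
    words = text.split()
    if not words:
        return 0.0
    return sum(is_green(w) for w in words) / len(words)

sample = "The quick brown fox jumps over the lazy dog"
print(f"green-word fraction: {watermark_score(sample):.2f}")
# A real detector would turn this fraction into a statistical test, and, as later
# research has shown, could still be fooled by paraphrasing or spoofing.
```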
Despite the lack of progress, OpenAI notes that it is “investigating more efficient provenance techniques for text.” “We have committed to developing and deploying mechanisms that allow users to understand whether audio or visual content is generated by AI,” the company adds. AI Classifier was a solution launched amid the noise, when the educational community began to consider how the arrival of AI would change its methods. Until the end of the school year, assessments had not reckoned with the existence of such a powerful tool for faking results.
Last May, as reported by Rolling Stone magazine, several seniors at Texas A&M University were temporarily denied their diplomas after a professor, suspecting that they had used ChatGPT to write their assignments, asked the chatbot itself whether it had authored the texts.
These systems are not refined enough to detect such texts. OpenAI admitted that its AI Classifier was not “totally reliable”: it detected only 26% of machine-written text, while 9% of human writing was attributed to artificial intelligence.
Last June, researchers from the Department of Computer Science at the University of Maryland published a paper in which they demonstrated, “both empirically and theoretically, that various AI text detectors are unreliable in practical situations.” The research showed that even language models protected by watermarking schemes “may be vulnerable to spoofing attacks,” because a human can infer and add those signals so that their own writing is detected as AI-generated text, “potentially causing damage to the reputation of its developers.”
The authors state that the ability of language models “to produce quality text that is undetected will most likely increase in the future.” The machine may act deliberately to avoid detection. And since models can be asked to write in specific styles and with specific characteristics, catching the liar will take even longer.