Students having their essays written by artificial intelligence. Scammers using it to write phishing emails. Having stunned the internet with its ability to write human-like text, the ChatGPT AI was quickly put to fraudulent or malicious use.
New York schools have even banned the AI, created by the Californian startup OpenAI. But how can you tell whether a text was generated by ChatGPT? Is it even possible?
Clues visible on reading
Even without specialized software, certain turns of phrase can give you a clue: for example, the recurring presence of generic, impersonal words instead of rare words or expressions, such as "the," "it," or "is" in English texts.
The reason? Contrary to appearances, ChatGPT is not really a chatbot but an algorithm that calculates the most likely continuation of a text. If you ask it a question, it infers that the most likely continuation is an answer, but that answer is built from the words most likely to appear given its training data; hence repetitive, common words like "this" or "that," as MIT Technology Review explains.
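To make the "most likely continuation" idea concrete, here is a toy sketch in Python. The tiny bigram table below is an invented stand-in for a real model's training data, not anything ChatGPT actually uses; real models work on far larger contexts and vocabularies.

```python
from collections import Counter, defaultdict

# A tiny, hand-built corpus standing in for a model's training data (pure assumption).
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count bigrams: for each word, how often each next word follows it.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def most_likely_continuation(word):
    """Return the most frequent next word, the way a language model picks its continuation."""
    return bigrams[word].most_common(1)[0][0]

print(most_likely_continuation("the"))  # "cat": it follows "the" most often in the corpus
```

The same mechanism, scaled up to billions of parameters and whole conversations as context, is what makes ChatGPT's answers both plausible and built from statistically common words.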
Another hint: since ChatGPT works with probabilities, if several people ask it the exact same question, it will generate the same answer, the most probable one, for each of them, give or take a few details. A good tip for teachers: if they are grading several papers that look strangely alike, with the same grammatical constructions, the same reasoning, the same examples, those papers may have been generated by an algorithm. This is what caught the eye of a professor in Lyon, half of whose master's students had used ChatGPT to write their papers.
But unlike many humans, ChatGPT makes no mistakes in French. If the text you are reading contains agreement or grammar errors, there is a higher chance it was written by a human.
Detection software
As if to ride the explosion of ChatGPT, more and more sites claim to detect the origin of a text with impressive accuracy. But without an explanation of their method, the promises often sound too good to be true.
One of the most transparent and popular tools is GPTZero. The site, developed by computer science student Edward Tian over his Christmas break, relies on an approach already used against earlier AIs: if an algorithm created a text, a similar algorithm can learn to recognize it.
To tell you whether a text comes from an AI or a human, GPTZero runs it through a model older than ChatGPT, called GPT-2. "It estimates 'perplexity': does GPT-2 find the text familiar? Or is it surprised by sentence lengths or expressions that don't match the probabilities it has learned?" Edward Tian explains to Tech&Co.
So all you have to do is paste your text on the site and hit enter: if the perplexity is high, the text is more likely to be human-written. Added to this is "burstiness," which measures how much that perplexity varies across the text: sentence lengths in an AI-generated text vary little, while human sentences are more irregular.
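The two signals can be sketched in a few lines. To keep the example self-contained, the snippet below is not GPTZero's code: it scores perplexity against a simple word-frequency model instead of GPT-2, and measures burstiness as the spread of sentence lengths. Only the two names come from the article; the implementation is an illustrative assumption.

```python
import math
from collections import Counter

def unigram_model(reference_text):
    """Word probabilities from a reference text (a crude stand-in for GPT-2's learned probabilities)."""
    counts = Counter(reference_text.lower().split())
    total = sum(counts.values())
    # Give unseen words a tiny floor probability so log() is always defined.
    return lambda w: counts.get(w, 0.01) / total

def perplexity(text, prob):
    """Average per-word surprise: low means the model finds the text familiar."""
    words = text.lower().split()
    return math.exp(-sum(math.log(prob(w)) for w in words) / len(words))

def burstiness(text):
    """Standard deviation of sentence lengths: human writing tends to vary more."""
    lengths = [len(s.split()) for s in text.split(".") if s.strip()]
    mean = sum(lengths) / len(lengths)
    return (sum((n - mean) ** 2 for n in lengths) / len(lengths)) ** 0.5

model = unigram_model("the cat sat on the mat " * 50)
print(perplexity("the cat sat on the mat", model))           # low: familiar wording
print(perplexity("quantum flux harmonics resonate", model))  # high: surprising wording
```

A detector combines both scores: text that is uniformly low-perplexity and low-burstiness looks machine-generated, while surprising, unevenly paced text looks human.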
No quick fix
But GPTZero cannot confirm with 100% certainty that a text comes from a human or a machine, for a simple reason: that is currently impossible.
All current detection tools are imperfect and make mistakes more or less often. For example, we asked ChatGPT to write like a 4-year-old (in English): the software alternated between classic sentences and shorter, exclamatory ones, a text varied enough to fool GPTZero.
It is also easy to manually add errors to an AI-written text, or to rephrase it slightly to make it undetectable. The process can even be automated: on Twitter, a computer scientist explains that he built a program that adds invisible spaces in the middle of certain words, turning them into words unknown to GPTZero, which is thrown off and classifies the text as a human creation.
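The invisible-space trick can be reproduced in a few lines. This is a sketch of the general idea, not the actual program from the Twitter thread: it inserts a zero-width space (U+200B), a Unicode character that renders as nothing, into the middle of longer words so that a detector's tokenizer no longer recognizes them.

```python
ZWSP = "\u200b"  # zero-width space: displays as nothing, but changes the underlying string

def sabotage(text, min_len=5):
    """Insert an invisible character mid-word so a detector sees unknown words."""
    words = []
    for word in text.split(" "):
        if len(word) >= min_len:
            mid = len(word) // 2
            word = word[:mid] + ZWSP + word[mid:]
        words.append(word)
    return " ".join(words)

altered = sabotage("This essay was generated automatically")
print(altered)                                              # looks identical on screen
print(altered == "This essay was generated automatically")  # False: the strings differ
```

To a human reader the output is indistinguishable from the original, but to software comparing strings against learned probabilities, "gene​rated" is simply an unknown word.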
Finally, GPTZero is less effective in languages other than English, because GPT-2 was mainly trained on English texts.
Edward Tian is aware of these flaws. He points out that his model is currently designed to detect academic cheating; it matters little if it lets through texts written in the style of a 4-year-old.
"Marking" AI-generated texts upstream
These detection methods have another shortcoming: "To calculate an algorithm's perplexity over a text, you need a lot of information about the algorithm in question," Edward Tian tells Tech&Co. "The model itself, the parameters, the weights…"
GPTZero works thanks to GPT-2, a model released by OpenAI in 2019 but since largely surpassed by ChatGPT. "If other companies create even more sophisticated but non-transparent AIs, detection could become much more complicated," Edward Tian admits to Tech&Co.
That is why one strategy would be to make AI-generated texts clearly identifiable from the moment they are created, by adding a distinctive sign that leaves no room for doubt.
This is the strategy OpenAI is pursuing: the company intends to tweak its upcoming algorithms so that "whenever GPT generates a long text, there is an imperceptible secret signal in its word choices, which you can later use to prove that, yes, this came from GPT," according to researcher Scott Aaronson, who recently joined the startup. On campus as on the internet, the hunt has only just begun.
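Aaronson has described the scheme only at a high level, and OpenAI has published no implementation. A common way to illustrate the general idea (a purely hypothetical sketch, not OpenAI's method) is to bias the generator's word choices toward a secret, pseudorandomly chosen "green list"; a verifier holding the same key can later count how green a text is.

```python
import hashlib

SECRET_KEY = "hypothetical-shared-secret"  # assumption: known only to the generator and the verifier

def is_green(word):
    """Pseudorandomly assign each word to a secret 'green list' covering roughly half the vocabulary."""
    digest = hashlib.sha256((SECRET_KEY + word.lower()).encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text):
    """A watermarked generator would favor green words, pushing this fraction well above 0.5."""
    words = text.split()
    return sum(is_green(w) for w in words) / len(words)

# Ordinary text should hover near 0.5; a biased generator's output would score much higher.
print(green_fraction("the quick brown fox jumps over the lazy dog"))
```

The appeal of such a signal is that it is statistical: invisible in any single sentence, but detectable with high confidence over a long text, provided you hold the key.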
Source: BFM TV
