How to Detect AI-Generated Text, According to Researchers

On Feb 8, 2023

AI-generated text, from tools like ChatGPT, is starting to impact daily life. Teachers are testing it out as part of classroom lessons. Marketers are champing at the bit to replace their interns. Memers are going buck wild. Me? It would be a lie to say I’m not a little anxious about the robots coming for my writing gig. (ChatGPT, luckily, can’t hop on Zoom calls and conduct interviews just yet.)

With generative AI tools now publicly accessible, you’ll likely encounter more synthetic content while surfing the web. Some instances might be benign, like an auto-generated BuzzFeed quiz about which deep-fried dessert matches your political beliefs. (Are you a Democratic beignet or a Republican zeppole?) Other instances could be more sinister, like a sophisticated propaganda campaign from a foreign government.

Academic researchers are looking into ways to detect whether a string of words was generated by a program like ChatGPT. Right now, what’s a decisive indicator that whatever you’re reading was spun up with AI assistance?

A lack of surprise.

Entropy, Evaluated

Algorithms with the ability to mimic the patterns of natural writing have been around for a few more years than you might realize. In 2019, Harvard and the MIT-IBM Watson AI Lab released an experimental tool that scans text and highlights words based on their level of randomness.

Why would this be helpful? An AI text generator is fundamentally a mystical pattern machine: superb at mimicry, weak at throwing curve balls. Sure, when you type an email to your boss or send a group text to some friends, your tone and cadence may feel predictable, but there’s an underlying capricious quality to our human style of communication.

Edward Tian, a student at Princeton, went viral earlier this year with a similar experimental tool called GPTZero, targeted at educators. It gauges the likelihood that a piece of content was generated by ChatGPT based on its “perplexity” (aka randomness) and “burstiness” (aka variance). OpenAI, which is behind ChatGPT, dropped another tool made to scan text that’s over 1,000 characters long and make a judgment call. The company is up-front about the tool’s limitations, like false positives and limited efficacy outside English. Just as English-language data is often of the highest priority to those behind AI text generators, most tools for AI-text detection are currently best suited to benefit English speakers.
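To make those two terms concrete, here is a minimal Python sketch. It is not GPTZero's actual method, which isn't spelled out here. It assumes a toy word-frequency model with add-one smoothing as a stand-in for a real language model, and the function names (train_unigram, perplexity, burstiness) are hypothetical. Perplexity measures how surprised the model is by a passage; burstiness is taken here to be how much that surprise varies from sentence to sentence.

```python
import math
from collections import Counter
from statistics import pstdev


def train_unigram(corpus_tokens):
    """Build a toy word-frequency model (assumption: stands in for a real LM)."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for any unseen word

    def prob(word):
        # Add-one smoothing so unseen words get a small, nonzero probability.
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """Exponential of the average negative log-probability per word."""
    avg_neg_log = -sum(math.log(prob(t)) for t in tokens) / len(tokens)
    return math.exp(avg_neg_log)


def burstiness(sentences, prob):
    """Spread (population std. dev.) of per-sentence perplexity scores."""
    scores = [perplexity(s.split(), prob) for s in sentences if s.split()]
    return pstdev(scores) if scores else 0.0


if __name__ == "__main__":
    model = train_unigram("the cat sat on the mat the dog sat on the rug".split())
    print(perplexity("the cat sat".split(), model))
    print(perplexity("zebra quantum waffle".split(), model))
    print(burstiness(["the cat sat", "zebra quantum waffle"], model))
```

Under these assumptions, a passage built from words the model has seen often scores a low perplexity, and a passage whose sentences all score alike has a burstiness near zero. Real detectors lean on far larger models, but the intuition carries over: surprise that is low and evenly spread hints at a machine.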

Could you sense if a news article was composed, at least in part, by AI? “These AI generative texts, they can never do the job of a journalist like you, Reece,” says Tian. It’s a kind-hearted sentiment. CNET, a tech-focused website, published multiple articles written by algorithms and dragged across the finish line by a human. ChatGPT, for the moment, lacks a certain chutzpah, and it occasionally hallucinates, which could be an issue for reliable reporting. Everyone knows qualified journalists save the psychedelics for after-hours.

Entropy, Imitated

While these detection tools are helpful for now, Tom Goldstein, a computer science professor at the University of Maryland, sees a future where they become less effective, as natural language processing grows more sophisticated. “These kinds of detectors rely on the fact that there are systematic differences between human text and machine text,” says Goldstein. “But the goal of these companies is to make machine text that is as close as possible to human text.” Does this mean all hope of synthetic media detection is lost? Absolutely not.

Goldstein worked on a recent paper researching possible watermark methods that could be built into the large language models powering AI text generators. It’s not foolproof, but it’s a fascinating idea. Remember, ChatGPT tries to predict the next likely word in a sentence and compares multiple options during the process. A watermark might be able to designate certain word patterns to be off-limits for the AI text generator. So, when the text is scanned and the watermark rules are broken multiple times, it indicates a human being likely banged out that masterpiece.
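The watermark idea can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the scheme from Goldstein's paper: the off-limits rule here hashes each pair of adjacent words and bans roughly half of all possible next words, the "generator" picks words at random instead of using a language model, and every name (is_off_limits, generate, violation_rate) is hypothetical.

```python
import hashlib
import random


def is_off_limits(prev_word, word):
    """Assumed rule: hash the word pair; odd first byte means 'banned here'."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 1


def generate(vocab, length, seed=0):
    """Toy watermarked generator: never picks a banned next word if it can avoid it."""
    rng = random.Random(seed)
    words = ["<start>"]
    while len(words) <= length:
        allowed = [w for w in vocab if not is_off_limits(words[-1], w)]
        # Fall back to the full vocabulary only if every option is banned.
        words.append(rng.choice(allowed or vocab))
    return words[1:]


def violation_rate(words):
    """Detector: fraction of words that break the watermark rule."""
    if not words:
        return 0.0
    prev, broken = "<start>", 0
    for w in words:
        broken += is_off_limits(prev, w)
        prev = w
    return broken / len(words)


if __name__ == "__main__":
    vocab = ("robot human text word model paper test rule mark water "
             "news story quiz essay email draft line page note idea").split()
    machine = generate(vocab, 200, seed=1)
    rng = random.Random(2)
    unmarked = [rng.choice(vocab) for _ in range(200)]  # stand-in for human text
    print("watermarked:", violation_rate(machine))
    print("unmarked:   ", violation_rate(unmarked))
```

Text written without knowledge of the rule should break it about half the time, while the watermarked generator almost never does. That gap is what a scanner would look for, and it is also why the approach is not foolproof: light rewording by a person would push the violation rate back up.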