A College Kid Built an App That Sniffs Out Text Penned by AI

BOT COP

“Humans deserve to know when the writing isn’t human.”

Tony Ho Tran

Updated Jan. 04, 2023 6:26PM EST / Published Jan. 04, 2023 5:19PM EST

Edward Tian was fast asleep when his bot broke a website.

The 22-year-old senior at Princeton spent his winter break in his local coffee shop creating GPTZero, an app that he claimed would be able to “quickly and efficiently” tell if an essay was written by a human or by OpenAI’s ChatGPT. When he uploaded it to Streamlit, a platform for building and hosting apps, he didn’t expect it to get that much attention.

“I was expecting, at most, a few dozen people trying out the app,” Tian told The Daily Beast. “Suddenly, it was crazy in usership with over 2000 people signing up for the beta in a few hours.”

GPTZero eventually saw such a massive influx of users that it even crashed the platform that was hosting it. “I’m awestruck that it blew up and went so viral,” he added.

When OpenAI released ChatGPT on Nov. 30, 2022, it unleashed a digital Pandora’s Box on the world.

Everyone from high school teachers to college professors to journalists feared that the powerful AI chatbot had ushered in a new era of bot-generated essays and articles, a practice some have dubbed “AIgiarism.” Some educators have already begun reporting instances of students using ChatGPT to create essays out of whole cloth and finish writing assignments.

While OpenAI has said that it eventually plans to implement “watermarks” to verify whether something was created by ChatGPT, there’s still no official method of doing so, which can create a giant bot-sized headache across sectors such as education and journalism.

Tian, who’s pursuing a double major in computer science and journalism, was bothered by ethical dilemmas posed by chatbots as well as what he described as the “black box” nature of large language models like ChatGPT. The opaque nature of the models results in people fundamentally misunderstanding and, therefore, misusing them.

So, even though he is on the cusp of graduating, he decided to spend his winter break building a tool that could help people find out whether or not a piece of writing was likely written by a bot.

“Humans deserve to know when the writing isn’t human,” Tian said. “There’s so much hype around ChatGPT and AI generation lately, that humans deserve to know the truth.”

GPTZero uses two metrics to assess whether a text has been penned by a bot: perplexity and burstiness. Texts placed into the app are assigned a number for each metric. The lower the numbers, the higher the likelihood that the text was created by a bot.

Perplexity is a measurement of randomness in a sentence. If a sentence is constructed, or uses words, in a way that surprises the app, it will score higher in perplexity. Tian said that he used the free and open-source GPT-2 to help train his app for this metric.
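
To make the idea concrete, here is a minimal sketch of how a perplexity score can be computed. It is not GPTZero's code: the article says Tian used GPT-2, while this toy swaps in a simple word-frequency model with add-one smoothing so it runs with no downloads. The helper names `train_unigram` and `perplexity`, and the tiny reference corpus, are assumptions made up for illustration.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Count word frequencies in a reference corpus (toy stand-in for a real language model)."""
    return Counter(corpus_tokens), len(corpus_tokens)


def perplexity(tokens, counts, total):
    """Perplexity = exp of the average negative log-probability per token."""
    if not tokens:
        raise ValueError("need at least one token")
    vocab_size = len(counts) + 1  # +1 reserves one slot for any unseen word
    log_prob_sum = 0.0
    for tok in tokens:
        # Add-one (Laplace) smoothing: unseen words get a small nonzero probability.
        p = (counts[tok] + 1) / (total + vocab_size)
        log_prob_sum += math.log(p)
    return math.exp(-log_prob_sum / len(tokens))


# Hypothetical reference corpus, invented for this example.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
counts, total = train_unigram(corpus)

predictable = "the cat sat on the mat".split()
surprising = "quantum marmalade sang beneath turquoise algebra".split()

print(perplexity(predictable, counts, total))  # lower: every word is familiar to the model
print(perplexity(surprising, counts, total))   # higher: every word surprises the model
```

The sentence built from familiar words scores lower than the one full of words the model has never seen, which is the same direction the article describes: less surprise means lower perplexity.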

Burstiness is the quality of overall randomness across all the sentences in a text. For example, human writing tends to have sentences that vary in complexity. Some are simple. Some can give James Joyce a run for his money. Bots, on the other hand, tend to generate sentences that are relatively low in complexity throughout the entire text.
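
One plausible way to put a number on burstiness is to measure how widely per-sentence scores are spread. The sketch below uses the standard deviation for that. The article does not say which formula GPTZero uses, so both the choice of standard deviation and the `burstiness` function name are assumptions, and the sample scores are invented for illustration rather than measured from real text.

```python
import statistics


def burstiness(sentence_scores):
    """Spread of per-sentence scores, measured as the population standard deviation."""
    if len(sentence_scores) < 2:
        return 0.0  # a single sentence has no variation to measure
    return statistics.pstdev(sentence_scores)


# Hypothetical per-sentence perplexity scores, made up for this example.
varied_text = [12.0, 85.0, 20.0, 140.0]   # simple and complex sentences mixed together
uniform_text = [30.0, 32.0, 29.0, 31.0]   # every sentence about equally complex

print(burstiness(varied_text))
print(burstiness(uniform_text))
```

Under this assumption, the text that swings between simple and complex sentences scores as far burstier than the evenly paced one, matching the human-versus-bot contrast described above.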

“There are beautiful qualities of human written prose that computers can and should never co-opt,” Tian explained. As a journalism student, he was inspired by a class he took with American writer John McPhee who taught him about those beautiful qualities of human writing.

Tian would go on to use an essay by McPhee in The New Yorker as part of his demo for GPTZero.

Despite building the tool, Tian isn’t anti-AI. He believes that there’s a time and a place for AI tools if they are used ethically and with consent. Hell, he’s even used AI programs like Copilot to “support much of my coding.”

“I’m not opposed to using AI for writing when it makes sense,” he said.

With the hype and fears surrounding ChatGPT, a tool like Tian’s could prove incredibly useful across sectors, from educators who want to see if a student plagiarized an essay to job recruiters who want to check if a cover letter was actually written by an applicant. As such, it could also be incredibly lucrative to the right investors, some of whom have already reached out to Tian.

“Just in the past day, a bunch of VCs have slid in my Twitter DMs,” Tian said, including the likes of A16Z, Menlo Ventures, and Red Swan. But he’s not done with GPTZero quite yet. He wants to further refine and develop the app, and he even has plans to expand its transparency with “explainers and detection methodologies.”

And, at the end of the day, he’s a senior in college. He has finals looming, with homework and human-generated essays to worry about. Right now, that’s a much bigger concern than a digital Pandora’s Box or venture capitalists.

“I’m going to take all the calls, but for now,” he said with a laugh, “I’m just a college student focused on graduating from school.”
