A New ‘ChatGPT Detector’


The journal Nature has recently been covering how more and more researchers are using ChatGPT to write scientific papers, making it increasingly difficult for editors to distinguish AI-generated articles from human-written ones.

In a recent article, the journal cites a University of Kansas study, published in Cell Reports Physical Science, whose authors have created a ‘ChatGPT detector’ said to catch AI-generated papers with almost 98% accuracy.

Talking to the journal, the study’s co-author Heather Desaire, a chemist at the University of Kansas in Lawrence, said, “Most of the field of text analysis wants a really general detector that will work on anything. But by making a tool that focuses on a particular type of paper, we were really going after accuracy.”

The detector was trained on the introductory sections of papers from ten chemistry journals published by the American Chemical Society (ACS). The team chose the introduction because this section of a paper is fairly easy for ChatGPT to write if it has access to background literature, Desaire says. The researchers trained their tool on 100 published introductions to serve as human-written text, and then asked ChatGPT-3.5 to write 200 introductions in ACS journal style. For 100 of these, the tool was provided with the papers’ titles, and for the other 100, it was given their abstracts.
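To make the approach concrete, below is a minimal sketch of how a stylometric classifier along these lines could be set up. It is not the Kansas team's actual implementation: the feature set (sentence counts, sentence length, punctuation frequency) and the logistic-regression model are illustrative assumptions, and `human_intros` and `ai_intros` stand in for the human-written and ChatGPT-generated introductions described above.

```python
# Illustrative sketch of a stylometric "human vs. AI" classifier.
# The features and model below are assumptions for demonstration,
# not the features used in the Cell Reports Physical Science study.

import re
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


def stylometric_features(text: str) -> list[float]:
    """Turn one introduction into a small vector of writing-style features."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = text.split()
    n_sent = max(len(sentences), 1)
    n_words = max(len(words), 1)
    return [
        len(sentences),                      # number of sentences
        len(words) / n_sent,                 # mean sentence length in words
        text.count(",") / n_sent,            # commas per sentence
        text.count("(") / n_sent,            # parentheses per sentence
        sum(w[0].isupper() for w in words) / n_words,  # capitalised-word ratio
    ]


def train_detector(human_intros: list[str], ai_intros: list[str]) -> LogisticRegression:
    """Fit a simple classifier on labelled introductions and report held-out accuracy."""
    X = np.array([stylometric_features(t) for t in human_intros + ai_intros])
    y = np.array([0] * len(human_intros) + [1] * len(ai_intros))  # 0 = human, 1 = AI
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y
    )
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
    return clf
```

The key design choice this sketch mirrors is that the classifier looks only at surface features of writing style rather than at the words' meaning, which is what lets a narrowly scoped detector stay accurate on one type of document.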

When tested on introductions written by people and those generated by AI from the same journals, the tool identified ChatGPT-3.5-written sections based on titles with 100% accuracy. For the ChatGPT-generated introductions based on abstracts, the accuracy was slightly lower, at 98%. The tool worked just as well with text written by ChatGPT-4, the latest version of the chatbot. By contrast, the AI detector ZeroGPT identified AI-written introductions with an accuracy of only about 35–65%, depending on the version of ChatGPT used and whether the introduction had been generated from the title or the abstract of the paper. A text-classifier tool produced by OpenAI, the maker of ChatGPT, also performed poorly — it was able to spot AI-written introductions with an accuracy of around 10–55%.

Debora Weber-Wulff, a computer scientist who studies academic plagiarism at the HTW Berlin University of Applied Sciences, told Nature that the authors are doing “something fascinating”. Many existing tools try to determine authorship by searching for the predictive text patterns of AI-generated writing rather than by looking at features of writing style, she noted. “I’d never thought of using stylometrics on ChatGPT.”