Experts from NJIT, Carnegie Mellon find the pen is mightier than the data
New Jersey Institute of Technology
Plenty of researchers already study how to tell whether online writing bears the traits of artificial intelligence. Michael Laudenbach, at NJIT’s Jordan Hu College of Science and Liberal Arts, is studying the reverse: what traits indicate that digital prose was crafted by analog humans.
The risk of not knowing the difference, Laudenbach said, is that if instructors and business managers adopt AI writing tools too readily, children and college students might wrongly learn that AI writing is good writing. Even people who write badly, he noted, are at least writing their own original thoughts.
Laudenbach was working on his doctoral dissertation in corpus linguistics, the study of language through large, structured collections of text, when the generative AI wave washed into universities at the end of 2022. Large language models such as ChatGPT can produce text that looks impressive to a lay person, but Laudenbach was in the right field at the right time to analyze that text in ways few others would.
“With all of the fervor surrounding LLMs, we as researchers need to pause and consider what kinds of writing choices LLMs make, and what specific stylistic choices we risk showing students without reflecting on them. We don't want students to start picking up on the writing style of LLMs because right now, the research tells us that this looks very different from existing human writing,” he said. “So the big takeaway is, instructors and administrators beware — carefully consider which model you use and for which tasks. We can't let the AI hype cycle dictate how we expose students to these tools. We need a careful, evidence-based plan for writing pedagogy.”
Working with colleagues at Carnegie Mellon University, Laudenbach explained, “We generated something like 12,000 texts using six different LLMs and compared that to a corpus of human writing. It’s one of the largest collections of American English that a lot of researchers turn to in this field, and we used a long-established framework to tag linguistic features. So 66 linguistic features, everything from pronouns, type-token ratio, adverbs, participial phrases, lots of grammatical and functional categories. And we were basically able to distinguish between LLM-generated and human-generated output with surprising accuracy on a simple model that we trained.”
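The release does not name the team’s software, so what follows is an illustrative sketch of the general approach it describes: represent each text as a vector of linguistic feature rates, then train a simple classifier to separate human from LLM output. The four features, their simulated distributions, and the choice of logistic regression are assumptions made for demonstration; the actual study tagged 66 features from an established framework and used real corpora.

```python
# Illustrative sketch only (assumed feature set, simulated data).
# Each text becomes a feature vector; a simple linear model then
# separates human-written from LLM-generated texts.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_texts = 1000

# Hypothetical per-text rates: pronouns/100 words, nouns/100 words,
# adverbs/100 words, type-token ratio. Real work would compute these
# from tagged corpora rather than simulate them.
human = rng.normal(loc=[8.0, 22.0, 5.0, 0.55],
                   scale=[1.5, 2.0, 1.0, 0.05], size=(n_texts, 4))
llm = rng.normal(loc=[5.0, 28.0, 3.5, 0.48],
                 scale=[1.5, 2.0, 1.0, 0.05], size=(n_texts, 4))

X = np.vstack([human, llm])
y = np.array([0] * n_texts + [1] * n_texts)  # 0 = human, 1 = LLM

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = LogisticRegression().fit(X_train, y_train)
print(f"Held-out accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")
```

With feature distributions this well separated, even plain logistic regression classifies nearly perfectly, which is the point of the “surprising accuracy on a simple model” remark: the signal lives in the linguistic features, not in a complex model.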
Popular AI-checking applications are built by software developers, not linguists, and their approach is to ask whether a text is machine-generated. Laudenbach and his peers, drawing on specialized knowledge of human language, take the opposite tack: checking whether the text is human. They also wondered: could AI satisfactorily evaluate a college student’s writing, as a teaching assistant or writing center tutor would? Could it be good enough to do more substantive work than checking grammar, spelling and sentence structure?
“Spoilers: They can't! So far, it looks like they're wildly inconsistent, and even if they weren’t, would we really want AI doing those kinds of tasks?” Laudenbach said. Currently, “If you submit just what the LLM produced, it’s likely not going to be tuned to audience expectations, genre expectations or the rubric that we used in our research. Linguistically, it looks very different. A lot of LLM writing is more noun-heavy and informationally dense.”
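To make the “noun-heavy” observation concrete, here is a toy measurement, not the study’s tagging framework: the share of noun tags in a passage, computed with NLTK’s off-the-shelf tokenizer and part-of-speech tagger. The example sentences are invented for illustration.

```python
# Toy noun-density check (not the study's 66-feature framework).
# Requires: pip install nltk. Newer NLTK releases may name these
# resources "punkt_tab" and "averaged_perceptron_tagger_eng" instead.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def noun_ratio(text: str) -> float:
    """Fraction of tokens carrying a Penn Treebank noun tag (NN*)."""
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)
    return sum(tag.startswith("NN") for _, tag in tagged) / max(len(tokens), 1)

# Noun-stacked, informationally dense phrasing vs. a plainer register.
print(noun_ratio("Model output evaluation requires corpus feature analysis."))
print(noun_ratio("We looked at how the models write and compared that to people."))
```

A higher ratio on the first sentence reflects the kind of informational density the research flags in LLM prose.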
“LLMs are an extremely useful research tool for writing pedagogy, because we can use them to help highlight a lot of the features of human writing that make it so nuanced and dynamic. So it's not just about saying the LLMs could do this better, it's about asking what actually are we doing in writing? What are the latent linguistic features that we don't notice that distinguish human and machine writing?”
“It all comes at a time when I think that the LLMs make a better case for rhetoric and composition education, and a better case for discipline-specific writing classes,” Laudenbach said, “because I think in order to be able to evaluate the output, you have to have that knowledge to begin with.”