Page 70 - 2025S
UEC Int’l Mini-Conference No.54
Detecting Modified AI-Generated Academic Abstracts
Andrew TRUONG 1 and Akira UTSUMI 2
1 UEC Exchange Study Program (JUSST Program)
2 Artificial Intelligence eXploration Research Center
The University of Electro-Communications, Tokyo, Japan
Keywords: AI-generated content detection, Large language models (LLMs), Academic writing integrity,
Paraphrasing detection, Linguistic pattern analysis
Abstract
The rapid advancement of large language models has created unprecedented challenges for detecting
AI-generated content, particularly when modified through paraphrasing techniques to evade detection
systems. This study investigates the linguistic patterns that persist in AI-generated academic abstracts
even after systematic modification, focusing on citation behaviors and structural consistency across
multiple models and domains. We developed a comprehensive dataset of 4,000 academic abstracts
comprising 1,000 human-written samples from JSTOR and 3,000 AI-generated samples from four leading
models (Claude Sonnet 4, ChatGPT 4o mini, Gemini 2.5 Pro, and Copilot). The AI samples included
1,000 original abstracts, 1,000 QuillBot-paraphrased versions, and 1,000 manually modified texts across
30 academic topics spanning computer science, medicine, engineering, and other disciplines. Our
analysis framework employed five feature categories: citation patterns, syntactic complexity, lexical diversity,
consistency patterns, and academic discourse markers. Additionally, all AI models demonstrated
identical structural templates regardless of topic domain, with 85% phrase overlap across different academic
fields. AI abstracts consistently followed a formulaic 6-step pattern using expressions like “Recent
studies show that...” and “Previous research has established...” despite topic variation. These findings
suggest that citation analysis provides a highly effective detection method for AI-generated academic
content, while structural consistency patterns offer robust features resistant to paraphrasing
modifications. The results have significant implications for academic integrity systems and automated content
verification in scholarly publishing.
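
As an illustration of the structural-consistency and lexical-diversity features named above, the following is a minimal sketch and not the authors' implementation. The abstract does not state how phrase overlap was measured, so the sketch assumes a Jaccard overlap of word trigrams. All function names are hypothetical, and the marker list contains only the two expressions quoted in the abstract.

```python
import re
from itertools import combinations

# Assumption: only these two phrases are quoted in the abstract; a real
# marker list would be longer and is not given in the source.
FORMULAIC_MARKERS = (
    "recent studies show that",
    "previous research has established",
)


def tokenize(text):
    """Lowercase the text and return its word tokens (punctuation dropped)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n=3):
    """Return the set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def phrase_overlap(text_a, text_b, n=3):
    """Jaccard overlap of word n-grams between two texts, from 0.0 to 1.0.

    Assumption: the paper's overlap metric is unspecified; Jaccard on
    trigrams is used here purely for illustration.
    """
    a, b = ngrams(tokenize(text_a), n), ngrams(tokenize(text_b), n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def mean_pairwise_overlap(texts, n=3):
    """Average phrase overlap over all pairs of texts (e.g. across topics)."""
    pairs = list(combinations(texts, 2))
    if not pairs:
        return 0.0
    return sum(phrase_overlap(a, b, n) for a, b in pairs) / len(pairs)


def type_token_ratio(text):
    """Simple lexical-diversity measure: unique tokens / total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def marker_count(text):
    """Count occurrences of the formulaic markers in a text."""
    normalized = " ".join(tokenize(text))
    return sum(normalized.count(marker) for marker in FORMULAIC_MARKERS)
```

Under these assumptions, abstracts generated from a shared template on different topics would yield a high `mean_pairwise_overlap`, while independently written abstracts would score near zero.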