In recent years, large language models (LLMs) have advanced to the point where their generated text can be indistinguishable from human-written content. This breakthrough has enabled remarkable applications but also introduced serious challenges, including an increased potential for misuse such as disinformation and targeted phishing. While various detectors claim to identify AI-generated text with high accuracy, few have been rigorously evaluated on a challenging, standardized dataset.
We are thrilled to feature Liam Dugan, the lead author of the influential paper, "RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors."
Liam’s work addresses this gap by introducing RAID, the largest and most comprehensive dataset for evaluating the robustness of machine-generated text detectors. Comprising over 6 million examples spanning multiple domains, decoding strategies, and adversarial attacks, RAID serves as a benchmark for detecting machine-generated text even under conditions designed to “fool” detection systems. In this session, Liam shares insights on RAID’s design, the evaluation of a range of detectors, and his team’s perspective on the current and future landscape of text detection.
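To make the evaluation setup concrete, here is a minimal sketch of the kind of protocol RAID standardizes: score texts with a detector, calibrate a decision threshold on human-written text at a fixed false positive rate (the RAID paper evaluates detectors at a 5% FPR), and measure detection accuracy on machine-generated text. The `toy_detector` and the tiny example set below are illustrative placeholders, not RAID’s actual API or data.

```python
import numpy as np

# Illustrative placeholder: a real detector returns a score
# indicating how likely each text is to be machine-generated.
def toy_detector(texts):
    # Here, a crude lexical-diversity heuristic stands in for a real model.
    return np.array(
        [len(set(t.split())) / max(len(t.split()), 1) for t in texts]
    )

# Hypothetical (text, is_machine_generated) pairs; RAID contains
# millions of such examples across domains, decoders, and attacks.
examples = [
    ("The mitochondria is the powerhouse of the cell.", 0),
    ("As an AI language model, I can summarize the article as follows.", 1),
]
texts = [t for t, _ in examples]
labels = np.array([y for _, y in examples])

scores = toy_detector(texts)

# Calibrate the threshold on human-written text so the false
# positive rate is at most 5%, mirroring RAID's evaluation protocol.
human_scores = scores[labels == 0]
threshold = np.quantile(human_scores, 0.95)

# Detection accuracy on machine-generated text at that threshold.
machine_scores = scores[labels == 1]
accuracy = (machine_scores >= threshold).mean()
print(f"Detection accuracy at 5% FPR: {accuracy:.2%}")
```

Calibrating the threshold on human-written text first is what keeps detectors comparable: a detector cannot inflate its accuracy simply by flagging everything as machine-generated, since that would blow past the allowed false positive rate.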
Liam is a fourth-year PhD student at the University of Pennsylvania advised by Professor Chris Callison-Burch. His research focuses on human and automated detection of AI-generated content. In particular, he is interested in the technical limitations and societal ramifications of detection tools and how we might deploy AI detectors with minimal harm. He maintains both the Real or Fake Text website, where people can test how well they can detect generated text, and the RAID Benchmark, the largest and most challenging dataset for comparing generated-text detectors. His work has been published in top conferences such as ACL, EMNLP, and AAAI, and has been featured by news organizations such as CNN and ABC News as well as in testimony to the U.S. Congress.