
I’m an Assistant Professor of Computer Science at the University of Colorado Boulder, affiliated with Information Science and the Boulder NLP Group, where I direct the Culture, Language, and Systems (CLS) Lab. Previously, I was a Young Investigator at the Allen Institute for AI and a postdoc at the Pioneer Centre for AI at the University of Copenhagen. I completed my PhD in Information Science at Cornell University, advised by David Mimno, and my master’s degree in Computational Linguistics at the University of Washington. Along the way, I’ve also spent time at ETH Zürich, Microsoft Research FATE, Twitter Cortex, and Facebook Core Data Science.
I’m a natural language processing (NLP) and cultural analytics researcher who works at the boundaries of computing and the humanities. My lab develops NLP methods to study the language systems that transmit and shape modern culture, from internet platforms to literary archives to language models. We frequently collaborate with interdisciplinary teams across the humanities and social sciences, with recent themes including narratives, science-of-science, and pretraining pipelines. We treat computation as a tool for interpretation, and we work to keep people at the center of our research.
Characterizing Narrative Content in Web-scale LLM Pretraining Data
Teagan Johnson, Elliott Ash, Andrew Piper, Maria Antoniak
preprint
so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs
Sriharsh Bhyravajjula, Melanie Walsh, Anna Preus, Maria Antoniak
EMNLP 2025
Research Borderlands: Analysing Writing Across Research Cultures
Shaily Bhatt, Tal August, Maria Antoniak
ACL 2025
Trust No Bot: Personal Disclosures in Human-LLM Conversations
Maria Antoniak*, Niloofar Mireshghallah*, Yash More*, Yejin Choi, Golnoosh Farnadi
COLM 2024
Narrative Paths and Negotiation of Power in Birth Stories
Maria Antoniak, David Mimno, Karen Levy
CSCW 2019
| Jan 2027 | Invited to give a keynote address at "Model/Making: Generating Digital Humanities," the annual conference of the Hong Kong Association of Digital Humanities, at the Education University of Hong Kong |
| Oct 2026 | Invited to speak at the NLP for Positive Impact Workshop at EMNLP in Budapest |
| Oct 2026 | Invited to speak at the Cultural Analytics Seminar at UC Berkeley |
| Sep 2026 | Invited to speak at a conference on "Humanistic AI: Generative Models and the Future of Cultural and Historical Interpretation" at Duke University |
| Jul 2026 | Attending IC2S2 in Vermont |
| Jul 2026 | Invited to speak at the Culture x AI Workshop at ICML in Seoul |
| Jun 2026 | Invited as a panelist and participant to the Interdisciplinary Science Summit hosted by Schmidt Sciences at Duke |
| Jun 2026 | Invited to speak at the MAPS and FGVC Workshops at CVPR in Denver |
| Mar 2026 | Attending AtmosphereConf and ATScience in Vancouver |
| Mar 2026 | Invited to speak at the Language Technologies Institute Colloquium at Carnegie Mellon University |
| Mar 2026 | Invited to speak at the NLP Seminar at the University of Pittsburgh |
| Mar 2026 | Invited to speak at the Symposium on AI & Science at Cornell University |
| Feb 2026 | Selected to attend the Artificial Intelligence Humanities Sandpit hosted by UK Research and Innovation in Montreal |
| Dec 2025 | Invited to a seminar on digital literary history at the University of Copenhagen |
| Dec 2025 | Invited to a working group on the "Science of Stories" at the Santa Fe Institute |
I was one of the lead organizers for AI for Humanists, a series of tutorials and workshops that guide interdisciplinary researchers in using large language models.
I’ve led or co-led sessions at ICWSM, FAccT, Bell Labs, and the popular NLP+CSS 201 tutorial series. I’ve also taught similar public-facing courses for the Hertie School in Berlin, the Brown Institute at Columbia, and the IDEAS Summer School at Northeastern.
I’m the lead builder and maintainer for some cultural analytics tools:
I’m currently serving as an Editorial Board Member for the Journal of Cultural Analytics, Advisory Board Member and Guest Editor for the Computational Humanities Research (CHR) Journal, Advisory Board Member for the Anthology of Computers and the Humanities, Executive Committee Member for ACM FAccT, and member of the Working Group on AI and Research as part of the MLA Task Force on AI in Research and Teaching.
I regularly serve as a (Senior) Area Chair for ACL, related NLP conferences like COLM, and FAccT, and I often review for TACL, the Workshop on Narrative Understanding, and other venues. I’m currently serving as a Tutorials Co-Chair for IC2S2 (2026) and the Publicity Chair for FAccT (2026). Previously, I served as the Publicity Chair for FAccT (2025), a Workshops Co-Chair for ICWSM (2024), and an Ethics Co-Chair for NAACL (2024, 2025).
Fall 2026: NLP for Cultural Analytics (University of Colorado Boulder, Computer Science). If you need help registering for this course, please send me an email. I welcome students from other departments!
Spring 2026: Introduction to NLP (University of Colorado Boulder, Computer Science)
Fall 2025: NLP for Cultural Analytics (University of Colorado Boulder, Computer Science)
Winter 2023: NLP for Cultural Analytics (University of Washington, Linguistics)