Maria Antoniak


About Me

I’m an Assistant Professor of Computer Science at the University of Colorado Boulder, affiliated with Information Science and the Boulder NLP Group, where I direct the Culture, Language, and Systems (CLS) Lab. Previously, I was a Young Investigator at the Allen Institute for AI and a postdoc at the Pioneer Centre for AI at the University of Copenhagen. I completed my PhD in Information Science at Cornell University, advised by David Mimno, and my master’s degree in Computational Linguistics at the University of Washington. Along the way, I’ve also spent time at ETH Zürich, Microsoft Research FATE, Twitter Cortex, and Facebook Core Data Science.


Research Interests

I’m a natural language processing (NLP) and cultural analytics researcher who works at the boundaries of computing and the humanities. My lab develops NLP methods to study the language systems that transmit and shape modern culture, from internet platforms to literary archives to language models. We frequently collaborate with interdisciplinary teams across the humanities and social sciences, with recent themes including narratives, science-of-science, and pretraining pipelines. We treat computation as a tool for interpretation, and we work to keep people at the center of our research.


Selected Papers

Characterizing Narrative Content in Web-scale LLM Pretraining Data
Teagan Johnson, Elliott Ash, Andrew Piper, Maria Antoniak
preprint

so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs
Sriharsh Bhyravajjula, Melanie Walsh, Anna Preus, Maria Antoniak
EMNLP 2025

Research Borderlands: Analysing Writing Across Research Cultures
Shaily Bhatt, Tal August, Maria Antoniak
ACL 2025

Trust No Bot: Personal Disclosures in Human-LLM Conversations
Maria Antoniak*, Niloofar Mireshghallah*, Yash More*, Yejin Choi, Golnoosh Farnadi
COLM 2024

Narrative Paths and Negotiation of Power in Birth Stories
Maria Antoniak, David Mimno, Karen Levy
CSCW 2019

full list of publications



Upcoming Travel & Talks

Jan 2027 Invited to give a keynote address at "Model/Making: Generating Digital Humanities," the annual conference of the Hong Kong Association of Digital Humanities, at the Education University of Hong Kong
Oct 2026 Invited to speak at the NLP for Positive Impact Workshop at EMNLP in Budapest
Oct 2026 Invited to speak at the Cultural Analytics Seminar at UC Berkeley
Sep 2026 Invited to speak at a conference on "Humanistic AI: Generative Models and the Future of Cultural and Historical Interpretation" at Duke University
Jul 2026 Attending IC2S2 in Vermont
Jul 2026 Invited to speak at the Culture x AI Workshop at ICML in Seoul
Jun 2026 Invited as a panelist and participant at the Interdisciplinary Science Summit hosted by Schmidt Sciences at Duke
Jun 2026 Invited to speak at the MAPS and FGVC Workshops at CVPR in Denver
Mar 2026 Attending AtmosphereConf and ATScience in Vancouver
Mar 2026 Invited to speak at the Language Technologies Institute Colloquium at Carnegie Mellon University
Mar 2026 Invited to speak at the NLP Seminar at the University of Pittsburgh
Mar 2026 Invited to speak at the Symposium on AI & Science at Cornell University
Feb 2026 Selected to attend the Artificial Intelligence Humanities Sandpit hosted by UK Research and Innovation in Montreal
Dec 2025 Invited to a seminar on digital literary history at the University of Copenhagen
Dec 2025 Invited to a working group on the "Science of Stories" at the Santa Fe Institute


Outreach

I was one of the lead organizers for AI for Humanists, a series of tutorials and workshops that guide interdisciplinary researchers in using large language models.

I’ve led or co-led sessions at ICWSM, FAccT, Bell Labs, and the popular NLP+CSS 201 tutorial series. I’ve also taught similar public-facing courses for the Hertie School in Berlin, the Brown Institute at Columbia, and the IDEAS Summer School at Northeastern.

I’m the lead builder and maintainer for some cultural analytics tools:


Media


Service

I’m currently serving as an Editorial Board Member for the Journal of Cultural Analytics, Advisory Board Member and Guest Editor for the Computational Humanities Research (CHR) Journal, Advisory Board Member for the Anthology of Computers and the Humanities, Executive Committee Member for ACM FAccT, and member of the Working Group on AI and Research as part of the MLA Task Force on AI in Research and Teaching.

I regularly serve as a (Senior) Area Chair for ACL, related NLP conferences like COLM, and FAccT, and I often review for TACL, the Workshop on Narrative Understanding, and other venues. I’m currently serving as a Tutorials Co-Chair for IC2S2 (2026) and the Publicity Chair for FAccT (2026). Previously, I also served as the Publicity Chair for FAccT (2025), a Workshops Co-Chair for ICWSM (2024), and an Ethics Co-Chair for NAACL (2024, 2025).


Teaching

Fall 2026: NLP for Cultural Analytics (University of Colorado Boulder, Computer Science). If you need help registering for this course, please send me an email. I welcome students from other departments!

Spring 2026: Introduction to NLP (University of Colorado Boulder, Computer Science)

Fall 2025: NLP for Cultural Analytics (University of Colorado Boulder, Computer Science)

Winter 2023: NLP for Cultural Analytics (University of Washington, Linguistics)