I’m a Young Investigator at the Allen Institute for AI on the Semantic Scholar team. My research is in natural language processing and cultural analytics. I’m interested in using computational methods to study stories, healthcare, and online communities and in measuring the reliability of NLP tools when used for social datasets and human research questions.
I earned my PhD in Information Science from Cornell University, where I was advised by David Mimno. I have a master’s degree in Computational Linguistics from the University of Washington and have worked as a research intern at places like Microsoft Research FATE, Twitter Cortex, Facebook Core Data Science, and Pacific Northwest National Laboratory. I’ve been recognized as a “Rising Star” in both computer science and data science.
My past work has examined how postpartum people share and frame their birth experiences, how online book reviewers use and write about genres, and why word vector similarities require additional stability tests when used to measure biases.
I designed and taught NLP for Cultural Analytics for the Linguistics department at the University of Washington in Winter 2023. You can check out the reading list for the course here.
I’m one of the lead organizers for BERT for Humanists, a series of tutorials and workshops that guide interdisciplinary researchers in using large language models. In addition to our independent tutorials, I’ve led or co-led sessions at ICWSM, Bell Labs, and the popular NLP+CSS 201 tutorial series. I’ve also taught similar public-facing courses for the Hertie School in Berlin and the Brown Institute at Columbia.
I’m a builder and maintainer of cultural analytics tools like Riveter (a tool to measure connotation frames via verb lexicons), the Goodreads Scraper (automated collection of Goodreads book reviews), and Little Mallet Wrapper (a Python wrapper around the topic modeling library MALLET).