The following schedule runs from June 5–9, 2023. (Last January’s institute schedule is available on the Winter page.) Each day will involve a mix of seminar-style discussion and hands-on technical work. Morning sessions will run from 10:00 a.m. to 12:30 p.m., and afternoon sessions from 1:30 p.m. to 4:00 p.m.
An orientation and tech setup day will be held on May 24, 2023, from 2:00 to 4:00 p.m.

Monday: Introductions & Setting the Stage
How are data science methods being used in the humanities? How are humanists studying the automation of decision-making and cultural production?
Required Readings
You must be logged into a Princeton institutional Google account to access these PDFs.
Brian Beaton et al., “Debating Data Science: A Roundtable,” Radical History Review, no. 127 (January 2017): 133–48.
Wai Chee Dimock, “AI and the Humanities,” PMLA 135, no. 3 (May 2020): 449–54.
Anne Helmreich, Matthew Lincoln, and Charles van den Heuvel, “Data Ecosystems and Futures of Art History,” Histoire de l’art, no. 87 (June 29, 2021): 45–54.
Barbara McGillivray et al., “The Challenges and Prospects of the Intersection of Humanities and Data Science: A White Paper from The Alan Turing Institute,” 2020.

Morning
10:00 – 10:30 Sierra Eckert and Ryan Heuser, Center for Digital Humanities
Discussion on the history of data science and recent trends in computational methods in the humanities. Preview of what will be covered in each day of the Institute.
Opening Slides
10:30 – 11:30 Participant introductions and project descriptions. What do you hope to learn at this Institute?
11:30 – 12:30 Discussion of readings. Participants: come prepared to discuss how data science is intersecting with scholarship in your discipline.

Afternoon
1:30 – 2:00 Peter Ramadge, Gordon Y.S. Wu Professor of Engineering and Director of the Center for Statistics and Machine Learning
Presentation on the data science ecosystem at Princeton.
2:00 – 2:30 Helmut Reimitz, Professor of History; Director, Program in Medieval Studies
Tim Geelhaar, SFB 1288 Praktiken des Vergleichens – Dateninfrastruktur and Digital Humanities, University of Bielefeld
Jan Odstrčilík, Institut für Mittelalterforschung, Austrian Academy of Sciences
Presentation on History Books and the History of the Book in the Middle Ages: applying computational tools to learn about the “history of histories” in medieval Europe.
2:30 – 3:00 Informal discussion.

Tuesday: Fundamentals
An introduction to the baseline statistical concepts used in machine learning, from distribution to regression. What does it mean to be “unsupervised”? How do we evaluate a model?
Readings
Meredith Broussard, “Machine Learning: The DL on ML,” in Artificial Unintelligence: How Computers Misunderstand the World (Cambridge, Massachusetts & London, England: The MIT Press, 2018).
Wendy Hui Kyong Chun, excerpt from Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition (Cambridge, Massachusetts & London, England: The MIT Press, 2021).
Richard So, “‘All Models Are Wrong,’” PMLA 132, no. 3 (May 2017): 668–73, https://doi.org/10.1632/pmla.2017.132.3.668.
Tony Chu and Stephanie Yee, “A Visual Introduction to Machine Learning,” R2D3.
“Introduction to Machine Learning for the Humanities” (2023)

Morning
10:00 – 12:30 Sierra Eckert, Perkins Postdoctoral Fellow, Center for Digital Humanities
Humanities Data Fundamentals
Relevant files:

Afternoon
1:30 – 4:00 Amy Winecoff, DataX Fellow, Center for IT Policy & Center for Statistics and Machine Learning
Introduction to Machine Learning for the Humanities
Relevant files:

Wednesday: History of Measurement
On the historical epistemology of quantification. How did measurement practices in the past cross the now-accepted divide between the sciences and humanities?

Morning
10:00 – 12:30 David Kinney, Postdoctoral Research Associate in Cognitive Science of Values, University Center for Human Values
Lecture on the history of measurement across the disciplines

Afternoon
1:30 – 4:00 David Kinney
Hands-On: Word Embeddings
Relevant files:

Readings
Hasok Chang, “Spirit, Air, and Quicksilver: The Search for the ‘Real’ Scale of Temperature,” Historical Studies in the Physical and Biological Sciences 31, no. 2 (2001): 249–84.

Thursday: Unpacking Our Code Libraries
On the use of industry-created tools for influential humanities scholarship. How did software for speech recognition and image tagging end up in a literary text analysis tool?
Readings
Selections from Richard So, Redlining Culture: A Data History of Racial Inequality and Postwar Fiction (Columbia University Press, 2020).
Ted Underwood, “Machine Learning and Human Perspective,” PMLA 135, no. 1 (January 2020): 92–109, https://doi.org/10.1632/pmla.2020.135.1.92.

Morning
10:00 – 12:30 Sierra Eckert, Perkins Postdoctoral Fellow, Center for Digital Humanities
Ryan Heuser, Research Software Engineer, Center for Digital Humanities
Useful Natural Language Processing
Relevant files:

Afternoon
1:30 – 4:00 Sierra Eckert, Ryan Heuser
Named Entity Recognition, Geocoding, and Mapping
Relevant files:

Friday: Participant Projects
An opportunity for participants to workshop their own project ideas. Moderated by Natalia Ermolaev and Meredith Martin.

Morning
10:00 – 11:00 Roundtable: Computational & Digital Resources on Campus
Natalia Ermolaev, moderator
Ian Cosden, Director, Research Software Engineering for Computational & Data Science [slides]
Jennifer Grayburn, Assistant Director of Digital and Open Scholarship [slides]
Ben Johnston, Senior Educational Technologist, McGraw Center for Teaching & Learning [slides]
Ariel Ackerly, Makerspace Specialist, Princeton University Library [slides]
Jonathan Halverson, Research Software and Computing Training Lead, Princeton Institute for Computational Science & Engineering
11:00 – 12:30 Participant Project Consultations
Grant Wythoff, moderator
How can participants apply what they have learned to their own discipline? What research projects or introductory course ideas do participants want to explore? CDH staff will help participants scope their idea to attainable goals while identifying what support or additional training they need.

Afternoon
1:30 – 3:00 Participant project consultations, continued.
3:00 – 4:00 Share feedback for next summer’s Institute in this exit survey.