Winter 2023 HDSI Schedule

The following schedule runs from January 9 to 13, 2023. Each day will involve a mix of seminar-style discussion and hands-on technical work. Morning sessions run from 10:00a to 12:30p, and afternoon sessions run from 1:30p to 4:00p.



Monday: Introductions & Setting the Stage

How are data science methods being used in the humanities? How are humanists studying the automation of decision-making and cultural production?

Required Readings

*You must be logged in to a Princeton institutional Google account to access these PDFs.

Brian Beaton et al., “Debating Data Science: A Roundtable,” Radical History Review, no. 127 (January 2017): 133–48.

Wai Chee Dimock, “AI and the Humanities,” PMLA 135, no. 3 (May 2020): 449–54.

Anne Helmreich, Matthew Lincoln, and Charles van den Heuvel, “Data Ecosystems and Futures of Art History,” Histoire de l’art, no. 87 (June 29, 2021): 45–54.

Barbara McGillivray et al., “The Challenges and Prospects of the Intersection of Humanities and Data Science: A White Paper from The Alan Turing Institute,” 2020.

Morning

10:00 – 11:00
Grant Wythoff and Meredith Martin, Center for Digital Humanities
Discussion on the history of data science and recent trends in computational methods in the humanities.
Preview of what will be covered on each day of the Institute.
Opening Slides
11:00 – 12:30
Discussion of readings.
Participants: come prepared to discuss how data science is intersecting with scholarship in your discipline.

Afternoon

1:30 – 2:00
Peter Ramadge, Gordon Y.S. Wu Professor of Engineering and Director of the Center for Statistics and Machine Learning.
Presentation on the data science ecosystem at Princeton.
2:00 – 2:30
Marina Rustow, Khedouri A. Zilkha Professor of Jewish Civilization in the Near East.
Presentation on Handwritten Text Recognition for the Princeton Geniza Project (HTR4PGP), supported by a seed grant from the Center for Statistics and Machine Learning.
2:30 – 3:00
Break
3:00 – 4:00
Verify that all participants are able to open and run a Colab / Python notebook on their laptops.
Sign into your institutional Google account, click the link to colab_intro.ipynb, select “Open with Google Colaboratory” at the top, then run the notebook and discuss.
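If the notebook opens but it is unclear whether the runtime is actually executing code, a short cell like the one below can serve as a quick check. This is only a sketch: it is not taken from colab_intro.ipynb, and the function name runtime_summary is a hypothetical helper introduced here for illustration. It uses only the Python standard library.

```python
import platform
import sys


def runtime_summary():
    """Return a short description of the Python runtime executing this cell.

    Hypothetical helper for illustration; not part of colab_intro.ipynb.
    """
    # sys.version_info holds (major, minor, micro, ...); keep the first three parts.
    version = ".".join(str(part) for part in sys.version_info[:3])
    # platform.system() names the operating system of the machine running the code,
    # which for a hosted notebook is the remote runtime rather than your laptop.
    return f"Python {version} on {platform.system()}"


print(runtime_summary())
```

If the cell prints a line beginning with "Python", the runtime is connected and executing code, and the rest of the notebook should run as well.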

Tuesday: Fundamentals

An introduction to the baseline statistical concepts used in machine learning, from distribution to regression. What does it mean to be “unsupervised”? How do we evaluate a model?

Readings

Meredith Broussard, “Machine Learning: The DL on ML,” in Artificial Unintelligence: How Computers Misunderstand the World (Cambridge, Massachusetts & London, England: The MIT Press, 2018).

Wendy Hui Kyong Chun, excerpt from Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition (Cambridge, Massachusetts & London, England: The MIT Press, 2021).

Richard So, “‘All Models Are Wrong,’” PMLA 132, no. 3 (May 2017): 668–73, https://doi.org/10.1632/pmla.2017.132.3.668.

Tony Chu and Stephanie Yee, “A Visual Introduction to Machine Learning,” R2D3.

Amy Winecoff, “Introduction to Machine Learning for the Humanities” (2023).

Morning

10:00 – 12:30
Sierra Eckert, Perkins Postdoctoral Fellow, Center for Digital Humanities
Humanities Data Fundamentals
Relevant files:

Afternoon

1:30 – 4:00
Amy Winecoff, DataX Fellow, Center for IT Policy & Center for Statistics and Machine Learning
Introduction to Machine Learning for the Humanities
Relevant files:

Wednesday: History of Measurement

On the historical epistemology of quantification. How did measurement practices in the past cross the now-accepted divide between the sciences and humanities?

Morning

10:00 – 12:30
David Kinney, Postdoctoral Research Associate in Cognitive Science of Values, University Center for Human Values
Lecture on the history of measurement across the disciplines

Afternoon

1:30 – 4:00
David Kinney
Hands-On: Word Embeddings
Relevant files:

Readings

Hasok Chang, “Spirit, Air, and Quicksilver: The Search for the ‘Real’ Scale of Temperature,” Historical Studies in the Physical and Biological Sciences 31, no. 2 (2001): 249–84.

Thursday: Unpacking Our Code Libraries

On the use of industry-created tools for influential humanities scholarship. How did software for speech recognition and image tagging end up in a literary text analysis tool?

Readings

Selections from Richard So, Redlining Culture: A Data History of Racial Inequality and Postwar Fiction (Columbia University Press, 2020).

Ted Underwood, “Machine Learning and Human Perspective,” PMLA 135, no. 1 (January 2020): 92–109, https://doi.org/10.1632/pmla.2020.135.1.92.

Morning

10:00 – 12:30
Sierra Eckert, Perkins Postdoctoral Fellow, Center for Digital Humanities
Ryan Heuser, Research Software Engineer, Center for Digital Humanities
Useful Natural Language Processing
Relevant files:

Afternoon

1:30 – 4:00
Sierra Eckert, Ryan Heuser
Named Entity Recognition, Geocoding, and Mapping
Relevant files:

Friday: Participant Projects

An opportunity for participants to workshop their own project ideas. Moderated by Natalia Ermolaev and Meredith Martin.

Morning

10:00 – 11:00
Roundtable: Computational & Digital Resources on Campus
Natalia Ermolaev, moderator
Ian Cosden, Director, Research Software Engineering for Computational & Data Science [slides]
Jennifer Grayburn, Assistant Director of Digital and Open Scholarship [slides]
Ben Johnston, Senior Educational Technologist, McGraw Center for Teaching & Learning [slides]
Ariel Ackerly, Makerspace Specialist, Princeton University Library [slides]
Jonathan Halverson, Research Software and Computing Training Lead, Princeton Institute for Computational Science & Engineering
11:00 – 12:30
Participant Project Consultations
Meredith Martin and Natalia Ermolaev, moderators
How can participants apply what they have learned to their own discipline?
What research projects or introductory course ideas do participants want to explore?
CDH staff will help participants scope their idea to attainable goals while identifying what support or additional training they need.

Afternoon

1:30 – 3:00
Participant project consultations, continued.
3:00 – 4:00
Share feedback for next summer’s Institute in this exit survey.