Quantitative Methods for Literary and Historical Scholarship -- in Theory and Practice

What is this?

Silicon chip It's two 2-day hands-on training sessions in summer 2022, one at Leeds University in Leeds (21-22 June 2022) and one at De Montfort University in Leicester (6-7 September 2022). The sessions will help literary and historical scholars who want to use quantitative methods (especially computational ones) in their work, but don't know where to start. The sessions are free and participants can attend half a day, a whole day, both days in one town or indeed all four days in the two cities if they want. By 'scholars' we mean anyone working on literary/historical materials, including: undergraduate students; post-graduate and research students; and early, mid, and late-career university tutors. The majority of the sessions will be aimed at those who want to dip their toes into this field, but we will also offer one-on-one assistance for those who've got started and want some guidance with a particular topic or problem.

Why?

A hornbook Traditionally the study of literary and historical documents has largely relied on techniques of close inspection and reading, and qualitative judgement. New quantitative methods of analysis have recently become possible because computing power has become widely available and large bodies of texts (particularly those from the past) have been published in digital form. Take-up of these new methods is uneven.

Many linguists use computers to count particular features of large bodies of writing, but scholars in other humanities fields may be left behind because they do not know how to get started in using quantitative methods and have not seen examples of the benefits. These sessions are aimed at those scholars who want to find out what the new quantitative methods can do for their work.

When and where is it?

The second event took place on on 6-7 September at De Montfort University in Leicester. Clephan Building CL2.29, De Montfort University, Bonners Lane, LE1 5XY, Leicester.

The first two-day training session took place on 21 and 22 June 2022 at the University of Leeds. Parkinson SR B.10, Parkinson Building, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT

Schedule

Based on the stated preferences of attendees who registered by 1 May 2022, we have devised the following schedule of training sessions.

De Montfort University, Clephan Building CL2.29, 6 September 2022

9-9.15am Welcome, with teas/coffees/pastries

9.15-10.30am "Approaching Humanities Problems Quantitatively" (Brett Greatley-Hirsch)

10.30-10.45am Break for tea/coffee

10.45am-12.45pm "Text Mining using only Microsoft Word and Excel" (Paul Brown and Gabriel Egan). Software needed: Word and Excel. We will use the following texts in the session: hard-times.txt and pride-and-prejudice.txt

12.45-1.30pm Lunch (provided)

1.30-2.45pm "Getting Texts from the Right Sources and Preparing Them for Analysis" (Brett Greatley-Hirsch and Ellen Roberts)

2.45-3pm Break for tea/coffee

3-5pm "Introduction to Statistics for Humanists" (Gabriel Egan). Software needed: None

5-6pm "One-on-one sessions to discuss attendees' own data and needs" (Gabriel Egan, Brett Greatley-Hirsch, Paul Brown, and Ellen Roberts)

De Montfort University, Clephan Building CL2.29, 7 September 2022

9-9.15am Welcome, with teas/coffees/pastries

9.15-11.15am "Using Methods from Corpus Linguistics on Historical Texts" (Ellen Roberts). Software needed: Web-browser, email address, account on CQPWeb, spreadsheet program (eg Excel). Files in support of Ellen's talk are at https://github.com/EllenRoberts/QMLHS

11.15-11.30am Break for tea/coffee

11.30am-12.45pm "Data Visualization" (Paul Brown and Ellen Roberts). Software needed: Microsoft excel and a gmail/Google account (a free or paid-for one is fine)

12.45-1.30pm Lunch (provided)

1.30-3.30pm "Introduction to the Programming Language Python" (Paul Brown and Gabriel Egan). Software needed: Web-browser and this file. The scripts we used in the session are here.

3.30-3.45pm Break for tea/coffee

3.45-5pm "Workflow Pipelines with Orange3 and R" (Brett Greatley-Hirsch and Ellen Roberts). Software needed: None

Here is the schedule for the event that took place in Leeds, with links to supporting materials used

Room: Parkinson SR B.10, Parkinson Building, University of Leeds, Woodhouse Lane, LS2 9JT, 21 June 2022

9-9.15am Welcome, with teas/coffees/pastries

9.15-10.30am "Approaching Humanities Problems Quantitatively" (Brett Greatley-Hirsch)

10.30-10.45am Break for tea/coffee

10.45am-12.45pm "Text Mining using only Microsoft Word and Excel" (Paul Brown and Gabriel Egan). Software needed: Word and Excel. We will use the following texts in the session: hard-times.txt and pride-and-prejudice.txt

12.45-1.30pm Lunch (provided)

1.30-2.45pm "Getting Texts from the Right Sources and Preparing Them for Analysis" (Brett Greatley-Hirsch)

2.45-3pm Break for tea/coffee

3-5pm "XML, XPath, XQuery" (Gabriel Egan and Paul Brown). Software needed: Oxygen editor downloaded by participant before the event using free 30-day licence. We will use the following files in the session sonnet1.txt, sonnets.dtd, and sonnets.xml.

5-6pm "One-on-one sessions to discuss attendees' own data and needs" (Gabriel Egan, Brett Greatley-Hirsch, Paul Brown, and Ellen Roberts)

Room: Parkinson SR B.10, Parkinson Building, University of Leeds, Woodhouse Lane, LS2 9JT, 22 June 2022

9-9.15am Welcome, with teas/coffees/pastries

9.15-11.15am "Using Methods from Corpus Linguistics on Historical Texts" (Ellen Roberts). Software needed: Web-browser, email address, account on CQPWeb, spreadsheet program (eg Excel)

11.15-11.30am Break for tea/coffee

11.30am-12.45pm "Applying Particular Methods to Particular Problems" (Brett Greatley-Hirsch)

12.45-1.30pm Lunch (provided)

1.30-2.45pm "Data Visualization" (Paul Brown and Ellen Roberts). Software needed: Microsoft excel and a gmail/Google account (a free or paid-for one is fine)

2.45-3pm Break for tea/coffee

3-5pm "Introduction to Statistics for Humanists" (Gabriel Egan). Materials used: all-Bayes-examples.pdf, Bayes-whiteboard-election.png.

How to register

The sessions are free but you must register. Send an email to Prof Gabriel Egan <gegan@dmu.ac.uk> indicating which day(s) you would like to attend and which session(s) you would like us to offer. Please also tell us a little bit about yourself and the background to your interest in this topic. Each participant is recommended to bring a laptop computer (not a tablet) to use in the training. Please check that you have administrative rights to install software on this laptop so that you can follow along with the instructors' demonstrations and try the methods out for yourself.

Who's doing the training?

The team consists of four scholars who have, between them, 55 years of experience in the use of quantitative methods in humanities scholarship. Here they are:

Gabriel Egan is Professor of Shakespeare Studies and Director of the Centre for Textual Studies at De Montfort University and one of the four General Editors (with Gary Taylor, Terri Bourus, and John Jowett) of the New Oxford Shakespeare, of which the Modern Critical Edition, the Critical Reference Edition and the Authorship Companion appeared in 2016; the remaining two volumes will appear in 2024. He co-edits the academic journal Theatre Notebook for the Society for Theatre Research. He teaches computational approaches to literary-historical textual analysis and the art of letter-press printing.

Brett Greatley-Hirsch is interested in the historical and material conditions of literary production, mediation, and reception; the theory and practice of textual editing in print and digital formats; enumerative bibliography and publishing histories; and the cultural and institutional processes of canon formation. He is the the founder and coordinating editor (with Janelle Jenstad, James Mardock, and Sarah Neville) of Digital Renaissance Editions, which publishes open-access critical editions of Renaissance drama and literature using LEMDO, a platform for developing Endings-compliant, TEI-XML editions.

Paul Brown is a freelance consultant on computational methods for textual scholarship. He has written software to help determine the authorship of early modern plays and has published non-technical explanations to accompany the code. He currently teaches undergraduates how to analyse literary and historical data with computers and has taught many sessions to research students and academics on how to program and on text-mining. He spends most of his time working in the private sector, writing software to automate business tasks, and helping teams use databases to manage their work. He holds a PhD in theatre history from De Montfort University.

Ellen Roberts is PhD student in the Department of Linguistics and English Language at Lancaster University. Her main area of research is historical computational linguistics and her PhD focuses on the linguistic nature of the genre of early modern English dramatic texts. She is interested in how computational methodologies can be applied to literary (and especially historical) texts. In particular, how these methodologies may aid our understanding of how the language functions in these texts.

Who's paying for this (= how come it's free)?

We are pleased to acknowledge financial support for these events from a British Academy Talent Development Award (TDA21\210025), and from De Montfort University and the University of Leeds.

British Academy De Montfort University