Introduction to Text Mining Workshops: Overview

Digital Scholarship Workshop Series

Workshop Series Overview

Interested in learning more about text and data mining (TDM)? 

This series is a sequence of four scaffolded workshops designed to prepare you to complete a text mining project.

The Text Analysis Pedagogy (TAP) Institute Workshops

The 2022 TAP Institute will offer 10 free, online courses to teachers (and aspiring teachers) of text analysis. The courses include:

  • Python Basics
  • Working with Twitter Data
  • A Practical Guide to Text Data Curation
  • Intro to NLP with spaCy
  • Intro to Multilingual NER
  • Intro to Pandas
  • Machine Learning for Humanists
  • Webscraping and Text Analysis in Bilingual Social Media
  • Multilingual Newspaper Data and Visualizations

Related workshops

HathiTrust Research Center (HTRC) Workshops

Each workshop is standalone, lasts 2 hours, and will be held virtually on Zoom.

  • March 17, 1:00 Central / 2:00 Eastern - Intro to HathiTrust and HTRC: Attendees of this workshop will be introduced to the HathiTrust and HathiTrust Digital Library as well as the HTRC and its data and analytical tools, including hands-on practice with HTRC Analytics.
  • March 31, 1:00 Central / 2:00 Eastern - Introduction to the HTRC Extracted Features Dataset: We’ll cover the motivation for its creation, the data model, and the kinds of research it enables, including a hands-on activity using the dataset, Google Colaboratory notebooks and Python code.
  • April 15, 11:00 am Central / 12:00 pm Eastern - Introduction to HTRC Data Capsules: An introduction to the HTRC Data Capsules environment and how it can be used by intermediate and advanced researchers. This session will include a hands-on activity using an HTRC Data Capsule, Jupyter notebooks and Python code.

If you have any questions, feel free to get in touch with HathiTrust via

Rafia Mirza