Module 2. Kinds of Data: Texts, Numbers, Images, and Beyond

Course 2: What is Data? Understanding the Building Blocks of Knowledge
Estimated Time: 30 minutes

🧭 Module Objectives

  • Identify different kinds and formats of data (quantitative, qualitative, structured, unstructured, multimodal).
  • Recognize examples of data in various humanities disciplines.
  • Describe how form and medium influence what can be known from data.
  • Understand the concept of metadata and its importance for context and interpretation.

Not All Data Look Alike

When many people hear "data," they picture numbers in rows and columns. But in the humanities, much of our data are words, images, sounds, objects, or even gestures. Every form of data captures a different dimension of reality and requires a different way of describing, storing, and analyzing it.

At the most general level, we can divide data into two families:

| Type | Description | Example |
| --- | --- | --- |
| Quantitative Data | Data that can be measured and expressed numerically. | Census statistics, word counts, catalog numbers, GPS coordinates. |
| Qualitative Data | Data that describe qualities, meanings, or experiences. | Interview transcripts, artworks, diary entries, song lyrics. |

Humanists often move between both: we might count how many times a word appears in a text (quantitative) and then interpret what that repetition means (qualitative).
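
The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the sample sentence is invented, and `count_words` is a hypothetical helper name. The code only produces the numbers; deciding what the repetition means remains the interpretive, qualitative step.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for illustration only.
text = "The sea was calm. The sea was cold. Nobody spoke of the sea."

def count_words(passage: str) -> Counter:
    """Lowercase the passage, split it into words, and tally each one."""
    words = re.findall(r"[a-z']+", passage.lower())
    return Counter(words)

counts = count_words(text)
print(counts["sea"])          # how often "sea" appears (quantitative)
print(counts.most_common(3))  # the most frequent words overall
```

Here "sea" appears three times. Whether that repetition signals longing, dread, or simple description is a question the count cannot answer on its own.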

Common Data Forms in the Humanities

Let's look at several data types and media you're likely to encounter:

| Category | Description | Humanities Example | Common File Formats |
| --- | --- | --- | --- |
| Textual Data | Words and writing in any language or medium | Literary works, historical documents, transcribed lyrics | TXT, DOCX, XML, TEI, JSON |
| Numeric Data | Numbers used for counting, measuring, or ordering | Census records, catalog IDs, budgets, artifact measurements | CSV, XLSX, SQL tables |
| Visual Data | Images, videos, and visual recordings | Paintings, photographs, maps, architectural drawings | JPG, PNG, TIFF, MP4 |
| Audio Data | Recorded sound, music, or speech | Oral histories, field recordings, song archives | WAV, MP3, FLAC |
| Spatial Data | Information about location, distance, or area | Archaeological sites, trade routes, cultural landscapes | GIS, GeoJSON, KML |
| Relational Data | Connections between entities or ideas | Letters between correspondents, networks of influence | Database: SQL, Neo4j |
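
The relational category above (letters between correspondents) can be sketched in Python as a small tally of who wrote to whom. The names and letter pairs are invented for illustration, and `build_network` is a hypothetical helper, not part of any library. A real project would more likely store this in a database, as the table suggests.

```python
from collections import defaultdict

# Hypothetical letters: (sender, recipient) pairs invented for illustration.
letters = [
    ("Ana", "Ben"),
    ("Ana", "Chidi"),
    ("Ben", "Ana"),
    ("Chidi", "Ana"),
    ("Ana", "Ben"),
]

def build_network(pairs):
    """Map each sender to a tally of how many letters went to each recipient."""
    network = defaultdict(lambda: defaultdict(int))
    for sender, recipient in pairs:
        network[sender][recipient] += 1
    return network

network = build_network(letters)
for sender, recipients in network.items():
    print(sender, dict(recipients))
```

Each sender-recipient pair is a connection, and the tally shows how strong that connection is in the surviving record.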

These data types often overlap. A digitized manuscript, for example, contains image data (scanned page), text data (transcription), and metadata (who, where, when, why).

Metadata: The Data About Data

Metadata are the contextual details that describe and give meaning to a dataset. They answer questions like:

  • Who created this?
  • When and where was it made?
  • What is it about?
  • How is it related to other materials?

Metadata can be technical (file size, camera type, GPS coordinates) or descriptive (artist, subject, cultural context). For the humanities, metadata are often the most human part of the data, reflecting judgment, vocabulary, and historical understanding.

📘 Example: A song like "War Isn't Murder" by Jesse Welles has:

  • Audio data (the recording)
  • Textual data (the lyrics)
  • Visual data (the music video)
  • Metadata (title, date, performer, recording location, theme, tags)

Together, these make up a rich and layered object of study.
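
One way to picture this layering is as a single record that points to each kind of data and carries its metadata alongside. The sketch below is an assumption-laden illustration: only the title and performer come from the example above, the file names are hypothetical, and unknown values are left as `None` rather than guessed. The split into "descriptive" and "technical" follows the distinction drawn earlier in this section; `missing_fields` is a hypothetical helper.

```python
import json

# Title and performer come from the example above; every other value is a
# placeholder (None or a hypothetical file name), not a verified fact.
song = {
    "audio": "recording.wav",       # hypothetical file name
    "text": "lyrics.txt",           # hypothetical file name
    "visual": "music_video.mp4",    # hypothetical file name
    "metadata": {
        "descriptive": {
            "title": "War Isn't Murder",
            "performer": "Jesse Welles",
            "date": None,
            "recording_location": None,
            "theme": None,
            "tags": [],
        },
        "technical": {
            "file_size_bytes": None,
        },
    },
}

def missing_fields(record):
    """List descriptive metadata fields that still need to be filled in."""
    descriptive = record["metadata"]["descriptive"]
    return [key for key, value in descriptive.items() if value in (None, [], "")]

print(json.dumps(song, indent=2))
print(missing_fields(song))
```

Leaving gaps visible, instead of filling them with guesses, keeps the record honest about what is and is not yet known.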

Structured vs. Unstructured Data

| Type | Description | Example | Typical Use |
| --- | --- | --- | --- |
| Structured Data | Organized into fixed fields and formats, easy for computers to read. | A library catalog record, a museum database. | Sorting, searching, analysis. |
| Unstructured Data | Free-form, variable, human-readable but harder to process. | A scanned letter, a journal entry, a video interview. | Interpretation, qualitative analysis. |

Many humanities sources are unstructured, which is why digital humanists spend time cleaning, encoding, and modeling them, turning messy realities into data that can be explored and linked while preserving nuance.
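
As a rough illustration of that modeling work, the Python sketch below pulls a few fixed fields out of a free-form letter. The letter is invented, `extract_fields` is a hypothetical helper, and the patterns assume one very regular layout; real sources are far messier and usually need manual encoding as well. The full text is kept next to the extracted fields so that nuance is not discarded.

```python
import re

# Hypothetical transcription of a letter, invented for illustration.
letter = """Lisbon, 3 March 1921
Dear Marta,
The harbor has been quiet since the strike ended.
Yours, Tomas"""

def extract_fields(text: str) -> dict:
    """Pull a few fixed fields out of free-form text; use None when absent."""
    place_date = re.search(
        r"^(?P<place>[A-Z][a-z]+), (?P<date>\d{1,2} [A-Z][a-z]+ \d{4})$",
        text,
        re.MULTILINE,
    )
    recipient = re.search(r"^Dear (?P<name>[A-Z][a-z]+),", text, re.MULTILINE)
    sender = re.search(r"^Yours, (?P<name>[A-Z][a-z]+)\s*$", text, re.MULTILINE)
    return {
        "place": place_date.group("place") if place_date else None,
        "date": place_date.group("date") if place_date else None,
        "recipient": recipient.group("name") if recipient else None,
        "sender": sender.group("name") if sender else None,
        "body": text,  # keep the full text so the original wording survives
    }

record = extract_fields(letter)
print(record["place"], record["date"], record["sender"], record["recipient"])
```

Once several letters share the same fields, they can be sorted, searched, and linked like any structured record.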

Beyond the Binary: Multimodal and Affective Data

Human culture rarely fits neatly into boxes.

  • A performance may include movement, sound, and emotion.
  • A social media post combines text, emoji, hashtags, and video.
  • A museum artifact carries tactile, material, and symbolic dimensions.

Humanists increasingly work with multimodal data, asking not only what they show but how they feel. This "affective turn" recognizes that data can evoke empathy, awe, or outrage: qualities not easily reduced to numbers, but essential for understanding human experience.

Key Takeaways

  • Data exist in many forms: textual, numeric, visual, auditory, spatial, relational.
  • Metadata add essential context and interpretive depth.
  • Structured and unstructured data require different approaches.
  • Humanities data are often multimodal and meaning-rich.
  • Recognizing form is the first step toward responsible analysis.

Suggested Readings & Resources

A great deal has been written about data over the past few decades, from a range of perspectives. We cannot provide a comprehensive bibliography on the subject, but here are some particularly relevant sources for further reading:

Updated on Nov 6, 2025