This section describes the individual-level demographic and behavioral survey datasets, as well as the specialized quiz responses. You can download the complete, de-identified longitudinal panels below.
This dataset combines survey responses from 203 Notre Dame students across 6 consecutive waves, capturing demographic backgrounds, personality inventories (e.g., Big Five), academic metrics (GPAs), musical and artistic preferences, lifestyle behaviors, and campus engagement.
Important: Security & Privacy Scrubbing
The raw demographic dataset originally contained sensitive participant identifiers. An R script (Code/scrub_demographics.R) was run to drop 38 personally identifying columns containing:
- Participant first names and last names (firstname_*, lastname_*, v1532, v1180, etc.)
- University NetIDs and emails (netid, netid_*, emailaddress_*)
- Raw cellular telephone numbers (phonenumber)
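As an illustration of the pattern-based scrubbing described above, here is a minimal Python sketch. It is not the project's R script: the pattern list is copied from the column names mentioned above (the "etc." means the real script covers more columns), and the function names and example columns are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the description above. The actual script drops 38
# columns, so this list is known to be incomplete (assumption).
IDENTIFIER_PATTERNS = [
    "firstname_*", "lastname_*", "v1532", "v1180",
    "netid", "netid_*", "emailaddress_*", "phonenumber",
]

def is_identifier(column):
    """Return True if the column name matches any identifier pattern."""
    return any(fnmatchcase(column, pattern) for pattern in IDENTIFIER_PATTERNS)

def scrub_columns(columns):
    """Keep only the column names that are not personal identifiers."""
    return [c for c in columns if not is_identifier(c)]

# Hypothetical header row, for illustration only.
example = ["egoid", "firstname_1", "lastname_3", "netid", "netid_2",
           "emailaddress_1", "phonenumber", "gender_1", "v1532"]
print(scrub_columns(example))  # ['egoid', 'gender_1']
```

Matching on name patterns rather than a hard-coded column list keeps per-wave variants such as netid_2 from slipping through, but one-off names like v1532 still have to be listed explicitly.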
1.2 Dataset Quick Profile
Below is a quick programmatic summary of the clean demographic survey data loaded directly from the processed CSV.
The dataset has 1,556 attributes. Variable names carry a wave-number suffix (e.g., _1 for Wave 1, _6 for Wave 6) so that changes can be tracked over time. Some of the core groupings include:
- Demographics: Participant gender (gender_1), ethnicity, family background, and dorm locations.
- Personality Scales: Big Five traits (Extraversion, Neuroticism, Openness, Agreeableness, Conscientiousness).
- Lifestyles and Habits: Alcohol use, smoking, sleep habits, and smartphone usage metrics (e.g., phoneuse*_1).
- Cultural Tastes: Music genres, movies, books, and artistic preferences (e.g., bigband_1, classicrock_1, jazz_1, rap_1).
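The wave-suffix convention makes it straightforward to split a column name into a variable stem and a wave number. The sketch below assumes the suffix is always a single digit from 1 to 6 after the final underscore; the helper names are hypothetical, and jazz_3 is an illustrative column that may not exist in the data.

```python
import re
from collections import defaultdict

# Assumption: a wave suffix is "_" followed by one digit 1-6 at the end of the name.
WAVE_SUFFIX = re.compile(r"^(?P<stem>.+)_(?P<wave>[1-6])$")

def split_wave(column):
    """Return (stem, wave) for a suffixed column, or (column, None) otherwise."""
    match = WAVE_SUFFIX.match(column)
    if match is None:
        return column, None
    return match.group("stem"), int(match.group("wave"))

def group_by_stem(columns):
    """Map each variable stem to the sorted list of waves in which it appears."""
    waves = defaultdict(list)
    for column in columns:
        stem, wave = split_wave(column)
        if wave is not None:
            waves[stem].append(wave)
    return {stem: sorted(found) for stem, found in waves.items()}

# Illustrative column names only.
columns = ["egoid", "gender_1", "jazz_1", "jazz_3", "rap_1"]
print(split_wave("jazz_3"))    # ('jazz', 3)
print(group_by_stem(columns))  # {'gender': [1], 'jazz': [1, 3], 'rap': [1]}
```

Grouping by stem is a quick way to see which variables were asked in every wave and which appear only once, before reshaping the panel from wide to long format.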
1.4 Quiz Responses (Master)
In addition to the main demographic surveys, the study administered a series of quizzes to participants. These quizzes capture high-frequency assessments of student health, emotional well-being, political discussions, and feedback on their smartphone performance.
The raw quiz datasets originally identified participants by their 10-digit phone numbers. Each number has been mapped to the standard egoid identifier, and the phone numbers have then been dropped entirely to maintain anonymity.
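The mapping step can be pictured as a lookup followed by a column drop. The following is a sketch under stated assumptions, not the study's actual pipeline: the phonenumber field name in the quiz data, the q1 item, the lookup table, and the decision to skip unmatched rows are all hypothetical, and the phone numbers are made up.

```python
def anonymize_quiz_rows(rows, phone_to_egoid):
    """Replace each row's phone number with its egoid and drop the phone field.

    Rows whose phone number has no known egoid are skipped and counted
    (an assumption; the real pipeline may handle them differently).
    """
    cleaned = []
    unmatched = 0
    for row in rows:
        egoid = phone_to_egoid.get(row["phonenumber"])
        if egoid is None:
            unmatched += 1
            continue
        # Copy every field except the raw phone number.
        rest = {key: value for key, value in row.items() if key != "phonenumber"}
        cleaned.append({"egoid": egoid, **rest})
    return cleaned, unmatched

# Fabricated example values, for illustration only.
lookup = {"5550000001": 101}
raw_rows = [
    {"phonenumber": "5550000001", "q1": "yes"},
    {"phonenumber": "5550000002", "q1": "no"},
]
cleaned, unmatched = anonymize_quiz_rows(raw_rows, lookup)
print(cleaned)    # [{'egoid': 101, 'q1': 'yes'}]
print(unmatched)  # 1
```

Counting unmatched rows instead of silently discarding them gives a simple check that the lookup table covers every participant before the phone numbers are removed.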
The consolidated quiz dataset contains responses from 197 students across 146 specialized survey items, all of which are documented in the Searchable Variable Catalog.