Datasets
Primary Datasets
Drug Consumption
Real data on 1,876 participants’ demographics, personality traits, and substance use. Used in Walkthrough #1 to predict cocaine use (Coke, classification).
California Housing
Real 1990 data on 2,000 blocks in California, covering houses, population, and location. Used in Walkthrough #2 to predict median house value (house_mdn_value, regression).
Additional Datasets
Airline Satisfaction
Real data on the satisfaction and experience of 10,000 customers of an airline. Can be used to predict satisfaction status (satifaction, classification).
Titanic Disaster
Real data on 1,309 passengers on the Titanic. Can be used to predict survival (survived, classification) or ticket price (fare, regression).
Water Potability
Real data on the potability and chemical properties of 2,011 water bodies. Can be used to predict safety to drink (Potability, classification).