Data scraping, ingestion, and modeling: bringing data from cars.com into the intro stats class


Tuesday, November 21st, 2017, 2:00 pm – 3:00 pm

Presented by: Nicholas Horton, Amherst College


Abstract

In this webinar, I will describe a classroom activity in which pairs of students hand-scrape data from cars.com, ingest these data into R, and then carry out analyses of the relationships between price, mileage, and model year for a selected type of car. This early-in-the-semester activity can help illustrate the statistical problem-solving process. The "Less Volume, More Creativity" approach used by the mosaic package facilitates the analysis with a minimal amount of syntax. Key concepts that are introduced and reinforced include data ingestion, multivariate thinking through graphical visualizations, and regression modeling. Extensions and additional uses of the dataset will be discussed, along with potential pitfalls.

