P2-22: Temperature of the Great Lakes

By Laura Ring Kapitula (Grand Valley State University)


The Great Lakes are the largest group of freshwater lakes on the planet and contain about 21% of the world’s surface fresh water. Data on the daily temperatures of the individual Great Lakes are stored on the internet in a collection of text files. These data were used for in-class activities and classroom demonstrations in an undergraduate statistical computing course, taught in a computer lab with 30 students per section. This Posters and Beyond session will illustrate how these data can be used in a statistics, statistical computing, or data science course to teach students how to use data that are stored in a variety of text files on the internet to answer questions about the Great Lakes. We will show how these data can be used to teach reading web-based data, data concatenation and merging, working with dates, exploring different types of variability, descriptive statistics, data visualization, mapping, and using a data set to answer specific research questions. We will also illustrate how the Great Lakes surface temperature data can be merged with temperature and precipitation data for a city to the west of a Great Lake, both to explore relationships between lake temperatures and local weather and to show how to build predictive models. Participants in this Posters and Beyond session will become aware of this data set and gain computational tools for using it in the statistics classroom. Furthermore, we hope that this example inspires teachers to develop similar examples that are simple enough for lower-level students to understand but that use larger and more realistic data sets. The SAS® system will be used to illustrate the methods, but R code is also being developed. Sample computer code and classroom activities can be obtained by emailing the author.
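As a rough illustration of the concatenation, date-handling, and merging steps described above, the following is a minimal sketch in Python rather than the SAS code used in the session. The file layout (year, day of year, temperature), the sample values, the city weather records, and all function and variable names are hypothetical assumptions made for illustration; they are not the actual format of the Great Lakes files or the author's classroom code.

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical stand-ins for per-lake text files that would be read from the web.
# Assumed layout per line: year, day of year, surface temperature (deg C).
LAKE_FILES = {
    "Michigan": "2016 1 4.9\n2016 2 4.7\n2016 3 4.6\n",
    "Superior": "2016 1 3.8\n2016 2 3.7\n2016 3 3.6\n",
}

# Hypothetical daily weather for one city, keyed by ISO date.
CITY_WEATHER = (
    "date,air_temp_c,precip_mm\n"
    "2016-01-01,-1.5,0.0\n"
    "2016-01-02,-3.0,2.3\n"
)


def day_of_year_to_date(year, day):
    """Convert a (year, day-of-year) pair to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day - 1)


def read_lake(name, text):
    """Parse one lake's whitespace-delimited text into a list of records."""
    rows = []
    for line in text.splitlines():
        year, day, temp = line.split()
        rows.append({
            "lake": name,
            "date": day_of_year_to_date(int(year), int(day)),
            "surface_temp_c": float(temp),
        })
    return rows


def concatenate(files):
    """Stack the records from every lake file into one long list."""
    rows = []
    for name, text in files.items():
        rows.extend(read_lake(name, text))
    return rows


def read_weather(text):
    """Parse the city weather CSV into a dict keyed by date."""
    weather = {}
    for row in csv.DictReader(io.StringIO(text)):
        weather[date.fromisoformat(row["date"])] = {
            "air_temp_c": float(row["air_temp_c"]),
            "precip_mm": float(row["precip_mm"]),
        }
    return weather


def merge_on_date(lake_rows, weather):
    """Inner join: keep only lake records whose date has a weather record."""
    return [{**r, **weather[r["date"]]} for r in lake_rows if r["date"] in weather]


lake_rows = concatenate(LAKE_FILES)
merged = merge_on_date(lake_rows, read_weather(CITY_WEATHER))
for record in merged:
    print(record["lake"], record["date"], record["surface_temp_c"], record["air_temp_c"])
```

In a classroom setting, the in-memory strings would be replaced by text read from the actual files, and the merged records could then feed descriptive statistics, plots, or a simple predictive model.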