"Student Perspectives on Software Used in an Introductory Statistical Computing Course"
Chelsea Snyder & Julia L. Sharp, Clemson University
Competence with statistical software packages is an asset to applicants seeking employment in applied statistics. Baglin and Da Costa (2013) discuss the connection between statistical literacy and competence with statistical software. With statistical computing skills, students can develop the practical ingenuity needed for real-world applications. Statistical computing is helpful not only for job preparation but also for engagement, because students are more engaged when they work with real-life data sets. In a study conducted by Neumann, Hood, and Neumann (2013), 63% of students interviewed indicated that using real data in their statistics class gave real-life relevance to what they were learning in class. Accordingly, colleges and universities are modifying the curricula of their statistics courses to include greater use of technology. Statistical computing courses are taught in a variety of formats and at both the undergraduate and graduate levels. At some colleges and universities, statistical computing courses focus on one software program, whereas at others, two or three statistical programs are introduced. The focus of these courses is either the programming aspect of the software or the data analysis component, in which existing software functions and packages are used (Broman, Caffo, Irizarry, Peng, and Ruczinski, 2004; Christian, 2011; Gentle, 2013; Hofmann, 2013; Kim, 2004; Maboudou, 2011; Paciorek, 2011; Peng, 2014; Shalizi, 2013).
We examine an introductory statistical computing course offered in the Clemson University Mathematical Sciences department that exposes both graduate and undergraduate students to several statistical software programs and LaTeX. The course content has traditionally focused on using SAS and R for importing data, data manipulation, basic descriptive statistics and graphical procedures, and inference for a single mean. Additionally, students learn to create a simple document in LaTeX composed of sections, tables, and figures. We investigate whether the software programs emphasized in this course provide students with learning experiences that best prepare them for statistical software use in their jobs and other coursework. We implemented two surveys to gather data pertaining to our study goals. Prior to taking the course, students were surveyed about their software proficiency and interest, as well as their exposure to and experience with computer science, databases, and LaTeX. After taking the course, students were asked about their statistical software use and proficiency, the usefulness of the software in their jobs, and their recommendations about which software packages to emphasize in future semesters. The pre-course survey was given only to students who took the course in 2011 and 2012; the post-course survey, however, was sent to students who had taken the course at any time since its inception in 2008.
Prior to taking the course, students indicated that they were more comfortable with Microsoft Excel, and used it more frequently, than other statistical computing programs. Graduate students reported using R and Minitab more than undergraduate students did. However, graduate students did not feel proficient with R and Minitab. Overall, those surveyed indicated a desire to learn SAS and R in the course. After taking the course, students felt most proficient with Microsoft Excel and SAS. Moreover, among the statistical computing programs, students currently use Microsoft Excel most often, but they indicated that learning SAS prepared them for its use in their current positions. Students recommended that Microsoft Excel, R, and SAS be taught in future semesters.
We conclude that students are best prepared for later coursework and jobs when instructors of the course emphasize SAS, R, and Microsoft Excel. Indeed, in 2009, 92 of the 100 largest companies worldwide used SAS software in some capacity (Lohr, 2009). Further, R is gaining popularity among statisticians (Vance, 2009). As a result, students would be well served to learn both R and SAS as they prepare for a future in applied statistics.