Research Topic: Assessment

What is the historical context for assessment?

Statistics instructors need to identify and assess the key learning objectives to be mastered in a course because students value what is assessed. Determining whether these objectives have been mastered is the shared aim of the many assessment tools that have appeared in the literature over the past decade. Although assessing the cognitive domain is important in its own right, the broader purpose of assessment is to assist learning, measure individual student achievement, and evaluate programs (Garfield et al., 2008).

When determining what to assess in an introductory statistics course, the conversation has homed in on three main constructs: statistical literacy, statistical reasoning, and statistical thinking. Although there is some ambiguity surrounding the precise definitions of these terms, it is almost universally accepted that statistics educators should strive to foster statistical literacy, reasoning, and thinking in their courses. Garfield et al. (2008) outline these three main constructs as follows:

  • Statistical Literacy: the ability to comprehend and communicate using the language of statistics.
    • Examples include: understanding basic statistical definitions and symbols, interpreting data representations 
  • Statistical Reasoning: the ability to reason with, and understand, statistical ideas. Requires a deeper level of understanding than statistical literacy.
    • Examples include: establishing connections between ideas, communicating statistical processes and phenomena
  • Statistical Thinking: the ability to think like a professional statistician. Requires a deeper level of understanding than statistical reasoning (and thus, statistical literacy).
    • Examples include: understanding why a certain statistical process/method is used instead of just how to use it, understanding the limitations of statistical processes, being able to communicate findings in the context of the problem

To effectively measure the amount of statistical literacy, reasoning, and/or thinking gained in an introductory statistics course, many of the assessment tools discussed hereafter use a pre- and post-test format. It is also commonplace for the author(s) of an assessment tool to examine its reliability, validity, and fairness in order to support its use.
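As a minimal sketch of how scores from a pre- and post-test format might be summarized, the Python example below computes each student's raw gain and a normalized gain (the share of the available improvement that was achieved). The normalized gain is one common summary and is an assumption here, not a method prescribed by the instruments discussed in this article. The scores, the maximum score of 40, and the function name are all hypothetical.

```python
from statistics import mean


def normalized_gain(pre, post, max_score):
    """Fraction of the possible improvement (max_score - pre) that was achieved.

    Returns None when the pre-test score is already at the maximum,
    because the gain is undefined in that case.
    """
    if pre >= max_score:
        return None
    return (post - pre) / (max_score - pre)


# Hypothetical (pre, post) scores out of 40 for five students.
MAX_SCORE = 40
scores = [(18, 26), (22, 30), (30, 35), (12, 20), (25, 24)]

raw_gains = [post - pre for pre, post in scores]
norm_gains = [normalized_gain(pre, post, MAX_SCORE) for pre, post in scores]
defined_gains = [g for g in norm_gains if g is not None]

print(f"mean raw gain: {mean(raw_gains):.2f}")
print(f"mean normalized gain: {mean(defined_gains):.3f}")
```

A raw gain treats every point of improvement equally, whereas a normalized gain accounts for how much room a student had to improve. Neither summary says anything about reliability, validity, or fairness, which still need to be examined separately.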

What might assessment look like in the classroom?

  • Using a validated assessment tool as part of formative assessment to gauge student understanding and/or guide future instruction
  • Using a validated assessment tool as part of summative assessment to determine final exam grades
  • Using a validated assessment tool in order to evaluate a department or program’s success in preparing students to think statistically
  • Implementing alternative assessments into a course, including, but not limited to, quizzes, (oral) exams, research projects, writing assignments, lab reports, minute papers, journal writing

What research on assessment has been done?

One of the key researchers in this field of study is Beth Chance, whose 2002 article revived the conversation on statistical thinking and how to effectively teach and assess this skill (Chance, 2002). In 2006, a group of researchers from the University of Minnesota, led by Joan Garfield and Bob delMas, developed the ARTIST (Assessment Resource Tools for Improving Statistical Thinking) project to provide assessment resources and items that would reliably assess statistical literacy, reasoning, and thinking. These researchers were responsible for developing multiple assessment tools: most notably, CAOS (Comprehensive Assessment of Outcomes in Statistics), but also SRA (Statistical Reasoning Assessment) and SCI (Statistics Concept Inventory). CAOS is a reputable assessment tool consisting of questions that assess thinking and reasoning (about variability), intended for students in an introductory statistics course. Based on results from a pilot study, these researchers determined that students do not demonstrate understanding of many of the fundamental learning objectives covered in the CAOS assessment, and thus that changes must be made to curriculum and instruction to better prepare students to think and reason effectively about statistics (delMas, Garfield, Ooms and Chance, 2007). These findings prompted the development of a series of assessment tools and re-ignited the conversation surrounding assessment.

The University of Minnesota (Educational Psychology department) has a long tradition of producing research related to measurement and assessment for an introductory statistics course. Thus, many of the researchers mentioned in this sphere are in some way connected to this program. In 2008, Garfield et al. published a chapter in their seminal book “Developing Students’ Statistical Reasoning: Connecting Research and Teaching Practice” devoted solely to the role of assessment in statistics education (Garfield et al., 2008). This chapter is frequently referenced in research surrounding assessment in the statistics classroom. Under the tutelage of Garfield (and delMas), Andy Zieffler contributed to the development of the MOST (Models of Statistical Thinking) assessment to assess students’ statistical thinking (Garfield, delMas and Zieffler, 2012), Laura Ziegler developed the BLIS (Basic Literacy in Statistics) assessment to assess students’ statistical literacy after taking an introductory statistics course that uses simulation-based inference (SBI) methods (Ziegler, 2014; Ziegler and Garfield, 2018), and Matthew Beckman developed the I-STUDIO assessment to measure statistics students’ cognitive transfer outcomes (Beckman, 2015). Furthermore, Garfield, along with Anelise Sabbag and Andy Zieffler, introduced the GOALS (Goals and Outcomes Associated with Learning Statistics) instrument in 2015, which was intended to assess students’ statistical reasoning abilities (Sabbag, Garfield and Zieffler, 2015).

Although most of the aforementioned assessment tools focus on assessing only one of the three main constructs, they can be used in conjunction with one another to assess all three (statistical literacy, reasoning, and thinking). There has also been recent interest in developing assessment tools that measure more than one construct simultaneously. On this front, Garfield, Sabbag, and Zieffler developed REALI (Reasoning and Literacy Instrument) to concurrently measure students’ statistical literacy and reasoning abilities in order to identify how these constructs might be related (Sabbag, Garfield and Zieffler, 2018). Moreover, since these assessment tools were primarily developed for use in an introductory statistics course, there is a clear need for researchers to develop instruments that can be used to assess statistical literacy, reasoning, and thinking in upper-level statistics courses. Many modern researchers, including Allison Theobold, feel that alternative assessment methods, such as research projects and oral exams, can be instrumental in fostering and assessing this type of thinking at any course level (Theobold, 2021). Assessment will continue to be a prominent area of research as statistics education researchers seek the most effective way to design instruction to enhance students’ overall learning experience. Since students tend to value what is assessed, statistics instructors must come to an agreement on how best to measure and assess what is valued.

What are some recent theses/dissertations dealing with assessment?

  • Ziegler, L. (2014). Reconceptualizing statistical literacy: Developing an assessment for the modern introductory statistics course (Unpublished doctoral dissertation). Retrieved from the University of Minnesota Digital Conservancy.
  • Beckman, M. (2015). Assessment of cognitive transfer outcomes for students of introductory statistics.

What are some ideas or research questions for starting to explore assessment?

  • Which assessment tool should be used to best model the relationship between the constructs of statistical literacy and reasoning?
  • Further differentiate between students of varying ability levels when analyzing these assessment tools
  • How do the results of these assessment tools differ when used with students in upper-level, as compared to introductory, statistics courses?
  • Construct, and validate, an instrument that contains questions of higher difficulty levels
  • Collect demographic information to determine if any Differential Item Functioning exists
  • Investigate the effect of the amount of SBI methods incorporated in the course on student performance (as measured using one of the validated assessment tools)
  • Administer assessment tools to pre-service teachers to determine level of preparedness to teach statistical literacy, reasoning, and thinking to students
  • Measure retention of statistical literacy, reasoning, and thinking skills
  • Identify effective ways to design assessments that properly measure student learning of course objectives
  • Collect longitudinal data that is not tied to the pre/post format. This might mean collecting data in the middle of the course, prior to the course beginning, or even months after the course has completed.

What are some recent assessment tools that have been developed (within the last 5-15 years)?

| Assessment Tool | Author | Description | Who to Contact to Gain Access |
| --- | --- | --- | --- |
| Basic Literacy in Statistics (BLIS) | Laura Ziegler (and Joan Garfield, Michelle Everson) | Questions pertaining to statistical literacy for students taking a college-level introductory statistics course whose curriculum uses some SBI methods | Email Laura Ziegler to obtain a Qualtrics form containing all of the BLIS assessment questions |
| Comprehensive Assessment of Outcomes in Statistics (CAOS) | Joan Garfield, Bob delMas, Ann Ooms, and Beth Chance | Questions pertaining to statistical thinking and reasoning for students taking a college-level introductory statistics course | Register online to obtain access to the CAOS assessment |
| Goals and Outcomes Associated with Learning Statistics (GOALS) | Joan Garfield, Anelise Sabbag, and Andy Zieffler | Questions pertaining to statistical reasoning for students taking a college-level introductory statistics course | |
| I-STUDIO | Matthew Beckman (and Joan Garfield, Bob delMas) | Questions intended to measure statistics students' cognitive transfer outcomes | Available by request from the author or advisor(s) |
| Levels of Conceptual Understanding in Statistics (LOCUS) | Tim Jacobbe | Questions intended to measure the conceptual understanding and problem-solving skills of statistics students in grades 6-12, consistent with the GAISE guidelines | Email Tim Jacobbe with any questions regarding the assessment tool |
| Reasoning and Literacy Instrument (REALI) | Joan Garfield, Anelise Sabbag, and Andy Zieffler | Questions pertaining to statistical literacy and reasoning, concurrently, for students taking a college-level introductory statistics course | |

What are the best practices to follow when using an assessment tool?

  • Validity is not an intrinsic property of the instrument. Instead, validity refers to the appropriateness of specific uses of the scores from an instrument.
    • Just because an instrument has been “validated” does not mean it can be used without thought. Any changes to the instrument, the target population, or the intended use require documented evidence to support them.
  • As a general rule of thumb, be sure to use the full instrument, not individual items, when implementing an instrument in your classroom. 
    • The reliability, validity, and fairness measures that support the use of the instrument apply to the aggregate of items, not to the individual items.
  • Always ask for permission from the author(s) before using an instrument in your classroom.
    • When in doubt, it is always best to reach out to the author(s) of the instrument to receive permission for use and/or access to their instrument.
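To illustrate why reliability evidence belongs to the aggregate of items, the sketch below computes Cronbach's alpha, one widely used internal-consistency index, on a full set of items and again on a subset. Cronbach's alpha is chosen here as an assumption for illustration; the instruments in this article may report other reliability evidence. The response data are hypothetical, and `cronbach_alpha` is an illustrative helper rather than part of any instrument's materials.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-student lists of item scores.

    Every student must have a score for every item, there must be at
    least two items, and total scores must vary across students
    (otherwise the total-score variance is zero and alpha is undefined).
    """
    n_items = len(responses[0])
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variances = [pvariance(column) for column in item_columns]
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scored responses (1 = correct, 0 = incorrect):
# six students (rows) on a four-item instrument (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

alpha_full = cronbach_alpha(responses)
alpha_subset = cronbach_alpha([row[:2] for row in responses])  # first two items only

print(f"alpha, all four items: {alpha_full:.3f}")
print(f"alpha, first two items: {alpha_subset:.3f}")
```

In this made-up data set, the index changes when items are dropped, which is the practical reason a reliability estimate reported for a full instrument cannot simply be assumed to hold for a hand-picked subset of its items.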

What are the best practices to follow when developing a new instrument or assessment tool?

The instrument development procedure will look slightly different for each tool, depending on the principal investigator’s preferences and available resources. However, there are some generally agreed-upon components that should be included in any successful instrument development procedure (Beckman, 2015; Ziegler and Garfield, 2018; Sabbag, Garfield, and Zieffler, 2018; Bond et al., 2021).

  • Phase 1: Prototype/Blueprint Development
    • Identify the constructs to be measured and the population of interest
    • Develop and describe the framework to be used 
    • Outline the topics and items that will be included in the comprehensive assessment
    • Get initial feedback from statistics educators/expert reviewers
  • Phase 2: Develop the Assessment Tool
    • Write the items to be included in the comprehensive assessment, focusing more on conceptual understanding than computation
    • Have statistics educators/expert reviewers provide feedback on the items
    • Have students take the assessment preliminarily (coupled with interviews, if applicable), and use student responses to develop meaningful distractors and amend items
    • Revise the items on the assessment based on educator and student feedback obtained
    • (This phase can be repeated as many times as deemed necessary)
  • Phase 3: Pilot Study
    • Administer the assessment to the intended population (grade level, demographic, etc.)
    • Discuss evidence of reliability, validity, and fairness from the pilot study
    • Argue for its widespread usefulness based on the pilot study
  • Phase 4: Field Test
    • Administer the assessment to a much larger population in order to gather more data and conduct a more comprehensive analysis
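As a rough illustration of how student responses from a preliminary administration or pilot study might inform item revision, the sketch below tabulates, for each multiple-choice item, the proportion of students answering correctly and how often each option was chosen. This is only a simple descriptive item analysis under assumed data; the answer key, the responses, and the `item_summary` helper are hypothetical and do not come from any of the instruments cited above.

```python
from collections import Counter

# Hypothetical answer key for a three-item draft assessment.
ANSWER_KEY = ["B", "D", "A"]

# Hypothetical pilot responses: one list of selected options per student.
pilot_responses = [
    ["B", "D", "C"],
    ["B", "A", "C"],
    ["A", "D", "A"],
    ["B", "D", "C"],
    ["B", "C", "A"],
]


def item_summary(responses, key):
    """Return, per item, the proportion correct and the count of each option chosen."""
    summaries = []
    for index, correct_option in enumerate(key):
        choices = Counter(student[index] for student in responses)
        summaries.append(
            {
                "item": index + 1,
                "proportion_correct": choices[correct_option] / len(responses),
                "choices": dict(choices),
            }
        )
    return summaries


summaries = item_summary(pilot_responses, ANSWER_KEY)
for summary in summaries:
    print(summary)
```

An item where one incorrect option attracts most responses may point to a common misconception worth probing in interviews, while an option nobody selects may be a candidate for replacement with a more meaningful distractor.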

Key Articles (as cited throughout)

Early Research on Assessment

  • Chance, B. L. (2002). Components of statistical thinking and implications for instruction and assessment. Journal of Statistics Education, 10(3).
  • delMas, R., Garfield, J., Ooms, A., & Chance, B. (2007). Assessing students’ conceptual understanding after a first course in statistics. Statistics Education Research Journal, 6(2), 28-58. 
  • Garfield et al. (2008). Assessment in statistics education. In Developing Students' Statistical Reasoning: Connecting Research and Teaching Practice (Chapter 4). Springer.
  • Garfield, J., & delMas, R. (2010). A web site that provides resources for assessing students’ statistical literacy, reasoning and thinking. Teaching Statistics, 32(1), 2–7.
  • Garfield, J., & Franklin, C. (2011). Assessment of learning, for learning, and as learning in statistics education. In C. Batanero, G. Burrill, & C. Reading (Eds.), Teaching statistics in school mathematics-challenges for teaching and teacher education: A joint ICMI/IASE study (pp. 133–145). New York: Springer.

Middle Research on Assessment

  • Garfield, J., delMas, R., & Zieffler, A. (2012). Developing statistical modelers and thinkers in an introductory, tertiary-level statistics course. ZDM, 44(7), 883–898.
  • Ziegler, L. (2014). Reconceptualizing statistical literacy: Developing an assessment for the modern introductory statistics course (Unpublished doctoral dissertation). Retrieved from the University of Minnesota Digital Conservancy.
  • Beckman, M. (2015). Assessment of cognitive transfer outcomes for students of introductory statistics.
  • Sabbag, A. G., Garfield, J., & Zieffler, A. (2015). Quality assessments in statistics education: A focus on the GOALS instrument. In M. A. Sorto (Ed.), Advances in Statistics Education: Developments, Experiences, and Assessments. Proceedings of the Satellite Conference of the International Association for Statistical Education (IASE).
  • Whitaker, D., Foti, S., & Jacobbe, T. (2015). The Levels of Conceptual Understanding in Statistics (LOCUS) project: Results of the pilot study. Numeracy, 8(2), Article 3. DOI: 10.5038/1936-4660.8.2.3
  • Ziegler, L., & Garfield, J. (2018). Developing a statistical literacy assessment for the modern introductory statistics course. Statistics Education Research Journal, 17(2), 161-178.

Late Research on Assessment

  • Sabbag, A., Garfield, J., & Zieffler, A. (2018). Assessing statistical literacy and statistical reasoning: The REALI instrument. Statistics Education Research Journal, 17(2), 141-160.
  • Bond, M., Batakci, L., Kerby-Helm, A., Unfried, A., & Whitaker, D. (2021, July). SOMAS/DS: Measuring the learning environment, the instructor, and the student. Poster presented at the United States Conference on Teaching Statistics (USCOTS) 2021, Virtual.
  • Theobold, A. S. (2021). Oral exams: A more meaningful assessment of students’ understanding. Journal of Statistics and Data Science Education, 29(2), 156-159. DOI: 10.1080/26939169.2021.1914527

The CAUSE Research Group is supported in part by a member initiative grant from the American Statistical Association’s Section on Statistics and Data Science Education.