Democratizing Data (Science): Empowering and Expanding Opportunities for Both Students and Educators

Rebecca Nugent (Carnegie Mellon University)


While "data science" is all over the press and LinkedIn job postings and is driving students in droves to our classes and degree programs, its full potential to expand opportunities for the broader population is still not being met, largely because the focus tends to be on acquiring technical skills and programming languages. While these are important and foundational to the discipline, the most influential and valuable asset continues to be people. Every decision, every action, every idea comes from a person with their own background knowledge, expertise, interests, behavioral tendencies, and personal experiences that not only enrich their statistical and data-centric work but imprint it with their own statistics and data science personality. Give 100 people the same data set and research question, and you're likely to get 100 (slightly? very?) different data analysis workflows. This is a good thing. Everyone can be a "data scientist".

What we do with or about this variation is the question. Can we empower everyone, truly everyone, to use data and statistics in their work and/or their lives by personalizing and adapting how we teach? By giving students the power to explore data on their own? By building a community of tools and content that empowers educators who have limited bandwidth and agency to make changes in their courses? How do we optimize and teach collaborative data science? What is the best way to build a data science team? How should we do that in the (virtual/hybrid) classroom?

In this talk, we'll give an overview of several data-related educational initiatives that have focused on understanding how different populations interact with data, statistics, and data science. These include ISLE, the Integrated Statistics (Subject) Learning Environment, a browser-based e-learning platform that allows students and educators to interact with data without programming requirements and to collaborate on open-ended data analysis problems. We'll also discuss recent data science-related studies and recommendations from the National Academy of Sciences, including where academia and educators are most needed. If time permits, we'll talk about some statistics and data science outreach initiatives that focus on broadening and deepening the opportunity pipeline for both students and educators, and we'll share some success stories and some of our many lessons learned.