Call for papers: Teaching reproducibility and responsible workflow (Journal of Statistics and Data Science Education)

https://nhorton.people.amherst.edu/call_reproducibility.pdf

Modern statistics and data science use an iterative data analysis process to solve problems and extract meaning from data in a reproducible manner.

Models such as the PPDAC (Problem, Plan, Data, Analysis, Conclusion) cycle (https://dataschools.education/about-data-literacy/ppdac-the-data-problem-solving-cycle) have been widely adopted in pre-secondary classrooms.

The importance of the data analysis cycle has also been described in guidelines for statistics majors (https://www.amstat.org/asa/education/Curriculum-Guidelines-for-Undergraduate-Programs-in-Statistical-Science.aspx), undergraduate data science curricula (http://dstf.acm.org/), and data science courses (e.g., https://r4ds.had.co.nz/introduction.html).

The National Academies of Sciences, Engineering, and Medicine's 2018 "Data Science for Undergraduates" consensus study (https://nas.org/envisioningds) identified the importance of workflow and reproducibility as a component of the data acumen needed by our graduates.

The NASEM report reiterated that "documenting, incrementally improving, sharing, and generalizing such workflows are an important part of data science practice owing to the team nature of data science and broader significance of scientific reproducibility and replicability."

The report also noted that reproducibility and workflow raise important questions about the ethical conduct of science.

These reports identify the need for students to have multiple experiences with the entire data analysis cycle.

However, many challenges exist:

1. technologies are rapidly evolving

2. few faculty were trained in the use of these methods

3. best practices have not been clearly identified

4. insufficient vetted and inclusive curricular materials are available

5. student heterogeneity must be accounted for and participation broadened

6. many aspects of student understandings in this area are unknown

To highlight work in this important and developing area, the *Journal of Statistics and Data Science Education* is inviting submission of papers related to "Teaching reproducibility and responsible workflow" to appear in a forthcoming issue.

## Sample topics (non-exhaustive)

- Teaching workflows and workflow systems

- Fostering reproducible analysis

- Promoting reproducibility as a component of replicability and scientific conduct

- Developing and implementing documentation and code standards

- Incorporating source code (version) control systems

- Supporting collaboration

- Integrating ethics

- Conducting effective formative and summative assessment

Submissions at all levels of education (primary through graduate programs and continuing education) and disciplines (social sciences, digital humanities, and STEM) are encouraged.

## Timetable

- May 2021 (call for submissions)

- September 1, 2021 (call for reviewers)

- September 15, 2021 (deadline for submissions via the *Journal of Statistics and Data Science Education* submission site, https://mc.manuscriptcentral.com/ujse, where authors should select the "Teaching reproducibility and workflow" option)

- July 2022 (proposed publication date)

Papers received after September are in scope and will be considered as regular submissions.

## About the journal

The *Journal of Statistics and Data Science Education* is an open-access peer-reviewed journal with no author fees that is published by Taylor and Francis and the American Statistical Association.

Articles accepted for publication are promptly made available online and featured on the journal website (https://www.tandfonline.com/toc/ujse21/current).

Questions about submissions or the timeline?  Please contact Nicholas Horton (Amherst College, JSDSE Incoming Editor).

Nicholas Horton
Beitzel Professor of Technology and Society (Statistics and Data Science)
Amherst College