# Data Management & Organization

• ### Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners

Probabilistic Risk Assessment (PRA) is a comprehensive, structured, and logical analysis method aimed at identifying and assessing risks in complex technological systems for the purpose of cost-effectively improving their safety and performance. NASA’s objective is to better understand and effectively manage risk, and thus more effectively ensure mission and programmatic success, and to achieve and maintain high safety standards at NASA. This PRA Procedures Guide, in the present second edition, is neither a textbook nor an exhaustive sourcebook of PRA methods and techniques. It provides a set of recommended procedures, based on the experience of the authors, that are applicable to different levels and types of PRA that are performed for aerospace applications.

• ### Using the Bootstrap Method for a Statistical Significance Test of Differences Between Summary Histograms

Dr. Kuan-Man Xu from the NASA Langley Research Center writes, "A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance."
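The general idea in the abstract above can be sketched in Python. This is a minimal illustration under stated assumptions, not the author's published procedure: it uses only the Euclidean distance, resamples individual histograms from the pooled ensembles to approximate the null hypothesis of no difference, and the function names (`summary_histogram`, `euclidean_distance`, `bootstrap_pvalue`) and the synthetic data are hypothetical.

```python
import numpy as np


def summary_histogram(hists):
    """Sum individual histograms (one per row) and normalize to unit total."""
    total = hists.sum(axis=0).astype(float)
    return total / total.sum()


def euclidean_distance(p, q):
    """Euclidean distance between two normalized histograms."""
    return float(np.sqrt(np.sum((p - q) ** 2)))


def bootstrap_pvalue(hists_a, hists_b, n_boot=2000, seed=0):
    """Estimate how often a distance at least as large as the observed one
    arises when both ensembles are drawn from the pooled set of histograms.

    Assumption: pooled resampling with replacement stands in for the null
    hypothesis that the two ensembles share one underlying distribution.
    """
    rng = np.random.default_rng(seed)
    observed = euclidean_distance(
        summary_histogram(hists_a), summary_histogram(hists_b)
    )
    pooled = np.vstack([hists_a, hists_b])
    n_a, n_b = len(hists_a), len(hists_b)

    exceed = 0
    for _ in range(n_boot):
        # Draw two bootstrap ensembles of the original sizes from the pool.
        idx_a = rng.integers(0, len(pooled), size=n_a)
        idx_b = rng.integers(0, len(pooled), size=n_b)
        d = euclidean_distance(
            summary_histogram(pooled[idx_a]), summary_histogram(pooled[idx_b])
        )
        if d >= observed:
            exceed += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Synthetic ensembles: 50 histograms each, 5 bins, 100 counts per histogram.
    rng = np.random.default_rng(42)
    ens_a = rng.multinomial(100, [0.4, 0.3, 0.15, 0.1, 0.05], size=50)
    ens_b = rng.multinomial(100, [0.05, 0.1, 0.15, 0.3, 0.4], size=50)
    distance, p_value = bootstrap_pvalue(ens_a, ens_b)
    print(f"distance={distance:.4f}, p-value={p_value:.4f}")
```

Swapping in another distance statistic, such as the Jeffries-Matusita or Kuiper distance named in the abstract, would only require replacing `euclidean_distance`; the resampling loop stays the same.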

• ### The Statistics of Visual Representation

This paper comes from researchers at the NASA Langley Research Center and College of William & Mary.

"The experience of retinex image processing has prompted us to reconsider fundamental aspects of imaging and image processing. Foremost is the idea that a good visual representation requires a non-linear transformation of the recorded (approximately linear) image data. Further, this transformation appears to converge on a specific distribution. Here we investigate the connection between numerical and visual phenomena. Specifically the questions explored are: (1) Is there a well-defined consistent statistical character associated with good visual representations? (2) Does there exist an ideal visual image? And (3) what are its statistical properties?"

• ### Using Computers for Statistical Analysis (NASA Activity)

This lesson introduces students to creating spreadsheets for statistical analysis.

• ### The White Glove Test: Discovering Dust in the Solar System (NASA Activity)

The Student Dust Counter (SDC) is an instrument aboard the NASA New Horizons mission to Pluto, launched in 2006. As it travels to Pluto and beyond, SDC will provide information on the dust that strikes the spacecraft during its 14-year journey across the solar system. These observations will advance our understanding of the origin and evolution of our own solar system, as well as help scientists study planet formation in dust disks around other stars.

In this lesson, students explore the SDC data interface to establish any trends in the dust distribution in the solar system. Students record the number of dust particles, "hits," recorded by the instrument and the average mass of the particles in a given region.

• ### Penn State STAT 506: Sampling Theory and Methods

The aim of this course is to cover sampling design and analysis methods that are useful for research and management in many fields. A well-designed sampling procedure ensures that we can summarize and analyze data with a minimum of assumptions and complications. The course suits both students and teachers looking for materials on this topic.

• ### Rice Virtual Lab Case Studies

Examples of real data/studies and their analyses and interpretation.

• ### Song: Throw That Out?

A song to help students identify factors to consider when deciding how outliers should be treated, as well as factors for deciding whether a study is worthwhile. Lyrics and music © 2016 by Greg Crowther. This song is part of an NSF-funded library of interactive songs in which students create responses to prompts that are then included in the lyrics (see www.causeweb.org/smiles for the interactive version of the song, a short reading covering the topic, and an assessment item).

• ### Data Science for Undergraduates: Opportunities and Options

As our economy, society, and daily life become increasingly dependent on data, work across nearly all fields is becoming more data driven, affecting both the jobs that are available and the skills that are required. At the request of the National Science Foundation, the National Academies of Sciences, Engineering, and Medicine set forth a vision for the emerging discipline of data science at the undergraduate level. The study committee considered the core principles and skills undergraduates should learn and discussed the pedagogical issues that must be addressed to build effective data science education programs. Data Science for Undergraduates: Opportunities and Options underscores the importance of preparing undergraduates for a data-enabled world and recommends that academic institutions and other stakeholders take steps to meet the evolving data science needs of students.

Watch the report release webinar here:  https://vimeo.com/269033724

• ### Analysis Tool: RStudio Cloud

RStudio Cloud makes it easy for professionals, hobbyists, trainers, teachers and students to do, share, teach and learn data science using R. Create analyses using RStudio directly from your browser: there is no software to install and nothing to configure on your computer. Share your projects, and access those of others, without worrying about data transfer or package installation. Each project defines its own environment, and RStudio Cloud automatically reproduces that environment whenever anyone accesses the project. It's easy to share analyses with the world, but it's also simple to collaborate with a select group in a private space. You control who can enter a space and, via roles, you have fine-grained control over what each user can do. There are also many learning materials available: interactive tutorials covering the basics of data science, cheatsheets for working with popular R packages, links to DataCamp courses, and a guide to using RStudio Cloud.