# Regression

• ### Intuitive Biostatistics: Choosing a Statistical Test

This page provides a table for selecting an appropriate statistical method based on the type of data and the information desired from it. It also compares parametric and nonparametric tests, one-sided and two-sided p-values, paired and unpaired tests, Fisher's test and the Chi-square test, and regression and correlation. It comes from Chapter 37 of the textbook "Intuitive Biostatistics".
• ### Quote: Fiedler on Forecasting

Forecasting is very difficult, especially about the future. A quote of business economist Edgar R. Fiedler (1929 - 2003) found in "Across the Board", the magazine of The Conference Board, Inc. (June, 1977). The quote also appears in "Statistically Speaking: A dictionary of quotations" compiled by Carl Gaither and Alma Cavazos-Gaither.
• ### Joke: Weighing Trees

A quick pun about the "log scale" by Bruce White.
• ### Dataset Example: Forecasting Computer Usage

This article presents a dataset containing actual monthly data on computer usage in Best Buy stores from August 1996 to July 2000. This dataset can be used to illustrate time-series forecasting, causal forecasting, simple linear regression, unequal error variances, and variable transformation. Key Words: Model-building; Seasonal Variation.
• ### Teaching Statistics with Data of Historic Significance: Galileo's Gravity and Motion Experiments

This article describes Galileo's data on falling bodies and projectiles and its use as an aid in teaching polynomial and nonlinear regression. Key Words: Independent and dependent variables; Graphical analysis.
• ### Dataset Example: A Dataset that is 44% Outliers

This article describes a dataset of days in office of US Presidents with outliers that are not mistakes or unusually high or low observations. The data illustrate that outliers need not be errors but could be particularly interesting cases and that data displays may differ in their ability to reveal interesting data structure. Key Words: Inliers; Interpretation in context.
• ### Random Number Generator

This random number generator produces a data table with up to 10 columns and up to 2500 rows. For random integers, users must specify the data range. For data from a Normal (Gaussian) distribution, users specify mean and standard deviation.

• ### Benford's Law Part 2 - The 80/20 Rule or Pareto Principle

This page explores Benford's law and the Pareto principle (or 80/20 rule). Benford's law may also have a wider meaning if the digits it evaluates are considered ranks or places. Each digit's probability of occurring could be considered the relative share of total winnings for each place (1st through 9th). In other words, 1st place would win 30.1%, 2nd place 17.6%, 3rd place 12.5%, and so on down to 9th place with 4.6% of the available rewards. The normalized Benford curve could be used as a model for ranked data such as the wealth of individuals in a country. To determine whether the Benford model gives results similar to those of the Pareto principle, the page uses the normalized Benford equation in a computer program.
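
The shares quoted above can be checked with a short sketch. It assumes the standard first-digit form of Benford's law, P(d) = log10(1 + 1/d); the page's own "normalized Benford equation" and its program are not reproduced here, and the function name `benford_shares` is only illustrative.

```python
import math


def benford_shares():
    """Return the Benford first-digit probability for each digit (place) 1..9.

    Assumption: the standard form P(d) = log10(1 + 1/d), not the page's
    own normalized variant.
    """
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


shares = benford_shares()

# Print each place's share of the total, as a percentage.
for place, share in shares.items():
    print(f"place {place}: {share:.1%}")

# The nine shares form a complete distribution, so they sum to 1.
print(f"total: {sum(shares.values()):.3f}")
```

Because the nine probabilities telescope to log10(10), they sum to exactly 1, which is what allows them to be read as shares of a fixed pool of rewards.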
• ### The Awesome Power of Twenty Questions

This page shows how elements of a system can be eliminated as causes when troubleshooting a problem. The principles of twenty questions are frequently used in the business world to conduct computerized searches of massive databases. These are called binary searches and are among the fastest search methods available. To conduct a binary search, the data must first be sorted, for example numerically or alphabetically. The computer determines which half of the list contains the item. That half is divided in half again, and the process is repeated until the item is found or the list can no longer be divided. Problem solvers should avoid focusing on the cause and instead ask which elements of the system can be eliminated as causes.
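
The halving procedure described above can be sketched as follows. This is a minimal illustration, not the page's own code; the function name `binary_search`, the returned step counter, and the sample list are assumptions made for the example.

```python
def binary_search(sorted_items, target):
    """Search a sorted list by repeated halving.

    Returns (index, steps): the position of target (or -1 if absent)
    and the number of halving steps that were needed.
    """
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1, steps        # the list can no longer be divided


# The data must be sorted before searching (hypothetical sample data).
names = sorted(["Ronald", "Ada", "Karl", "Emmy", "Blaise", "Florence", "Carl"])
index, steps = binary_search(names, "Emmy")
print(index, steps)

# Each step discards half of the remaining items, so the number of
# steps grows only logarithmically with the size of the list.
big_index, big_steps = binary_search(list(range(1024)), 1000)
print(big_index, big_steps)
```

Each comparison eliminates half of the remaining candidates, which mirrors the troubleshooting advice: ask the question that rules out the most possibilities rather than guessing at the answer directly.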
• ### The Central Limit Theorem - How to Tame Wild Populations

Using a parameter, it is possible to represent a property of an entire population with a single number instead of millions of individual data points. There are a number of possible parameters to choose from, such as the median, mode, or interquartile range. Each is calculated in a different manner and illuminates the data from a different point of view. The mean is one of the most useful and widely used parameters and helps us understand populations. A population is simulated by generating 10,000 floating point random numbers between 0 and 10. Sample means are displayed in histograms and analyzed.
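
A simulation along these lines might look like the sketch below. The population of 10,000 uniform values between 0 and 10 comes from the description above; the sample size of 30, the 1,000 repeated samples, the fixed seed, and the text histogram are assumptions chosen for illustration and are not taken from the page.

```python
import random
import statistics


def simulate_sample_means(population_size=10_000, sample_size=30,
                          num_samples=1_000, seed=0):
    """Build a uniform population on [0, 10] and collect sample means.

    sample_size, num_samples, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    population = [rng.uniform(0, 10) for _ in range(population_size)]
    means = [statistics.fmean(rng.sample(population, sample_size))
             for _ in range(num_samples)]
    return population, means


def text_histogram(values, bin_width=0.25, scale=5):
    """Print a rough text histogram: one '#' per `scale` observations."""
    counts = {}
    for v in values:
        bin_start = int(v / bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    for bin_start in sorted(counts):
        print(f"{bin_start:5.2f} | {'#' * (counts[bin_start] // scale)}")


population, means = simulate_sample_means()
print("population mean:     ", round(statistics.fmean(population), 3))
print("mean of sample means:", round(statistics.fmean(means), 3))
print("population st. dev.: ", round(statistics.pstdev(population), 3))
print("st. dev. of means:   ", round(statistics.pstdev(means), 3))
text_histogram(means)
```

Although the population itself is flat, the histogram of sample means should be roughly bell-shaped and centered near the population mean, with a much smaller spread than the population.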