
Survival Analysis of iFixit's Online Question and Answer Forum

Community-driven online question and answer (CQA) forums are becoming increasingly valuable sources of information. These platforms house a large body of crowdsourced knowledge in the form of thousands of questions and answers posted every day. Some forums cover a broad range of topics, like Yahoo! Answers, while others focus on a specific topic, like the programming-focused Stack Overflow. An example of the latter is iFixit’s Answers forum.

The Performance of Model Averaging Relative to Individual Models for Testing Hormesis

In cancer studies, hormesis is a phenomenon in which low doses of a carcinogen reduce the risk of cancer while high doses increase it. Several models exist to test for hormesis; however, some are not flexible enough to detect it. Our research objective was to compare five individual models as well as the method of model averaging (MA).

Golf Handicapping Analysis

Each hole on a golf course is assigned a handicap that has both a technical meaning and a perception of difficulty. Handicap is based in large part on average scoring difference between strong and poor golfers, but it is not necessarily the same as “difficulty”. Handicapping is important to the game of golf, as it affects certain formats of tournament play. Hole difficulty is also important to players as they plan their shots. As a summer research project supported by the Northern Kentucky University UR-STEM program, we investigated factors that play into golf courses’ handicap rating for e

Predicting Flight Delays using the H2O Machine Learning Platform in R

This study aimed to predict departure delays from 2008 to 2016 from year, carrier, air time, distance, week, and season, using the H2O machine learning and deep learning platform in R. Three different models were set up. A simple logistic regression was used to predict departure delays of more than 30 minutes from year, carrier, air time, distance, week, and season.
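
To make the logistic regression setup concrete, here is a minimal sketch in plain Python rather than the H2O platform in R that the study used. It shows how a binary "delayed more than 30 minutes" target could be built and how a fitted logistic model turns predictors into a probability. The function names, the numeric encoding of the predictors, and any coefficient values passed in are hypothetical and are not taken from the study.

```python
import math

# Threshold stated in the abstract: a departure counts as delayed
# when the delay exceeds 30 minutes.
DELAY_THRESHOLD_MIN = 30


def is_delayed(dep_delay_minutes: float) -> int:
    """Binary target: 1 if the departure delay exceeds the threshold, else 0."""
    return 1 if dep_delay_minutes > DELAY_THRESHOLD_MIN else 0


def predict_probability(features, coefficients, intercept):
    """Logistic model: P(delayed) = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).

    `features` is a hypothetical numeric encoding of the predictors
    (year, carrier, air time, distance, week, season); the coefficients
    would come from fitting the model, which is not shown here.
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))
```

Categorical predictors such as carrier and season would need to be encoded numerically (for example, as indicator variables) before being passed to a model of this form.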

Randomized Gompertzian Growth Models for Solid Rat Tumors

This project is concerned with parameter estimation for growth curves of implanted tumors in rats. The Gompertz curve, an asymmetric sigmoid, and its corresponding first-order nonlinear differential equation were used as the model. The model has three parameters: r, a growth rate; K, the saturation value; and the variance parameter of the noise term. The data consisted of measurements taken every 7 days up to day 77 on two groups of rats, one with an initial tumor size of 63 mg and the other 108 mg. This is a subset of the data used in Simpson-Herren and Lloyd, Cancer Chemotherapy Reports 54:143-174 (1970).
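
As an illustration of the deterministic part of such a model, the sketch below evaluates the closed-form Gompertz curve under one common parameterization, dN/dt = r N ln(K/N). The abstract does not state the exact parameterization or the form of the noise term, so both are left as assumptions here; the noise term is omitted, and the values of r and K are hypothetical, chosen only so the example runs. The schedule is assumed to start at day 0.

```python
import math


def gompertz_size(t, n0, r, k):
    """Closed-form solution of dN/dt = r * N * ln(K / N) with N(0) = n0.

    N(t) = K * exp(ln(n0 / K) * exp(-r * t))
    """
    return k * math.exp(math.log(n0 / k) * math.exp(-r * t))


# Hypothetical parameter values, for illustration only (not estimates
# from the study).
R_RATE = 0.05    # growth rate, per day
K_SAT = 10000.0  # saturation value, in mg

# Weekly measurements up to day 77, assumed to start at day 0.
measurement_days = list(range(0, 78, 7))

# The two initial tumor sizes given in the abstract, in mg.
for n0 in (63.0, 108.0):
    sizes = [gompertz_size(t, n0, R_RATE, K_SAT) for t in measurement_days]
    print(n0, [round(s, 1) for s in sizes])
```

In this parameterization the curve starts at the initial size, rises monotonically, and levels off at K, which is what makes K interpretable as a saturation value.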

A Bayesian Model for the Prediction of United States Presidential Elections

Using a combination of polling data and previous election results, Nate Silver successfully predicted the Electoral College distribution in the presidential election in 2008 with 98% accuracy and in 2012 with 100% accuracy. This study applies a Bayesian analysis of polls, assuming a normal posterior using a conjugate prior. The data were taken from the Huffington Post’s Pollster. States were divided into categories based on past results and current demographics. Each category used a different poll source for the prior.
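
The conjugate update behind a normal posterior can be sketched as follows. This assumes a normal prior on a candidate's vote share and a poll estimate with known sampling variance; the function names and the 50% winning threshold are illustrative assumptions, not details taken from the study.

```python
import math


def normal_posterior(prior_mean, prior_var, poll_mean, poll_var):
    """Normal-normal conjugate update with known poll variance.

    The posterior precision is the sum of the prior and poll precisions;
    the posterior mean is the precision-weighted average of the two means.
    """
    prior_precision = 1.0 / prior_var
    poll_precision = 1.0 / poll_var
    post_var = 1.0 / (prior_precision + poll_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + poll_precision * poll_mean)
    return post_mean, post_var


def win_probability(post_mean, post_var, threshold=0.5):
    """P(vote share > threshold) under the normal posterior."""
    z = (post_mean - threshold) / math.sqrt(2.0 * post_var)
    return 0.5 * (1.0 + math.erf(z))
```

When the prior and the poll are equally precise, the posterior mean falls halfway between them and the posterior variance is half of either; a more precise poll pulls the posterior closer to the poll result.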

Modeling the Development of World Records in Track and Field

Assuming that there is a limit in human performance and that there will eventually be a threshold in world records for every track and field event, analytical techniques were used to develop a model of world records over time. The researchers examined various statistical techniques and identified the Gompertz Curve as the model of best fit for predicting the human threshold limit. This presentation demonstrates how the model performs on data for the men’s and women’s 100, 200, and 400 meters, long jump and shot put.