A useful way of approaching a statistical problem is to ask whether the addition of some missing information would transform the problem into a standard form with a known solution. The EM algorithm (Dempster, Laird, and Rubin 1977), for example, exploits this idea to simplify computation. Occasionally it turns out that knowledge of the missing values is not needed to apply the standard approach. In such cases the following simple logical argument shows that any optimality properties of the standard approach in the full-information situation carry over immediately to the original limited-information situation: if a better estimator were available in the limited-information situation, it would also be available in the full-information situation, contradicting the optimality of the original estimator. This argument yields an immediate proof of optimality and often leads directly to derivations of other properties of the solution. The approach can be taught to graduate students and theoretically inclined undergraduates. This paper describes its application to an elementary proof of a result in linear regression, together with some extensions. The resulting derivations are simpler than the standard proofs and give more insight into certain equivalences among models.
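One way the argument might be formalized is sketched below; the notation ($\mathcal{E}_F$, $\mathcal{E}_L$, and the risk criterion $R$) is introduced here purely for illustration and is not part of the original development. Let $\mathcal{E}_F$ denote the class of estimators available when the missing information is observed, and let $\mathcal{E}_L \subseteq \mathcal{E}_F$ be the subclass of estimators that depend only on the originally observed data. Suppose $\hat{\theta} \in \mathcal{E}_L$ is optimal in the full-information situation, that is,
\[
  R(\hat{\theta}) \le R(\tilde{\theta}) \qquad \text{for all } \tilde{\theta} \in \mathcal{E}_F .
\]
Since every competitor $\tilde{\theta} \in \mathcal{E}_L$ also lies in $\mathcal{E}_F$, the same inequality holds over $\mathcal{E}_L$, so $\hat{\theta}$ is optimal in the limited-information situation as well.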
The CAUSE Research Group is supported in part by a member initiative grant from the American Statistical Association’s Section on Statistics and Data Science Education.