Date of Completion


Embargo Period



Model Selection, Missing Data, Multiple Linear Regression

Major Advisor

Ofer Harel

Associate Advisor

Dipak Dey

Associate Advisor

Nalini Ravishanker

Field of Study



Doctor of Philosophy



Model selection is a critical part of data analysis in applied research. Equally ubiquitous are incomplete data sets and the challenges of analyzing data with missing values. Using principled methods for model selection and for handling incomplete data is essential to achieving valid inferences. This dissertation examines model selection in the presence of incomplete data. The first major contribution is an examination of the reliability of several model selection criteria and a proposed functional form for the rate of correct selection. The second is the development of an F-test for multiply imputed data sets, with an application to a Signs of Suicide prevention study. Finally, this dissertation proposes a model selection criterion for use with incomplete data when the imputation model is fixed and known, and examines the effectiveness of this criterion through simulations.