Date of Completion

4-9-2019

Embargo Period

4-21-2020

Keywords

Baseline Hazard; Cox Model; Online Updating; Proportional Hazards; Survival Analysis

Major Advisor

Elizabeth D. Schifano

Co-Major Advisor

Jun Yan

Associate Advisor

HaiYing Wang

Field of Study

Statistics

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

Studies of the proportional hazards (PH) model for big survival data have focused mainly on speeding up computation and on selecting features from a huge number of covariates. Verifying the crucial PH assumption, however, has not been tackled for big data whose size exceeds a computer’s memory. This dissertation summarizes methodological developments in statistics that address diagnostics of the PH model, including the PH assumption, functional form, and outlying and/or influential observations. Specifically, an online updating approach with minimal storage requirements is proposed for the standard test statistic for the PH assumption. The test, and a variant based on the most recent data blocks, maintain their sizes when the PH assumption holds and have substantial power when it is violated in different ways. The dissertation also addresses the baseline hazard function of the PH model. It presents nonparametric methods that compare cumulative baseline hazard curves using profile monitoring techniques, and combines them with parametric methods to detect heterogeneity in data blocks.
