Date of Completion

8-4-2019

Embargo Period

8-4-2022

Keywords

Tweedie model, zero-inflated model, CAR model, GP model, WAIC, LOO

Major Advisor

Dipak K. Dey

Associate Advisor

Victor Hugo Lachos Davila

Associate Advisor

Xiaojing Wang

Field of Study

Statistics

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

With nonnegative support and a discrete mass at zero, the Tweedie model has become a popular method for analyzing continuous data with a spike at zero, which are fairly common in insurance, biostatistics, epidemiology, economics, and geography. Several research articles have discussed other approaches to modeling data with a mass at zero, especially zero-inflated versions of exponential-family distributions, including the zero-inflated gamma model and the zero-inflated log-normal model. However, no systematic comparison has been conducted to select the best model for excess-zero data, which motivates us to propose a model comparison between the Tweedie model and other zero-inflated models under the Bayesian framework.

Moreover, the traditional Tweedie model does not take spatial patterns into account, although spatial correlation is common in this type of data. Motivated by this issue, we propose the Tweedie model with a conditional autoregressive (CAR) prior and with a Gaussian process (GP) prior. By constructing the neighborhood system from the shape of the lattice, the CAR model determines the spatial dependence structure while retaining the discrete-index nature of lattice data. The GP prior, on the other hand, measures spatial correlation through the geographic distance between subregions.
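As a rough illustration of the two spatial structures contrasted above, the following sketch builds a neighborhood-based CAR precision matrix and a distance-based GP covariance matrix for a small regular lattice. It is not taken from the dissertation: the rook-neighbor rule, the proper-CAR form tau * (D - rho * W), the exponential covariance function, and all parameter values (rho, tau, sigma2, length_scale) are assumptions chosen for illustration.

```python
import numpy as np

def lattice_adjacency(n_rows, n_cols):
    """Rook-neighbour adjacency matrix W for an n_rows x n_cols lattice (assumed rule)."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:              # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < n_rows:              # neighbour below
                W[i, i + n_cols] = W[i + n_cols, i] = 1.0
    return W

def car_precision(W, rho=0.9, tau=1.0):
    """Proper CAR precision matrix tau * (D - rho * W), with D = diag(row sums of W)."""
    D = np.diag(W.sum(axis=1))
    return tau * (D - rho * W)

def gp_exponential_cov(coords, sigma2=1.0, length_scale=1.0):
    """Exponential covariance sigma2 * exp(-d / length_scale) from pairwise distances d."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma2 * np.exp(-dist / length_scale)

# 3 x 3 lattice: the CAR prior uses only who neighbours whom ...
W = lattice_adjacency(3, 3)
Q = car_precision(W)

# ... while the GP prior uses distances between (hypothetical) subregion centroids.
coords = np.array([[r, c] for r in range(3) for c in range(3)], dtype=float)
K = gp_exponential_cov(coords)
```

The CAR precision matrix is sparse because only adjacent cells interact directly, whereas the GP covariance matrix is dense because every pair of subregions is linked through their distance.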

We model the mean of the Tweedie model on a logarithmic scale to connect the covariates and the unobserved random effects, which carry the spatial association and are subject to the CAR prior or the GP prior. A Bayesian approach to the proposed model is discussed to avoid the sophisticated mathematical calculations caused by the Tweedie distribution's complicated density function. A sensitivity analysis over different prior settings is presented to select the optimal priors for our models. Furthermore, we carry out a simulation study to assess the performance of our models with different spatial priors and apply the optimal model to insurance data on auto claim payments, in U.S. dollars, for 100,169 clients in Idaho. Based on the results, we conclude that the Tweedie model with the CAR prior performs better than the one with the GP prior at capturing the spatial pattern and the mass at zero.
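The log-scale mean structure described above can be sketched with a small simulation. This is a minimal sketch under stated assumptions, not the dissertation's implementation: it assumes a Tweedie power index p strictly between 1 and 2, for which the distribution has the standard compound Poisson-gamma representation, and it draws the random effects w as independent normals purely as a placeholder for a CAR or GP prior. The names beta, w, phi, and p and their values are hypothetical.

```python
import numpy as np

def simulate_tweedie(mu, phi, p, rng):
    """Draw Tweedie(mu, phi, p) variates for 1 < p < 2 via the compound Poisson-gamma form."""
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))      # Poisson rate (number of claims)
    shape = (2.0 - p) / (p - 1.0)                  # gamma shape of each claim size
    scale = phi * (p - 1.0) * mu ** (p - 1.0)      # gamma scale of each claim size
    counts = rng.poisson(lam)
    y = np.zeros_like(mu)                          # zero claims give an exact zero
    pos = counts > 0
    # A sum of N iid Gamma(shape, scale) variables is Gamma(N * shape, scale).
    y[pos] = rng.gamma(shape * counts[pos], scale[pos])
    return y

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)                 # one hypothetical covariate
beta = np.array([0.0, 0.5])            # hypothetical regression coefficients
w = 0.3 * rng.normal(size=n)           # placeholder random effects (not CAR or GP)

# Log link: log(mu_i) = beta_0 + beta_1 * x_i + w_i
mu = np.exp(beta[0] + beta[1] * x + w)

phi, p = 1.0, 1.5
y = simulate_tweedie(mu, phi, p, rng)

lam = mu ** (2.0 - p) / (phi * (2.0 - p))
zero_fraction = np.mean(y == 0)        # empirical share of exact zeros
expected_zero = np.mean(np.exp(-lam))  # P(Y_i = 0) = exp(-lambda_i), averaged over i
```

Under this representation the probability of an exact zero is exp(-lambda), so the empirical share of zeros should sit close to the average of that quantity, and the sample mean of y should sit close to the average of mu.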
