Date of Completion

11-25-2019

Embargo Period

5-23-2020

Keywords

Bayesian Network, Data Integration, Pathway Modeling

Major Advisor

Dong-Guk Shin

Associate Advisor

Lynn Kuo

Associate Advisor

Jinbo Bi

Associate Advisor

Sheida Nabavi

Associate Advisor

Charles Giardina

Field of Study

Computer Science and Engineering

Degree

Doctor of Philosophy

Open Access

Campus Access

Abstract

In this era of abundant biomedical data, this research takes genomics data (specifically, gene expression data) and compares it against previously known gene regulation relationships, which are typically organized into curated molecular pathways, in order to obtain more accurate and interpretable results. Pathway analysis research, however, is still in its infancy. Our proposed approach encodes each pathway route as a Bayesian Network initialized with a sequence of conditional probabilities designed to incorporate the directionality of regulatory relationships in the pathways, i.e., activation and inhibition. The dissertation includes the following sections.

1. Deep pathway analysis utilizing mutation and expression data. We propose a new way of analyzing biological pathways that combines transcriptome data with mutation information and uses the outcome to identify “routes” of aberrant pathways potentially responsible for the etiology of the disease.

2. Deep pathway analysis in algorithmic form and hyperparameter tuning. We extend the first approach so that it can recognize, from the given gene expression data, which portions of sub-pathways are actively utilized in the biological system being studied, without pre-defining the pathway routes.

3. Deep pathway analysis incorporating multi-omics data. In this chapter, we extend the pathway analysis method so that it can combine multiple heterogeneous omics data types in its analysis.

4. Deep pathway builder with a recurrent neural network. In this chapter, we construct a gene regulation network with a recurrent neural network (RNN), an approach that helps researchers extend existing pathways.

In summary, we present a series of methods that use pathway routes (as opposed to the whole pathway) as the unit of analysis to pinpoint perturbed signals from various omics data sets. We are optimistic that our methods can be applied to studying a wide array of biological systems. We also hope that the availability of our methods encourages wet-lab scientists to generate data sets that can be combined to derive a more accurate interpretation of biological systems.
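As an illustration of the route encoding described in the abstract, the following is a minimal sketch, not the dissertation's implementation. It assumes binarized gene states (1 = up, 0 = down), a single hypothetical consistency probability `P_CONSISTENT` shared by all edges, and a chain-shaped route. The function names and the value 0.9 are illustrative assumptions only.

```python
from math import log

# Hypothetical parameter: probability that a child gene follows the
# regulatory rule implied by its incoming edge. Not taken from the source.
P_CONSISTENT = 0.9


def cpt(edge_type, p=P_CONSISTENT):
    """Return P(child = up | parent state), keyed by parent state (0 or 1)."""
    if edge_type == "activation":
        # An active parent tends to switch the child on.
        return {1: p, 0: 1 - p}
    if edge_type == "inhibition":
        # An active parent tends to switch the child off.
        return {1: 1 - p, 0: p}
    raise ValueError(f"unknown edge type: {edge_type}")


def route_log_likelihood(states, edge_types, prior_up=0.5, p=P_CONSISTENT):
    """Log-likelihood of observed binary states along a chain-shaped route."""
    if len(edge_types) != len(states) - 1:
        raise ValueError("need exactly one edge type per consecutive gene pair")
    # The first gene in the route has no parent, so it uses a prior.
    total = log(prior_up if states[0] == 1 else 1 - prior_up)
    for parent, child, edge_type in zip(states, states[1:], edge_types):
        p_up = cpt(edge_type, p)[parent]
        total += log(p_up if child == 1 else 1 - p_up)
    return total


# Hypothetical route: A activates B, B inhibits C.
edges = ["activation", "inhibition"]
consistent = route_log_likelihood([1, 1, 0], edges)    # follows both rules
inconsistent = route_log_likelihood([1, 0, 0], edges)  # violates both rules
```

Under these assumptions, a route whose observed states agree with the annotated activation and inhibition edges scores higher than one that contradicts them, which is the intuition behind flagging perturbed routes.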
