Date of Completion

7-17-2018

Embargo Period

7-16-2018

Keywords

Bioinformatics, Computational Biology, Evolutionary Genomics, Algorithms, Graphs, Trees

Major Advisor

Yufeng Wu

Associate Advisor

Sanguthevar Rajasekaran

Associate Advisor

Ion Mandoiu

Associate Advisor

Mukul Bansal

Field of Study

Computer Science and Engineering

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

Phylogenetic networks are graphs that represent, abstractly or explicitly, the evolutionary relationships among genes, genomes, species, nucleotide sequences, chromosomes, and similar entities. Such networks show reticulation events such as hybridization, horizontal gene transfer, recombination, population admixture, and gene duplication. Phylogenetic trees are the special case of phylogenetic networks used when no such events occur.

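In this dissertation, we address three existing problems in phylogenetics: building hybridization networks from gene trees, inferring local genealogical trees in the presence of recombination, and inferring population demographic histories that include admixture events.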

First, we propose a heuristic method called PIRN_S to build near-optimal "hybridization networks" from a given set of phylogenetic trees (called gene trees) representing evolutionary history, such that the trees are "displayed" in the network. PIRN_S is more efficient for large numbers of gene trees than previous heuristics. It also produces more parsimonious results than a previous method on many simulated datasets as well as on a real biological dataset.

Second, we present a new approach called RENT+ for inferring local genealogical trees from haplotypes in the presence of recombination. RENT+ builds on a previous genealogy inference approach called RENT, which infers a set of related genealogical trees at different genomic positions. RENT+ significantly improves on RENT in that it extracts more effectively the information about the underlying genealogy contained in the haplotype data. The key components of RENT+ are several greatly enhanced genealogy inference rules. Through simulation, we show that RENT+ is more efficient and accurate than several existing genealogy inference methods. As an application, we use RENT+ to infer population demographic history from haplotypes, where it outperforms several existing methods.

Finally, we introduce a method called PopMix, which uses RENT+ and the approach used in PIRN_S to infer population demographic histories, including admixture events, from given population haplotypes. Through simulation, we show that PopMix infers more accurate admixture networks than an existing method by using the information in the underlying relations among nearby SNPs.
