Date of Completion

4-23-2019

Embargo Period

4-23-2019

Keywords

Gene family evolution, gene-tree/species-tree reconciliation, non-binary trees, fixed-parameter tractable

Major Advisor

Mukul Bansal

Associate Advisor

Ion Mandoiu

Associate Advisor

Sheida Nabavi

Associate Advisor

Yufeng Wu

Field of Study

Computer Science and Engineering

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

Gene families evolve through complex evolutionary events such as speciation, gene duplication, horizontal gene transfer, gene loss, etc., and reconstructing these evolutionary histories is an important problem in evolutionary biology with many important applications. Duplication-Transfer-Loss (DTL) reconciliation is among the most effective and most popular methods for studying gene family evolution, especially in microbes. DTL reconciliation takes as input a gene tree and a species tree and reconciles the two by postulating gene duplication, transfer, and loss events, showing the evolution of that gene family inside the species tree. The DTL reconciliation problem has been extensively studied, but existing problem formulations and algorithms have several limitations that affect the accuracy and applicability of DTL reconciliation in practice. In this thesis, we focus on addressing two of the most important limitations. The first limitation is that existing algorithms assume a fixed, binary gene tree topology and therefore cannot account for uncertainty in gene tree topologies, a common occurrence in practice. The second limitation is that all transfer events are assumed to be “additive”, i.e., they introduce a new gene into the recipient genome. It is well known, however, that transfer events can also be “replacing”, i.e., they can replace an existing gene in the recipient genome. To address the first limitation, we devise an extension of DTL reconciliation to non-binary gene trees and show that the resulting problem is NP-hard. We then provide fixed parameter and other exact and heuristic algorithms for this problem, and demonstrate their impact in practice on real and simulated data. For the second limitation, we propose and develop a new extended reconciliation framework, called the DTRL reconciliation framework, which models both additive and replacing transfers, and show that the resulting computational problem is NP-hard.

COinS