Date of Completion
Mukul S. Bansal, Ion I. Mandoiu, Yufeng Wu
Field of Study: Computer Science and Engineering
Degree: Master of Science
Gene and sub-gene family evolution is usually represented in a framework where domain trees evolve inside one or more gene trees, each of which evolves inside a species tree. The Duplication-Transfer-Loss (DTL) reconciliation and the Domain-Gene-Species (DGS) reconciliation algorithms allow us to infer the evolutionary histories of a given set of species, genes, and protein domains. However, in the absence of biological data regarding the true evolutionary histories of these species, genes, and domains, we must rely on simulated data to validate the accuracy of these methods. Although numerous probabilistic simulation frameworks exist for gene family evolution, none of them account for certain important aspects of gene family evolution. Furthermore, no existing simulation framework can simulate sub-gene level events such as partial gene transfers and the evolution of domain families.
In this work, we modify an existing simulation framework to simulate both replacing and additive horizontal gene transfers, account for phylogenetic distance bias in choosing transfer recipients, and randomly select the location of gene birth in the species tree. In addition, we introduce the ability to simulate sub-gene level events such as partial gene transfers through the simulated evolution of protein domains within gene families.
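As a rough illustration of two of the features named above, the following Python sketch shows one way a simulator could bias the choice of transfer recipient by phylogenetic distance and label each transfer as additive or replacing. It is not taken from the thesis: the function names, the `alpha` and `p_replacing` parameters, and the exponential-decay weighting are all assumptions made for this example, and the actual framework may use a different weighting scheme.

```python
import math
import random


def recipient_probabilities(distances, alpha=1.0):
    """Map each candidate recipient lineage to a selection probability.

    distances: dict of lineage name -> phylogenetic distance from the donor.
    alpha: hypothetical bias strength; alpha = 0 gives a uniform choice.
    The exponential decay is an assumed form, not the thesis's definition.
    """
    weights = {name: math.exp(-alpha * d) for name, d in distances.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def choose_recipient(distances, alpha=1.0, rng=random):
    """Draw one recipient lineage using the distance-biased probabilities."""
    probs = recipient_probabilities(distances, alpha)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


def choose_transfer_type(p_replacing, rng=random):
    """Label a transfer 'replacing' with probability p_replacing, else 'additive'."""
    return "replacing" if rng.random() < p_replacing else "additive"


if __name__ == "__main__":
    # Hypothetical distances from a donor lineage to three contemporaneous lineages.
    example = {"B": 1.0, "C": 2.0, "D": 4.0}
    print(recipient_probabilities(example, alpha=1.0))
    print(choose_recipient(example, alpha=1.0), choose_transfer_type(0.5))
```

Under this assumed weighting, closer lineages are more likely to receive a transfer, and setting `alpha` to 0 recovers an unbiased, uniform choice of recipient.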
To demonstrate the utility of our new simulation framework, we systematically evaluate the accuracy of DTL reconciliation on simulated datasets that contain both additive and replacing transfers. Our results from this simulation study indicate that DTL reconciliation, which assumes that all transfers are additive, is surprisingly robust to the presence of replacing transfers.
Kundu, Soumya, "An Improved Probabilistic Simulation Framework for Gene Family Evolution" (2018). Master's Theses. 1260.
Mukul S. Bansal