Date of Completion
computational biology, bioinformatics, scaffolding, genome assembly, biomarker selection, deconvolution
Field of Study
Computer Science and Engineering
Doctor of Philosophy
The problem of interpreting biological data is often cast into a mathematical optimization framework, where a large body of existing computational theory and practical techniques can be leveraged. While this strategy has been particularly successful in the bioinformatics domain, the massive datasets generated by high-throughput genomic technologies are challenging the scalability of even the most advanced mathematical optimization algorithms. Indeed, as the cost per base of DNA sequencing has dropped precipitously, even outpacing Moore's law, the size of many bioinformatics problems has grown beyond the limit of existing methods, necessitating new algorithms. This effect is felt even more acutely in the burgeoning field of single-cell biology, where advances in microfluidics have rapidly increased the ability of bench biologists to capture and sequence the genomes and transcriptomes of hundreds of cells per experiment.
This dissertation presents novel computational methods for three distinct biological problems: genome scaffolding, biomarker selection, and computational deconvolution of gene expression data from heterogeneous samples, assisted by single-cell expression data. Each method strives to balance computational efficiency with the biological relevance of computed solutions.
Lindsay, James, "Scalable Optimization Algorithms for High-throughput Genomic Data" (2015). Doctoral Dissertations. 754.