Date of Completion

7-12-2018

Embargo Period

7-12-2018

Keywords

Oxford Nanopore Technology, Splicing, long-read sequencing, Dscam1, SIRV spike-in, direct RNA sequencing, third-generation sequencing, MinION sequencing, exon connectivity, full-length isoform abundance, combinatorial RNA editing, Ndae1 RNA editing, SK RNA editing

Major Advisor

Brenton R. Graveley

Associate Advisor

Gordon G. Carmichael

Associate Advisor

Blanka Rogina

Associate Advisor

Jeffrey Chuang

Associate Advisor

Zhengqing Ouyang

Field of Study

Biomedical Science

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

The central dogma states that the genetic information contained in DNA flows to RNA through the process of transcription, which in turn can result in protein synthesis through translation. Alternative splicing is a mechanism by which multiple mRNA isoforms are generated from a single gene. Ultracomplex genes, characterized by their ability to encode hundreds to thousands of isoforms, arise from a combination of multiple splicing events. Our understanding of alternative splicing has improved vastly in the past decade due to the advent of next-generation sequencing (NGS) technologies. NGS technologies are powerful: they have enabled scientists to measure the expression of genes and isoforms digitally, assemble genomes, and reconstruct transcriptomes, and have enabled clinicians to tailor treatments to an individual’s genetic makeup. While NGS technologies have many strengths, the short read lengths generated by these platforms limit their ability to study exon connectivity over long distances, so this information is often inferred through statistical means rather than measured directly. Additionally, repetitive regions in the genome represent a special case in which short reads have inherent difficulty joining two adjacent contigs into a scaffold. Third-generation sequencing technologies, characterized by their ability to generate ultra-long reads, can be used to address these limitations.

Here, I have used the Oxford Nanopore Technologies (ONT) MinION device to first demonstrate the utility of nanopore technology for sequencing long reads to identify exon connectivity, using the Drosophila Rdl, MRP, Mhc and Dscam1 genes. I extended this approach to sequence full-length cDNAs generated from SIRV spike-in RNA to determine the quantitative ability of the platform. These experiments demonstrate the ability of the ONT platform to deconvolute isoforms. By sequencing Drosophila ultracomplex genes, I also show that ONT can identify previously unannotated exons and RNA editing sites over long distances. Using direct RNA sequencing, I demonstrate the ability to sequence full-length Eno2 RNA molecules and show that a majority of the reads were full-length.
