Abstract
Gene annotation is a critical step for understanding the functions of genomes and their host organisms. However, accurately annotating thousands of sequenced genomes can be a challenging task. Further, although gene transcriptional regulation is crucial for many important biological functions of cells, our understanding of the complexity of this process is still limited, even in prokaryotes. This dissertation addressed both of these problems. First, we developed a fast and scalable tool, PorthoMCL, for annotating gene functions by identifying orthologous genes in a large number of prokaryotic genomes. Using this tool, we have predicted orthologous genes in thousands of sequenced prokaryotic genomes for public use. Second, we systematically investigated the complexity of transcriptomes in E. coli K12 in response to a variety of environmental changes. By adopting a model for alternative splicing isoforms in eukaryotes, we revealed that ~22% of operons exhibited different forms of transcriptional units, i.e., alternative operon utilizations, and ~36% of operons displayed varying transcriptional levels of their genes, i.e., dynamic operon utilizations, at different growth phases and culture conditions. Moreover, by simultaneously profiling directional transcriptomes and proteomes of E. coli K12 cells, we found that a varying portion of genes had antisense RNA (asRNA) transcription in a growth phase- and culture condition-dependent manner. The detected asRNAs were generally short and overlapped previously identified asRNAs. Intriguingly, the correlation between genes’ protein levels and mRNA levels was disrupted by increased expression levels of asRNA relative to mRNA, suggesting that asRNA may play an important role in the regulation of gene expression.