Sequencing and annotation of diploid oat genomes and the investigation of Avena-specific nutrients
Cultivated hexaploid oat (Avena sativa) has held a significant place within the global crop community for centuries; although its cultivation has decreased over the past century, its nutritional benefits have garnered increased interest for human consumption. This dissertation reports the development of fully annotated, chromosome-scale assemblies for the extant progenitor species of the As- and Cp-subgenomes, Avena atlantica and Avena eriantha, respectively. The diploid Avena species serve as important genetic resources for improving common oat’s adaptive and food quality characteristics.

The A. atlantica and A. eriantha genome assemblies span 3.69 and 3.78 Gb, with N50 values of 513 and 535 Mb, respectively. Annotation of the genomes, using sequenced transcriptomes, identified ~50,000 gene models in each species, including 2,965 resistance gene analogs across both species. Analysis of these assemblies classified much of each genome (~83%) as repetitive sequence, including species-specific, centromere-specific, and telomere-specific repeats. LTR retrotransposons make up most of the classified elements. Genome-wide syntenic comparisons with other members of the Pooideae revealed orthologous relationships, while comparisons with genetic maps from common oat clarified subgenome origins for each of the 21 hexaploid linkage groups. The utility of the diploid genomes was demonstrated by identifying putative candidate genes for flowering time (HD3A) and crown rust resistance (Pc91). We also investigated the phylogenetic relationships among other A- and C-genome Avena species.

The genomes reported here are the first chromosome-scale assemblies reported for the tribe Poeae, subtribe Aveninae. Our analyses provide important insight into the evolution and complexity of common hexaploid oat, including subgenome origin, homoeologous relationships, and major intra- and intergenomic rearrangements. They also provide the annotation framework needed to accelerate gene discovery and plant breeding.

A pipeline was developed to identify species-specific genes and was applied to A. atlantica and A. eriantha. Various BLAST algorithms were used to compare gene sets from the Phytozome database, GenBank’s nonredundant database, and the two Avena species. A custom Python script was written to parse the output of these analyses. This pipeline identified 2,511 A. atlantica-specific and 3,043 A. eriantha-specific gene models, out of approximately 50,000 gene models in each species. A domain search was performed on these gene models as a first step in identifying their possible functions. Domains identified in both gene sets include metallothionein family 15, whose members include genes that phytoextract metals from soil and aid in stress and cold response, and eggshell protein signatures, which are found in glycine-rich cell wall structural proteins.

Finally, the relationship between oat and various human diseases was studied using the P2EP Knowledge Base. This study identified several relationships between oat consumption and human pathways that require further investigation, including those involving the HSD11B1, RANKL, PARK7, mTOR, ARID4B, and KMT5C genes. These genes all appear to be affected by oat consumption, but the details of those relationships remain unknown. Further understanding of these relationships could guide the prevention and treatment of heart conditions, diabetes, dermatitis, and cancer.
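
The species-specific gene filter described above can be illustrated with a minimal sketch. This is not the dissertation's script; it assumes BLAST results in the standard 12-column tabular format (`-outfmt 6`, with the e-value in the eleventh column) and an illustrative e-value cutoff of 1e-5, neither of which is stated in the abstract. The function names and gene IDs are hypothetical.

```python
import csv
import io


def queries_with_hits(blast_tabular, max_evalue=1e-5):
    """Return query IDs that have at least one hit at or below max_evalue.

    Assumes BLAST tabular output (-outfmt 6): column 1 is the query ID and
    column 11 is the e-value. The cutoff is an illustrative assumption.
    """
    hits = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


def species_specific(all_gene_ids, blast_tabular, max_evalue=1e-5):
    """Gene models with no significant hit in the comparison databases."""
    return set(all_gene_ids) - queries_with_hits(blast_tabular, max_evalue)


# Hypothetical example: geneA has a strong hit, geneB only a weak one,
# and geneC has no hit at all.
example = io.StringIO(
    "geneA\tsubj1\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-50\t300\n"
    "geneB\tsubj2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20\n"
)
print(sorted(species_specific(["geneA", "geneB", "geneC"], example)))
# prints ['geneB', 'geneC']
```

In practice, the same filter would be applied to the results from each comparison database in turn, keeping only gene models that lack a significant hit in all of them.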