Achievement
Identification of protein-coding regions
Project
IGERT Plant System Biology Interdisciplinary Graduate Training Program
University
University of California at San Diego
(La Jolla, CA)
Research Achievements
Identifying all protein-coding elements in a genome is a fundamental goal of gene annotation. In previous work, IGERT student Natalie Castellana demonstrated that tandem mass spectrometry (MS) can be used to identify novel protein-coding regions in genomes and to improve gene annotation. Recently, the methods for identifying novel coding regions using MS and for producing a refined gene annotation were developed into a fully automated pipeline that runs on a computer cluster. The pipeline is available via a web interface and is currently being used for an annotation project in Zea mays. So far, Natalie Castellana and her colleagues have confirmed 17,579 known proteins and identified 8,099 peptides in the genome, leading to the discovery of 90 novel protein-coding genes and over 700 other gene refinements. The entire annotation pipeline is generalizable to any organism. This work would not have been possible without IGERT support.