Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms

Title: Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms
Author: Speiser, Daniel I; Pankey, M S; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H
Abstract:
Background: Tools for high-throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e., the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially those from non-model organisms. Phylogenetic analysis is one useful method for assigning identities to these sequences, but it tends to be time-consuming because trees must be re-calculated for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families.
Results: We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes onto our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository, and we demonstrate PIA on a publicly-accessible web server.
Conclusions: Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high-throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
Date: 2014-11-19

Files in this item

File                        Size      Format
s12859-014-0350-x.xml       204.7Kb   XML
s12859-014-0350-x.pdf       924.9Kb   PDF
s12859-014-0350-x-S1.xlsx   19.93Kb   Microsoft Excel 2007
s12859-014-0350-x-S2.pdf    489.6Kb   PDF
s12859-014-0350-x-S3.xlsx   12.36Kb   Microsoft Excel 2007
s12859-014-0350-x-S4.xlsx   11.96Kb   Microsoft Excel 2007
s12859-014-0350-x-S5.docx   15.88Kb   Microsoft Word 2007
