A composite genome approach to identify phylogenetically informative data from next-generation sequencing

Description

Background
Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time-consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results
In simulations, SISRS identified large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
Conclusions
SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.
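The idea of a variable site carrying phylogenetic signal, and of filtering by the amount of missing data, can be illustrated with a minimal sketch. This is not SISRS code: the function names, the `max_missing` parameter, and the use of the standard parsimony-informative criterion (at least two bases, each present in at least two taxa) are assumptions made for illustration only.

```python
from collections import Counter

# Characters treated as missing data (an assumption for this sketch).
MISSING = set("N-?")


def site_columns(alignment: dict) -> list:
    """Turn {taxon: aligned sequence} into a list of per-site column strings."""
    return ["".join(col) for col in zip(*alignment.values())]


def classify_site(column: str, max_missing: int = 0) -> str:
    """Classify one alignment column.

    Returns 'too_much_missing', 'invariant', 'singleton' (variable but not
    parsimony-informative), or 'informative'.
    """
    calls = [base for base in column.upper() if base not in MISSING]
    if len(column) - len(calls) > max_missing:
        return "too_much_missing"
    counts = Counter(calls)
    if len(counts) < 2:
        return "invariant"
    # Informative: at least two distinct bases each seen in two or more taxa.
    shared = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared >= 2 else "singleton"


if __name__ == "__main__":
    toy = {"a": "AAGN", "b": "AAGT", "c": "ACTT", "d": "ATTT"}
    for col in site_columns(toy):
        print(col, classify_site(col, max_missing=1))
```

Raising `max_missing` keeps more sites at the cost of sparser columns, which is one simple way to build datasets with different levels of missing data.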

Contributors: Schwartz, Rachel (Author) / Harkins, Kelly (Author) / Stone, Anne (Author) / Cartwright, Reed (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Human Evolution and Social Change (Contributor) / School of Life Sciences (Contributor)

Created: 2015-06-11

GRASP [Genomic Resource Access for Stoichioproteomics]: comparative explorations of the atomic content of 12 Drosophila proteomes

Description

Background
“Stoichioproteomics” relates the elemental composition of proteins and proteomes to variation in the physiological and ecological environment. To help harness and explore the wealth of hypotheses made possible under this framework, we introduce GRASP (http://www.graspdb.net), a public bioinformatic knowledgebase containing information on the frequencies of 20 amino acids and atomic composition of their side chains. GRASP integrates comparative protein composition data with annotation data from multiple public databases. Currently, GRASP includes information on proteins of 12 sequenced Drosophila (fruit fly) proteomes, which will be expanded to include increasingly diverse organisms over time. In this paper we illustrate the potential of GRASP for testing stoichioproteomic hypotheses by conducting an exploratory investigation into the composition of 12 Drosophila proteomes, testing the prediction that protein atomic content is associated with species ecology and with protein expression levels.
Results
Elements varied predictably along multivariate axes. Species were broadly similar, with the D. willistoni proteome a clear outlier. As expected, individual protein atomic content within proteomes was influenced by protein function and amino acid biochemistry. Evolution in elemental composition across the phylogeny followed less predictable patterns, but was associated with broad ecological variation in diet. Using expression data available for D. melanogaster, we found evidence consistent with selection for efficient usage of elements within the proteome: as expected, nitrogen content was reduced in highly expressed proteins in most tissues, most strongly in the gut, where nutrients are assimilated, and least strongly in the germline.
Conclusions
The patterns identified here using GRASP provide a foundation on which to base future research into the evolution of atomic composition in Drosophila and other taxa.
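As a rough illustration of the kind of quantity involved, the sketch below computes the mean number of side-chain nitrogen atoms per residue for a protein sequence. It is not taken from GRASP: the function name and the handling of unknown residue codes are assumptions, and only the per-residue nitrogen counts of the 20 standard amino acids are used.

```python
# Nitrogen atoms in the side chain of each standard amino acid (one-letter
# codes). Residues not listed here have no side-chain nitrogen.
SIDE_CHAIN_NITROGEN = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}


def side_chain_nitrogen_per_residue(protein: str) -> float:
    """Mean side-chain nitrogen atoms per residue of a protein sequence.

    Unknown letters (for example 'X') are counted as residues with zero
    side-chain nitrogen; that choice is an assumption of this sketch.
    """
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    total = sum(SIDE_CHAIN_NITROGEN.get(aa, 0) for aa in residues)
    return total / len(residues)


if __name__ == "__main__":
    print(side_chain_nitrogen_per_residue("MKR"))
```

Comparing this value between highly and weakly expressed proteins is one simple way to look for the reduced nitrogen content described in the Results.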

Contributors: Gilbert, James D. J. (Author) / Acquisti, Claudia (Author) / Martinson, Holly M. (Author) / Elser, James (Author) / Kumar, Sudhir (Author) / Fagan, William F. (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2013-09-04

Image-level and group-level models for Drosophila gene expression pattern annotation

Description

Background
Drosophila melanogaster has been established as a model organism for investigating developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images will seriously impede the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.
Results
We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. A spatial sparse coding approach is used to represent local image patches and is compared with the well-known bag-of-words (BoW) method. Three pooling functions (max pooling, average pooling, and Sqrt pooling, the square root of the mean of squared statistics) are employed to transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, undersampling is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
Conclusion
In our experiment, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
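The three pooling functions named in the Results can be sketched as follows. This is an illustration rather than the authors' implementation: the function name, the list-of-lists layout (one row of sparse codes per image patch), and the use of absolute values in max and average pooling are assumptions.

```python
import math
from typing import List, Sequence


def pool_sparse_codes(codes: Sequence[Sequence[float]], method: str = "max") -> List[float]:
    """Collapse per-patch sparse codes into one feature vector per image.

    codes:  one row per image patch, one column per dictionary atom.
    method: 'max', 'average', or 'sqrt' (root of the mean of squares).
    """
    if not codes:
        raise ValueError("need at least one patch")
    columns = list(zip(*codes))  # one tuple of coefficients per atom
    if method == "max":
        return [max(abs(v) for v in col) for col in columns]
    if method == "average":
        return [sum(abs(v) for v in col) / len(col) for col in columns]
    if method == "sqrt":
        return [math.sqrt(sum(v * v for v in col) / len(col)) for col in columns]
    raise ValueError(f"unknown pooling method: {method}")


if __name__ == "__main__":
    patches = [[0.0, 3.0], [-4.0, 0.0]]
    for name in ("max", "average", "sqrt"):
        print(name, pool_sparse_codes(patches, name))
```

Whatever the number of patches, each method returns a vector whose length equals the dictionary size, which is the sense in which pooling reduces feature dimension.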

Contributors: Sun, Qian (Author) / Muckatira, Sherin (Author) / Yuan, Lei (Author) / Ji, Shuiwang (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2013-12-03

Learning Sparse Representations for Fruit-Fly Gene Expression Pattern Image Annotation and Retrieval

Description

Background
Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords.
Results
In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes.
Conclusions
We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features leads to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yield better results.

Contributors: Yuan, Lei (Author) / Woodard, Alexander (Author) / Ji, Shuiwang (Author) / Jiang, Yuan (Author) / Zhou, Zhi-Hua (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / Ira A. Fulton Schools of Engineering (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2012-05-23

A mesh generation and machine learning framework for Drosophila gene expression pattern image analysis

Description

Background
Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by the unique combination of expressed gene products as a result of spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate the complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism fruit fly Drosophila melanogaster. Existing qualitative methods, enhanced by quantitative analysis based on the computational tools we present in this paper, would provide promising ways to address key scientific questions.
Results
We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for the embryonic shape variations, we develop a mesh generation method to deform a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/.
Conclusions
Our mesh generation and machine learning methods and tools improve upon the flexibility, ease-of-use and accuracy of existing methods.

Contributors: Zhang, Wenlu (Author) / Feng, Daming (Author) / Li, Rongjian (Author) / Chernikov, Andrey (Author) / Chrisochoides, Nikos (Author) / Osgood, Christopher (Author) / Konikoff, Charlotte (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ji, Shuiwang (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2013-12-28

Rapid evolution of BRCA1 and BRCA2 in humans and other primates

Description

Background
The maintenance of chromosomal integrity is an essential task of every living organism and cellular repair mechanisms exist to guard against insults to DNA. Given the importance of this process, it is expected that DNA repair proteins would be evolutionarily conserved, exhibiting very minimal sequence change over time. However, BRCA1, an essential gene involved in DNA repair, has been reported to be evolving rapidly despite the fact that many protein-altering mutations within this gene convey a significantly elevated risk for breast and ovarian cancers.
Results
To obtain a deeper understanding of the evolutionary trajectory of BRCA1, we analyzed complete BRCA1 gene sequences from 23 primate species. We show that specific amino acid sites have experienced repeated selection for amino acid replacement over primate evolution. This selection has been focused specifically on humans and our closest living relatives, chimpanzees (Pan troglodytes) and bonobos (Pan paniscus). After examining BRCA1 polymorphisms in 7 bonobo, 44 chimpanzee, and 44 rhesus macaque (Macaca mulatta) individuals, we find considerable variation within each of these species and evidence for recent selection in chimpanzee populations. Finally, we also sequenced and analyzed BRCA2 from 24 primate species and find that this gene has also evolved under positive selection.
Conclusions
While mutations leading to truncated forms of BRCA1 are clearly linked to cancer phenotypes in humans, there is also an underlying selective pressure in favor of amino acid-altering substitutions in this gene. We discuss the hypothesis that viruses drive this natural selection.

Contributors: Lou, Dianne I. (Author) / McBee, Ross M. (Author) / Le, Uyen Q. (Author) / Stone, Anne (Author) / Wilkerson, Gregory K. (Author) / Demogines, Ann M. (Author) / Sawyer, Sara L. (Author) / College of Liberal Arts and Sciences (Contributor) / School of Human Evolution and Social Change (Contributor) / School of Life Sciences (Contributor)

Created: 2014-07-11

Support for the reproductive ground plan hypothesis of social evolution and major QTL for ovary traits of Africanized worker honey bees (Apis mellifera L.)

Description

Background
The reproductive ground plan hypothesis of social evolution suggests that reproductive controls of a solitary ancestor have been co-opted during social evolution, facilitating the division of labor among social insect workers. Despite substantial empirical support, the generality of this hypothesis is not universally accepted. Thus, we investigated the prediction of particular genes with pleiotropic effects on ovarian traits and social behavior in worker honey bees as a stringent test of the reproductive ground plan hypothesis. We complemented these tests with a comprehensive genome scan for additional quantitative trait loci (QTL) to gain a better understanding of the genetic architecture of the ovary size of honey bee workers, a morphological trait that is significant for understanding social insect caste evolution and general insect biology.
Results
Back-crossing hybrid European x Africanized honey bee queens to the Africanized parent colony generated two study populations with extraordinarily large worker ovaries. Despite the transgressive ovary phenotypes, several previously mapped QTL for social foraging behavior demonstrated ovary size effects, confirming the prediction of pleiotropic genetic effects on reproductive traits and social behavior. One major QTL for ovary size was detected in each backcross, along with several smaller effects and two QTL for ovary asymmetry. One of the main ovary size QTL coincided with a major QTL for ovary activation, explaining 3/4 of the phenotypic variance, although no simple positive correlation between ovary size and activation was observed.
Conclusions
Our results provide strong support for the reproductive ground plan hypothesis of evolution in study populations that are independent of the genetic stocks that originally led to the formulation of this hypothesis. As predicted, worker ovary size is genetically linked to multiple correlated traits of the complex division of labor in worker honey bees, known as the pollen hoarding syndrome. The genetic architecture of worker ovary size presumably consists of a combination of trait-specific loci and general regulators that affect the whole behavioral syndrome and may even play a role in caste determination. Several promising candidate genes in the QTL intervals await further study to clarify their potential role in social insect evolution and the regulation of insect fertility in general.

Contributors: Graham, Allie M. (Author) / Munday, Michael D. (Author) / Kaftanoglu, Osman (Author) / Page, Robert (Author) / Amdam, Gro (Author) / Rueppell, Olav (Author) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2011-04-13

Bacterial Expression, Correct Membrane Targeting, and Functional Folding of the HIV-1 Membrane Protein Vpu Using a Periplasmic Signal Peptide

Description

Viral protein U (Vpu) is a type-III integral membrane protein encoded by Human Immunodeficiency Virus-1 (HIV-1). It is expressed in infected host cells and plays several roles in viral progeny escape from infected cells, including down-regulation of CD4 receptors. However, key structure/function questions remain regarding the mechanisms by which the Vpu protein contributes to HIV-1 pathogenesis. Here we describe the expression of Vpu in bacteria, along with its purification and characterization. We report the successful expression of PelB-Vpu in Escherichia coli using the leader peptide pectate lyase B (PelB) from Erwinia carotovora. The protein was detergent extractable and could be isolated in a very pure form. We demonstrate that the PelB signal peptide successfully targets Vpu to the cell membranes and inserts it as a type I membrane protein. PelB-Vpu was biophysically characterized by circular dichroism and dynamic light scattering experiments and was shown to be an excellent candidate for elucidating structural models.

Contributors: Deb, Arpan (Author) / Johnson, William (Author) / Kline, Alexander (Author) / Scott, Boston (Author) / Meador, Lydia (Author) / Srinivas, Dustin (Author) / Martin Garcia, Jose Manuel (Author) / Dorner, Katerina (Author) / Borges, Chad (Author) / Misra, Rajeev (Author) / Hogue, Brenda (Author) / Fromme, Petra (Author) / Mor, Tsafrir (Author) / ASU Biodesign Center Immunotherapy, Vaccines and Virotherapy (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Biodesign Institute (Contributor) / School of Molecular Sciences (Contributor) / Applied Structural Discovery (Contributor) / Personalized Diagnostics (Contributor)

Created: 2017-02-22

Effect of woody-plant encroachment on livestock production in North and South America

Description

A large fraction of the world's grasslands and savannas are undergoing a rapid shift from herbaceous to woody-plant dominance. This land-cover change is expected to lead to a loss in livestock production (LP), but the impacts of woody-plant encroachment on this crucial ecosystem service have not been assessed. We evaluate how tree cover (TC) has affected LP at large spatial scales in rangelands of contrasting social–economic characteristics in the United States and Argentina. Our models indicate that in areas of high productivity, a 1% increase in TC results in a reduction in LP ranging from 0.6 to 1.6 reproductive cows (Rc) per km². Mean LP in the United States is 27 Rc per km², so a 1% increase in TC results in a 2.5% decrease in mean LP. This effect is large considering that woody-plant cover has been described as increasing at 0.5% to 2% per year. In contrast, in areas of low productivity, increased TC had a positive effect on LP. Our results also show that ecological factors account for a larger fraction of LP variability in Argentinean than in US rangelands. Differences in the relative importance of ecological versus nonecological drivers of LP in Argentina and the United States suggest that the valuation of ecosystem services between these two rangelands might be different. Current management strategies in Argentina are likely designed to maximize LP for various reasons we are unable to explore in this effort, whereas land managers in the United States may be optimizing multiple ecosystem services, including conservation or recreation, alongside LP.
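The relation between the absolute and relative figures above can be checked with a short calculation. The abstract does not state which slope underlies the 2.5% value, so the sketch below only divides the reported bounds by the reported US mean and back-calculates the loss implied by 2.5%; the function names are illustrative and not from the paper.

```python
def percent_decrease(loss_rc_per_km2: float, mean_lp_rc_per_km2: float) -> float:
    """Relative LP decrease (in percent) for an absolute loss in Rc per km^2."""
    return 100.0 * loss_rc_per_km2 / mean_lp_rc_per_km2


def implied_loss(percent: float, mean_lp_rc_per_km2: float) -> float:
    """Absolute loss in Rc per km^2 implied by a relative decrease in percent."""
    return percent / 100.0 * mean_lp_rc_per_km2


if __name__ == "__main__":
    mean_lp = 27.0  # mean US livestock production, Rc per km^2 (from the abstract)
    low = percent_decrease(0.6, mean_lp)    # lower bound of the reported range
    high = percent_decrease(1.6, mean_lp)   # upper bound of the reported range
    print(f"relative decrease per 1% TC: {low:.1f}% to {high:.1f}%")
    print(f"loss implied by 2.5%: {implied_loss(2.5, mean_lp):.3f} Rc per km^2")
```

Under these assumptions the reported 2.5% corresponds to a loss of about 0.7 Rc per km² per 1% of added tree cover, which lies within the stated 0.6 to 1.6 range.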

Contributors: Anadon, Jose Daniel (Author) / Sala, Osvaldo (Author) / Turner II, B. L. (Author) / Bennett, Elena M. (Author) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Julie Ann Wrigley Global Institute of Sustainability (Contributor) / School of Sustainability (Contributor) / School of Geographical Sciences and Urban Planning (Contributor)

Created: 2014-09-02

A context-dependent alarm signal in the ant Temnothorax rugatulus

Description

Because collective cognition emerges from local signaling among group members, deciphering communication systems is crucial to understanding the underlying mechanisms. Alarm signals are widespread in the social insects and can elicit a variety of behavioral responses to danger, but the functional plasticity of these signals has not been well studied. Here we report an alarm pheromone in the ant Temnothorax rugatulus that elicits two different behaviors depending on context. When an ant was tethered inside an unfamiliar nest site and unable to move freely, she released a pheromone from her mandibular gland that signaled other ants to reject this nest as a potential new home, presumably to avoid potential danger. When the same pheromone was presented near the ants' home nest, they were instead attracted to it, presumably to respond to a threat to the colony. We used coupled gas chromatography/mass spectrometry to identify candidate compounds from the mandibular gland and tested each one in a nest choice bioassay. We found that 2,5-dimethylpyrazine was sufficient to induce rejection of a marked new nest and also to attract ants when released at the home nest. This is the first detailed investigation of chemical communication in the leptothoracine ants. We discuss the possibility that this pheromone's deterrent function can improve an emigrating colony's nest site selection performance.

Contributors: Sasaki, Takao (Author) / Hoelldobler, Bert (Author) / Millar, Jocelyn G. (Author) / Pratt, Stephen (Author) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor) / Center for Social Dynamics and Complexity (Contributor)

Created: 2014-09-01
