Image-level and group-level models for Drosophila gene expression pattern annotation

Description

Background
Drosophila melanogaster has been established as a model organism for investigating developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images imposes a serious impediment to the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.
Results
We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. A spatial sparse coding approach is used to represent local image patches and is compared with the well-known bag-of-words (BoW) method. Three pooling functions (max pooling, average pooling, and Sqrt pooling, the square root of the mean of squared statistics) are employed to transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, undersampling is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
Conclusion
In our experiment, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
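
The three pooling functions named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the list-of-lists layout (one row of sparse codes per image patch, one column per dictionary atom), and the use of absolute values in max pooling are all assumptions made for the example.

```python
import math

def max_pool(codes):
    # Largest absolute coefficient per dictionary atom (absolute value is an assumption).
    return [max(abs(v) for v in column) for column in zip(*codes)]

def average_pool(codes):
    # Mean coefficient per dictionary atom.
    return [sum(column) / len(column) for column in zip(*codes)]

def sqrt_pool(codes):
    # Square root of the mean of squared coefficients per dictionary atom.
    return [math.sqrt(sum(v * v for v in column) / len(column)) for column in zip(*codes)]

# Hypothetical sparse codes: 2 patches, 2 dictionary atoms.
codes = [[0.0, 3.0],
         [4.0, 0.0]]

max_features = max_pool(codes)
average_features = average_pool(codes)
sqrt_features = sqrt_pool(codes)
```

Each function collapses a variable number of patch-level codes into one fixed-length vector per image, which is what allows the pooled output to serve as an image feature.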

Contributors: Sun, Qian (Author) / Muckatira, Sherin (Author) / Yuan, Lei (Author) / Ji, Shuiwang (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2013-12-03

GRASP [Genomic Resource Access for Stoichioproteomics]: comparative explorations of the atomic content of 12 Drosophila proteomes

Description

Background
“Stoichioproteomics” relates the elemental composition of proteins and proteomes to variation in the physiological and ecological environment. To help harness and explore the wealth of hypotheses made possible under this framework, we introduce GRASP (http://www.graspdb.net), a public bioinformatic knowledgebase containing information on the frequencies of 20 amino acids and atomic composition of their side chains. GRASP integrates comparative protein composition data with annotation data from multiple public databases. Currently, GRASP includes information on proteins of 12 sequenced Drosophila (fruit fly) proteomes, which will be expanded to include increasingly diverse organisms over time. In this paper we illustrate the potential of GRASP for testing stoichioproteomic hypotheses by conducting an exploratory investigation into the composition of 12 Drosophila proteomes, testing the prediction that protein atomic content is associated with species ecology and with protein expression levels.
Results
Elements varied predictably along multivariate axes. Species were broadly similar, with the D. willistoni proteome a clear outlier. As expected, individual protein atomic content within proteomes was influenced by protein function and amino acid biochemistry. Evolution in elemental composition across the phylogeny followed less predictable patterns, but was associated with broad ecological variation in diet. Using expression data available for D. melanogaster, we found evidence consistent with selection for efficient usage of elements within the proteome: as expected, nitrogen content was reduced in highly expressed proteins in most tissues, most strongly in the gut, where nutrients are assimilated, and least strongly in the germline.
Conclusions
The patterns identified here using GRASP provide a foundation on which to base future research into the evolution of atomic composition in Drosophila and other taxa.
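
The kind of per-protein quantities the abstract describes (amino acid frequencies and side-chain atomic content) can be sketched as below. This is an illustrative sketch only and does not use or reproduce any GRASP interface: the function names and the example sequence are hypothetical, and only nitrogen is tallied, using the standard side-chain nitrogen counts (Arg 3, His 2, Lys 1, Asn 1, Gln 1, Trp 1, all others 0).

```python
from collections import Counter

# Nitrogen atoms in each amino acid side chain (one-letter codes); unlisted residues have none.
SIDE_CHAIN_N = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}

def amino_acid_frequencies(sequence):
    # Fraction of the sequence made up by each residue type.
    counts = Counter(sequence)
    return {aa: n / len(sequence) for aa, n in counts.items()}

def side_chain_nitrogen_per_residue(sequence):
    # Average number of side-chain nitrogen atoms per residue.
    return sum(SIDE_CHAIN_N.get(aa, 0) for aa in sequence) / len(sequence)

# Hypothetical four-residue peptide, used only to exercise the functions.
peptide = "ARKA"
frequencies = amino_acid_frequencies(peptide)
nitrogen = side_chain_nitrogen_per_residue(peptide)
```

A measure like `nitrogen` is the sort of value that could be compared between highly and weakly expressed proteins when testing the nitrogen-economy prediction mentioned in the Results.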

Contributors: Gilbert, James D. J. (Author) / Acquisti, Claudia (Author) / Martinson, Holly M. (Author) / Elser, James (Author) / Kumar, Sudhir (Author) / Fagan, William F. (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2013-09-04

A Bag-of-Words Approach for Drosophila Gene Expression Pattern Annotation

Description

Background:
Drosophila gene expression pattern images document the spatiotemporal dynamics of gene expression during embryogenesis. A comparative analysis of these images could provide a fundamentally important way for studying the regulatory networks governing development. To facilitate pattern comparison and searching, groups of images in the Berkeley Drosophila Genome Project (BDGP) high-throughput study were annotated with a variable number of anatomical terms manually using a controlled vocabulary. Considering that the number of available images is rapidly increasing, it is imperative to design computational methods to automate this task.

Results:
We present a computational method to annotate gene expression pattern images automatically. The proposed method uses the bag-of-words scheme to utilize the existing information on pattern annotation and annotates images using a model that exploits correlations among terms. The proposed method can annotate images individually or in groups (e.g., according to the developmental stage). In addition, the proposed method can integrate information from different two-dimensional views of embryos. Results on embryonic patterns from BDGP data demonstrate that our method significantly outperforms other methods.

Conclusion:
The proposed bag-of-words scheme is effective in representing a set of annotations assigned to a group of images, and the model employed to annotate images successfully captures the correlations among different controlled vocabulary terms. The integration of existing annotation information from multiple embryonic views improves annotation performance.

Contributors: Ji, Shuiwang (Author) / Li, Ying-Xin (Author) / Zhou, Zhi-Hua (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Ira A. Fulton Schools of Engineering (Contributor) / School of Electrical, Computer and Energy Engineering (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2009-04-21

Exceptional reduction of the plastid genome of saguaro cactus (Carnegiea gigantea): Loss of the ndh gene suite and inverted repeat

Description

Premise of the study: Land-plant plastid genomes have only rarely undergone significant changes in gene content and order. Thus, discovery of additional examples adds power to tests for causes of such genome-scale structural changes.
Methods: Using next-generation sequence data, we assembled the plastid genome of saguaro cactus and probed the nuclear genome for transferred plastid genes and functionally related nuclear genes. We combined these results with available data across Cactaceae and seed plants more broadly to infer the history of gene loss and to assess the strength of phylogenetic association between gene loss and loss of the inverted repeat (IR).
Key results: The saguaro plastid genome is the smallest known for an obligately photosynthetic angiosperm (∼113 kb), having lost the IR and plastid ndh genes. This loss supports a statistically strong association across seed plants between the loss of ndh genes and the loss of the IR. Many nonplastid copies of plastid ndh genes were found in the nuclear genome, but none had intact reading frames; nor did three related nuclear-encoded subunits. However, nuclear pgr5, which functions in a partially redundant pathway, was intact.
Conclusions: The existence of an alternative pathway redundant with the function of the plastid NADH dehydrogenase-like (NDH) complex may permit loss of the plastid ndh gene suite in photoautotrophs like saguaro. Loss of these genes may be a recurring mechanism for overall plastid genome size reduction, especially in combination with loss of the IR.

Contributors: Sanderson, Michael J. (Author) / Copetti, Dario (Author) / Burquez, Alberto (Author) / Bustamante, Enriquena (Author) / Charboneau, Joseph L. M. (Author) / Eguiarte, Luis E. (Author) / Kumar, Sudhir (Author) / Lee, Hyun Oh (Author) / Lee, Junki (Author) / McMahon, Michelle (Author) / Steele, Kelly (Author) / Wing, Rod (Author) / Yang, Tae-Jin (Author) / Zwickl, Derrick (Author) / Wojciechowski, Martin (Author) / College of Integrative Sciences and Arts (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2015-07-01

Evolutionary Diagnosis of Non-Synonymous Variants Involved in Differential Drug Response

Description

Background:
Many pharmaceutical drugs are known to be ineffective or have negative side effects in a substantial proportion of patients. Genomic advances are revealing that some non-synonymous single nucleotide variants (nsSNVs) may cause differences in drug efficacy and side effects. Therefore, it is desirable to evaluate nsSNVs of interest in their ability to modulate the drug response.

Results:
We found that the available data on the link between drug response and nsSNV is rather modest. There were only 31 distinct drug response-altering (DR-altering) and 43 distinct drug response-neutral (DR-neutral) nsSNVs in the whole Pharmacogenomics Knowledge Base (PharmGKB). However, even with this modest dataset, it was clear that existing bioinformatics tools have difficulties in correctly predicting the known DR-altering and DR-neutral nsSNVs. They exhibited an overall accuracy of less than 50%, which was not better than random diagnosis. We found that the underlying problem is the markedly different evolutionary properties between positions harboring nsSNVs linked to drug responses and those observed for inherited diseases. To solve this problem, we developed a new diagnosis method, Drug-EvoD, which was trained on the evolutionary properties of nsSNVs associated with drug responses in a sparse learning framework. Drug-EvoD achieves a TPR of 84% and a TNR of 53%, with a balanced accuracy of 69%, which improves upon other methods significantly.

Conclusions:
The new tool will enable researchers to computationally identify nsSNVs that may affect drug responses. However, much larger training and testing datasets are needed to develop more reliable and accurate tools.
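
The balanced accuracy quoted in the Results follows directly from the reported rates: it is the mean of the true positive rate (TPR) and the true negative rate (TNR). A minimal sketch of that arithmetic, with a hypothetical function name and using only the two percentages stated in the abstract:

```python
def balanced_accuracy(tpr, tnr):
    # Mean of sensitivity (TPR) and specificity (TNR).
    return (tpr + tnr) / 2.0

# Rates reported in the abstract for Drug-EvoD.
drug_evod_balanced = balanced_accuracy(0.84, 0.53)
```

With TPR = 0.84 and TNR = 0.53 this gives 0.685, which the abstract reports rounded as 69%. Unlike plain accuracy, this measure is not inflated by the more numerous class, which matters when DR-altering and DR-neutral variants are unevenly represented.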

Contributors: Gerek, Nevin Z. (Author) / Liu, Li (Author) / Gerold, Kristyn (Author) / Biparva, Pegah (Author) / Thomas, Eric D. (Author) / Kumar, Sudhir (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor)

Created: 2015-01-15

Conductance fluctuations in high mobility monolayer graphene: Nonergodicity, lack of determinism and chaotic behavior

Description

We have fabricated a high mobility device, composed of a monolayer graphene flake sandwiched between two sheets of hexagonal boron nitride. Conductance fluctuations as functions of a back gate voltage and magnetic field were obtained to check for ergodicity. Non-linear dynamics concepts were used to study the nature of these fluctuations. The distribution of eigenvalues was estimated from the conductance fluctuations with Gaussian kernels and it indicates that the carrier motion is chaotic at low temperatures. We argue that a two-phase dynamical fluid model best describes the transport in this system and can be used to explain the violation of the so-called ergodic hypothesis found in graphene.

Contributors: da Cunha, C. R. (Author) / Mineharu, M. (Author) / Matsunaga, M. (Author) / Matsumoto, N. (Author) / Chuang, C. (Author) / Ochiai, Y. (Author) / Kim, G.-H. (Author) / Watanabe, K. (Author) / Taniguchi, T. (Author) / Ferry, David (Author) / Aoki, N. (Author) / Ira A. Fulton Schools of Engineering (Contributor) / School of Electrical, Computer and Energy Engineering (Contributor)

Created: 2016-09-09

Introduction: The Continued Importance of Smallholders Today

Description

Evolving Earth observation and change detection techniques enable the automatic identification of Land Use and Land Cover Change (LULCC) over a large extent from massive amounts of remote sensing data. At the same time, this poses a major challenge for the effective organization, representation, and modeling of such information. This study proposes and implements an integrated computational framework to support the modeling, semantic and spatial reasoning of change information with regard to space, time and topology. We first proposed a conceptual model to formally represent the spatiotemporal variation of change data, which is essential knowledge to support various environmental and social studies, such as deforestation and urbanization studies. Then, a spatial ontology was created to encode these semantic spatiotemporal data in a machine-understandable format. Based on the knowledge defined in the ontology and related reasoning rules, a semantic platform was developed to support the semantic query and change trajectory reasoning of areas with LULCC. This semantic platform is innovative, as it integrates semantic and spatial reasoning into a coherent computational and operational software framework to support automated semantic analysis of time series data that can go beyond LULC datasets. In addition, this system scales well as the amount of data increases, as validated by a number of experimental results. This work contributes significantly to both the geospatial Semantic Web and GIScience communities in terms of the establishment of the (web-based) semantic platform for collaborative question answering and decision-making.

Contributors: Vadjunec, Jacqueline M. (Author) / Radel, Claudia (Author) / Turner II, B. L. (Author) / College of Liberal Arts and Sciences (Contributor) / School of Geographical Sciences and Urban Planning (Contributor) / Julie Ann Wrigley Global Institute of Sustainability (Contributor) / School of Sustainability (Contributor)

Created: 2016-10-25

Testing the Growth Rate Hypothesis in Vascular Plants with Above- and Below-Ground Biomass

Description

The growth rate hypothesis (GRH) proposes that higher growth rate (the rate of change in biomass per unit biomass, μ) is associated with higher P concentration and lower C∶P and N∶P ratios. However, the applicability of the GRH to vascular plants is not well-studied and few studies have been done on belowground biomass. Here we showed that, for aboveground, belowground and total biomass of three study species, μ was positively correlated with N∶C under N limitation and positively correlated with P∶C under P limitation. However, the N∶P ratio was a unimodal function of μ, increasing for small values of μ, reaching a maximum, and then decreasing. The range of variations in μ was positively correlated with variation in C∶N∶P stoichiometry. Furthermore, μ and C∶N∶P ranges for aboveground biomass were negatively correlated with those for belowground. Our results confirm the well-known association of growth rate with tissue concentration of the limiting nutrient and provide empirical support for recent theoretical formulations.

Contributors: Yu, Qiang (Author) / Wu, Honghui (Author) / He, Nianpeng (Author) / Lu, Xiaotao (Author) / Wang, Zhiping (Author) / Elser, James (Author) / Wu, Jianguo (Author) / Han, Xingguo (Author) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Julie Ann Wrigley Global Institute of Sustainability (Contributor) / School of Sustainability (Contributor)

Created: 2012-03-13

Grasshoppers Regulate N: P Stoichiometric Homeostasis by Changing Phosphorus Contents in Their Frass

Description

Nitrogen (N) and phosphorus (P) are important limiting nutrients for plant production and consumer performance in a variety of ecosystems. As a result, the N:P stoichiometry of herbivores has received increased attention in ecology. However, the mechanisms by which herbivores maintain N:P stoichiometric homeostasis are poorly understood. Here, using a field manipulation experiment we show that the grasshopper Oedaleus asiaticus maintains strong N:P stoichiometric homeostasis regardless of whether grasshoppers were reared at low or high density. Grasshoppers maintained homeostasis by increasing P excretion when eating plants with higher P contents. However, while grasshoppers also maintained constant body N contents, we found no changes in N excretion in response to changing plant N content over the range measured. These results suggest that O. asiaticus maintains P homeostasis primarily by changing P absorption and excretion rates, but that other mechanisms may be more important for regulating N homeostasis. Our findings improve our understanding of consumer-driven P recycling and may help in understanding the factors affecting plant-herbivore interactions and ecosystem processes in grasslands.

Contributors: Zhang, Zijia (Author) / Elser, James (Author) / Cease, Arianne (Author) / Zhang, Ximei (Author) / Yu, Qiang (Author) / Han, Xingguo (Author) / Zhang, Guangming (Author) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Julie Ann Wrigley Global Institute of Sustainability (Contributor) / School of Sustainability (Contributor)

Created: 2014-08-04

The Evolutionary History of Amino Acid Variations Mediating Increased Resistance of S. aureus Identifies Reversion Mutations in Metabolic Regulators

Description

The evolution of resistance in Staphylococcus aureus occurs rapidly, and in response to all known antimicrobial treatments. Numerous studies of model species describe compensatory roles of mutations in mediating competitive fitness, and there is growing evidence that these mutation types also drive adaptation of S. aureus strains. However, few studies have tracked amino acid changes during the complete evolutionary trajectory of antibiotic adaptation or been able to predict their functional relevance. Here, we have assessed the efficacy of computational methods to predict biological resistance of a collection of clinically known Resistance Associated Mutations (RAMs). We have found that >90% of known RAMs are incorrectly predicted to be functionally neutral by at least one of the prediction methods used. By tracing the evolutionary histories of all of the false negative RAMs, we have discovered that a significant number are reversion mutations to ancestral alleles also carried in the MSSA476 methicillin-sensitive isolate. These genetic reversions are most prevalent in strains following daptomycin treatment and show a tendency to accumulate in biological pathway reactions that are distinct from those accumulating non-reversion mutations. Our studies therefore show that in addition to non-reversion mutations, reversion mutations arise in isolates exposed to new antibiotic treatments. It is possible that acquisition of reversion mutations in the genome may prevent substantial fitness costs during the progression of resistance. Our findings pose an interesting question to be addressed by further clinical studies regarding whether or not these reversion mutations lead to a renewed vulnerability of a vancomycin or daptomycin resistant strain to antibiotics administered at an earlier stage of infection.

Contributors: Champion, Mia (Author) / Gray, Vanessa (Author) / Eberhard, Carl (Author) / Kumar, Sudhir (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor)

Created: 2013-02-12

ASU Regents' Professors Open Access Works
