Image-level and group-level models for Drosophila gene expression pattern annotation

Description

Background
Drosophila melanogaster has been established as a model organism for investigating developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images will seriously impede the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.
Results
We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. A spatial sparse coding approach is used to represent local image patches and is compared with the well-known bag-of-words (BoW) method. Three pooling functions, namely max pooling, average pooling, and Sqrt (square root of mean squared statistics) pooling, are employed to transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, undersampling is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
Conclusion
In our experiment, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
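
The pooling and undersampling steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of absolute values in max and average pooling, and the nearest-centroid classifier standing in for the real annotation model are all assumptions made for illustration, and the positive class is assumed to be the minority.

```python
import numpy as np


def pool(codes, method):
    """Pool a (n_patches, n_atoms) array of sparse codes into one feature vector.

    Taking absolute values for "max" and "average" is an assumption of this sketch.
    """
    if method == "max":
        return np.max(np.abs(codes), axis=0)
    if method == "average":
        return np.mean(np.abs(codes), axis=0)
    if method == "sqrt":
        # Square root of the mean of squared codes.
        return np.sqrt(np.mean(codes ** 2, axis=0))
    raise ValueError(f"unknown pooling method: {method}")


def nearest_centroid_predict(X_train, y_train, X_test):
    """Hypothetical stand-in classifier: label by the closer class centroid."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    return (d1 < d0).astype(int)


def undersample_majority_vote(X, y, X_test, n_rounds=5, seed=0):
    """Train on balanced subsamples and combine predictions by majority vote.

    Assumes label 1 is the minority class.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    votes = np.zeros(len(X_test), dtype=int)
    for _ in range(n_rounds):
        # Draw as many negatives as there are positives, without replacement.
        sampled_neg = rng.choice(neg, size=len(pos), replace=False)
        idx = np.concatenate([pos, sampled_neg])
        votes += nearest_centroid_predict(X[idx], y[idx], X_test)
    # A test sample is positive if more than half of the rounds voted positive.
    return (2 * votes > n_rounds).astype(int)
```

Using an odd number of rounds avoids ties in the vote.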

Contributors: Sun, Qian (Author) / Muckatira, Sherin (Author) / Yuan, Lei (Author) / Ji, Shuiwang (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2013-12-03

GRASP [Genomic Resource Access for Stoichioproteomics]: comparative explorations of the atomic content of 12 Drosophila proteomes

Description

Background
“Stoichioproteomics” relates the elemental composition of proteins and proteomes to variation in the physiological and ecological environment. To help harness and explore the wealth of hypotheses made possible under this framework, we introduce GRASP (http://www.graspdb.net), a public bioinformatic knowledgebase containing information on the frequencies of 20 amino acids and atomic composition of their side chains. GRASP integrates comparative protein composition data with annotation data from multiple public databases. Currently, GRASP includes information on proteins of 12 sequenced Drosophila (fruit fly) proteomes, which will be expanded to include increasingly diverse organisms over time. In this paper we illustrate the potential of GRASP for testing stoichioproteomic hypotheses by conducting an exploratory investigation into the composition of 12 Drosophila proteomes, testing the prediction that protein atomic content is associated with species ecology and with protein expression levels.
Results
Elements varied predictably along multivariate axes. Species were broadly similar, with the D. willistoni proteome a clear outlier. As expected, individual protein atomic content within proteomes was influenced by protein function and amino acid biochemistry. Evolution in elemental composition across the phylogeny followed less predictable patterns, but was associated with broad ecological variation in diet. Using expression data available for D. melanogaster, we found evidence consistent with selection for efficient usage of elements within the proteome: as expected, nitrogen content was reduced in highly expressed proteins in most tissues, most strongly in the gut, where nutrients are assimilated, and least strongly in the germline.
Conclusions
The patterns identified here using GRASP provide a foundation on which to base future research into the evolution of atomic composition in Drosophila and other taxa.
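
As a rough illustration of the kind of quantities described above (amino acid frequencies and the atomic content of side chains), the following sketch computes residue frequencies and mean side-chain nitrogen per residue for a protein sequence. It is not GRASP's code or API; the function names are hypothetical, and only nitrogen is covered. The side-chain nitrogen counts are standard biochemistry (Arg 3, His 2, Lys, Asn, Gln, and Trp 1 each, all other standard residues 0).

```python
from collections import Counter

# Nitrogen atoms in each amino acid side chain (one-letter codes);
# residues not listed have no side-chain nitrogen.
SIDE_CHAIN_N = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}


def amino_acid_frequencies(seq):
    """Return the relative frequency of each residue in a protein sequence."""
    counts = Counter(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def side_chain_nitrogen_per_residue(seq):
    """Return the mean number of side-chain nitrogen atoms per residue."""
    return sum(SIDE_CHAIN_N.get(aa, 0) for aa in seq) / len(seq)
```

Comparing this per-residue value between highly and lowly expressed proteins is one simple way to probe the nitrogen-efficiency pattern the abstract reports.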

Contributors: Gilbert, James D. J. (Author) / Acquisti, Claudia (Author) / Martinson, Holly M. (Author) / Elser, James (Author) / Kumar, Sudhir (Author) / Fagan, William F. (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2013-09-04

A composite genome approach to identify phylogenetically informative data from next-generation sequencing

Description

Background
Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results
In simulations, SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
Conclusions
SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.

Contributors: Schwartz, Rachel (Author) / Harkins, Kelly (Author) / Stone, Anne (Author) / Cartwright, Reed (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Human Evolution and Social Change (Contributor) / School of Life Sciences (Contributor)

Created: 2015-06-11

A Bag-of-Words Approach for Drosophila Gene Expression Pattern Annotation

Description

Background:
Drosophila gene expression pattern images document the spatiotemporal dynamics of gene expression during embryogenesis. A comparative analysis of these images could provide a fundamentally important way for studying the regulatory networks governing development. To facilitate pattern comparison and searching, groups of images in the Berkeley Drosophila Genome Project (BDGP) high-throughput study were annotated with a variable number of anatomical terms manually using a controlled vocabulary. Considering that the number of available images is rapidly increasing, it is imperative to design computational methods to automate this task.

Results:
We present a computational method to annotate gene expression pattern images automatically. The proposed method uses the bag-of-words scheme to utilize the existing information on pattern annotation and annotates images using a model that exploits correlations among terms. The proposed method can annotate images individually or in groups (e.g., according to the developmental stage). In addition, the proposed method can integrate information from different two-dimensional views of embryos. Results on embryonic patterns from BDGP data demonstrate that our method significantly outperforms other methods.

Conclusion:
The proposed bag-of-words scheme is effective in representing a set of annotations assigned to a group of images, and the model employed to annotate images successfully captures the correlations among different controlled vocabulary terms. The integration of existing annotation information from multiple embryonic views improves annotation performance.

Contributors: Ji, Shuiwang (Author) / Li, Ying-Xin (Author) / Zhou, Zhi-Hua (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Ira A. Fulton Schools of Engineering (Contributor) / School of Electrical, Computer and Energy Engineering (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2009-04-21

Exceptional reduction of the plastid genome of saguaro cactus (Carnegiea gigantea): Loss of the ndh gene suite and inverted repeat

Description

Premise of the study: Land-plant plastid genomes have only rarely undergone significant changes in gene content and order. Thus, discovery of additional examples adds power to tests for causes of such genome-scale structural changes.
Methods: Using next-generation sequence data, we assembled the plastid genome of saguaro cactus and probed the nuclear genome for transferred plastid genes and functionally related nuclear genes. We combined these results with available data across Cactaceae and seed plants more broadly to infer the history of gene loss and to assess the strength of phylogenetic association between gene loss and loss of the inverted repeat (IR).
Key results: The saguaro plastid genome is the smallest known for an obligately photosynthetic angiosperm (∼113 kb), having lost the IR and plastid ndh genes. This loss supports a statistically strong association across seed plants between the loss of ndh genes and the loss of the IR. Many nonplastid copies of plastid ndh genes were found in the nuclear genome, but none had intact reading frames; nor did three related nuclear-encoded subunits. However, nuclear pgr5, which functions in a partially redundant pathway, was intact.
Conclusions: The existence of an alternative pathway redundant with the function of the plastid NADH dehydrogenase-like (NDH) complex may permit loss of the plastid ndh gene suite in photoautotrophs like saguaro. Loss of these genes may be a recurring mechanism for overall plastid genome size reduction, especially in combination with loss of the IR.

Contributors: Sanderson, Michael J. (Author) / Copetti, Dario (Author) / Burquez, Alberto (Author) / Bustamante, Enriquena (Author) / Charboneau, Joseph L. M. (Author) / Eguiarte, Luis E. (Author) / Kumar, Sudhir (Author) / Lee, Hyun Oh (Author) / Lee, Junki (Author) / McMahon, Michelle (Author) / Steele, Kelly (Author) / Wing, Rod (Author) / Yang, Tae-Jin (Author) / Zwickl, Derrick (Author) / Wojciechowski, Martin (Author) / College of Integrative Sciences and Arts (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2015-07-01

East of the Wind and West of the Rain

Description

There are places that rest tangibly on the Earth's surface, and places that flourish only in the imagination, and places that site their existence within a moral geography, and a few places, not many, Bor Island among them, that manage to fuse all these settings together. In truth, Bor belongs with that long tradition of island Arcadias that have attracted Western thinkers since well before Thomas More in 1516 gave them the name they now have: Utopia. What makes Bor Island unique is that its informing theme is fire.

Contributors: Pyne, Stephen (Author) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2014-11-30

Parent-Adolescent Conflict as Sequences of Reciprocal Negative Emotion: Links with Conflict Resolution and Adolescents' Behavior Problems

Description

Although conflict is a normative part of parent–adolescent relationships, conflicts that are long or highly negative are likely to be detrimental to these relationships and to youths’ development. In the present article, sequential analyses of data from 138 parent–adolescent dyads (adolescents’ mean age was 13.44, SD = 1.16; 52% girls, 79% non-Hispanic White) were used to define conflicts as reciprocal exchanges of negative emotion observed while parents and adolescents were discussing “hot,” conflictual issues. Dynamic components of these exchanges, including who started the conflicts, who ended them, and how long they lasted, were identified. Mediation analyses revealed that a high proportion of conflicts ended by adolescents was associated with longer conflicts, which in turn predicted perceptions of the “hot” issue as unresolved and adolescent behavior problems. The findings illustrate advantages of using sequential analysis to identify patterns of interactions and, with some certainty, obtain an estimate of the contingent relationship between a pattern of behavior and child and parental outcomes. These interaction patterns are discussed in terms of the roles that parents and children play when in conflict with each other, and the processes through which these roles affect conflict resolution and adolescents’ behavior problems.

Contributors: Moed, Anat (Author) / Gershoff, Elizabeth T. (Author) / Eisenberg, Nancy (Author) / Hofer, Claire (Author) / Losoya, Sandra (Author) / Spinrad, Tracy (Author) / Liew, Jeffrey (Author) / College of Liberal Arts and Sciences (Contributor) / Department of Psychology (Contributor) / Sanford School of Social and Family Dynamics (Contributor)

Created: 2015-08-01

Interactions among catechol-O-methyltransferase genotype, parenting, and sex predict children's internalizing symptoms and inhibitory control: Evidence for differential susceptibility

Description

We used sex, observed parenting quality at 18 months, and three variants of the catechol-O-methyltransferase gene (Val158Met [rs4680], intron1 [rs737865], and 3′-untranslated region [rs165599]) to predict mothers' reports of inhibitory and attentional control (assessed at 42, 54, 72, and 84 months) and internalizing symptoms (assessed at 24, 30, 42, 48, and 54 months) in a sample of 146 children (79 male). Although the pattern for all three variants was very similar, Val158Met explained more variance in both outcomes than did intron1, the 3′-untranslated region, or a haplotype that combined all three catechol-O-methyltransferase variants. In separate models, there were significant three-way interactions among each of the variants, parenting, and sex, predicting the intercepts of inhibitory control and internalizing symptoms. Results suggested that Val158Met indexes plasticity, although this effect was moderated by sex. Parenting was positively associated with inhibitory control for methionine–methionine boys and for valine–valine/valine–methionine girls, and was negatively associated with internalizing symptoms for methionine–methionine boys. Using the “regions of significance” technique, genetic differences in inhibitory control were found for children exposed to high-quality parenting, whereas genetic differences in internalizing were found for children exposed to low-quality parenting. These findings provide evidence in support of testing for differential susceptibility across multiple outcomes.

Contributors: Sulik, Michael (Author) / Eisenberg, Nancy (Author) / Spinrad, Tracy (Author) / Lemery, Kathryn (Author) / Swann, Gregory (Author) / Silva, Kassondra (Author) / Reiser, Mark (Author) / Stover, Daryn (Author) / Verrelli, Brian (Author) / College of Liberal Arts and Sciences (Contributor) / Department of Psychology (Contributor) / Sanford School of Social and Family Dynamics (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2015-08-01

On the Factor Structure of the Rosenberg (1965) General Self-Esteem Scale

Description

Since its introduction, the Rosenberg General Self-Esteem Scale (RGSE, Rosenberg, 1965) has been 1 of the most widely used measures of global self-esteem. We conducted 4 studies to investigate (a) the goodness-of-fit of a bifactor model positing a general self-esteem (GSE) factor and 2 specific factors grouping positive (MFP) and negative items (MFN) and (b) different kinds of validity of the GSE, MFN, and MFP factors of the RGSE. In the first study (n = 11,028), the fit of the bifactor model was compared with those of 9 alternative models proposed in the literature for the RGSE. In Study 2 (n = 357), the external validities of GSE, MFP, and MFN were evaluated using objective grade point average data and multimethod measures of prosociality, aggression, and depression. In Study 3 (n = 565), the across-rater robustness of the bifactor model was evaluated. In Study 4, measurement invariance of the RGSE was further supported across samples in 3 European countries, Serbia (n = 1,010), Poland (n = 699), and Italy (n = 707), and in the United States (n = 1,192). All in all, psychometric findings corroborate the value and the robustness of the bifactor structure and its substantive interpretation.

Contributors: Alessandri, Guido (Author) / Vecchione, Michele (Author) / Eisenberg, Nancy (Author) / Laguna, Mariola (Author) / College of Liberal Arts and Sciences (Contributor) / Department of Psychology (Contributor)

Created: 2015-06-01

Merging Economics and Epidemiology to Improve the Prediction and Management of Infectious Disease

Description

Mathematical epidemiology, one of the oldest and richest areas in mathematical biology, has significantly enhanced our understanding of how pathogens emerge, evolve, and spread. Classical epidemiological models, the standard for predicting and managing the spread of infectious disease, assume that contacts between susceptible and infectious individuals depend on their relative frequency in the population. The behavioral factors that underpin contact rates are not generally addressed. There is, however, an emerging class of models that addresses the feedbacks between infectious disease dynamics and the behavioral decisions driving host contact. Referred to as “economic epidemiology” or “epidemiological economics,” the approach explores the determinants of decisions about the number and type of contacts made by individuals, using insights and methods from economics. We show how the approach has the potential both to improve predictions of the course of infectious disease, and to support development of novel approaches to infectious disease management.
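
To make the classical assumption concrete, here is a minimal sketch of a textbook SIR model with frequency-dependent transmission, in which new infections per unit time are `beta * S * I / N` and the contact rate `beta` is a fixed constant rather than a behavioral choice. This is a standard illustration, not a model from the article; the function names, the forward-Euler integration, and all parameter values are assumptions.

```python
def sir_step(S, I, R, beta, gamma, dt):
    """Advance a frequency-dependent SIR model by one forward-Euler step."""
    N = S + I + R
    new_infections = beta * S * I / N * dt  # contacts scale with I / N
    new_recoveries = gamma * I * dt
    return S - new_infections, I + new_infections - new_recoveries, R + new_recoveries


def simulate(S, I, R, beta, gamma, dt, n_steps):
    """Return the list of (S, I, R) states, including the initial state."""
    states = [(S, I, R)]
    for _ in range(n_steps):
        S, I, R = sir_step(S, I, R, beta, gamma, dt)
        states.append((S, I, R))
    return states


# Illustrative parameter values only.
trajectory = simulate(S=990.0, I=10.0, R=0.0, beta=0.3, gamma=0.1, dt=0.1, n_steps=1000)
```

An economic-epidemiological variant would replace the constant `beta` with a quantity that responds to individuals' decisions about contacts, which is the feedback this sketch deliberately leaves out.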

Contributors: Perrings, Charles (Author) / Castillo-Chavez, Carlos (Author) / Chowell-Puente, Gerardo (Author) / Daszak, Peter (Author) / Fenichel, Eli P. (Author) / Finnoff, David (Author) / Horan, Richard D. (Author) / Kilpatrick, A. Marm (Author) / Kinzig, Ann (Author) / Kuminoff, Nicolai (Author) / Levin, Simon (Author) / Morin, Benjamin (Author) / Smith, Katherine F. (Author) / Springborn, Michael (Author) / Simon M. Levin Mathematical, Computational and Modeling Sciences Center (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / School of Human Evolution and Social Change (Contributor) / W.P. Carey School of Business (Contributor) / Economics (Contributor) / Julie Ann Wrigley Global Institute of Sustainability (Contributor)

Created: 2015-12-01
