Image-level and group-level models for Drosophila gene expression pattern annotation

Description

Background
Drosophila melanogaster has been established as a model organism for investigating developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images seriously impedes the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.
Results
We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. A spatial sparse coding approach is used to represent local image patches and is compared with the well-known bag-of-words (BoW) method. Three pooling functions, namely max pooling, average pooling and Sqrt (square root of mean squared statistics) pooling, are employed to transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, undersampling is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
Conclusion
In our experiment, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
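
The pooling and class-balancing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`pool_sparse_codes`, `undersample`, `majority_vote`) are hypothetical, and taking absolute values before max and average pooling is an assumption that the abstract does not state.

```python
import math
import random


def pool_sparse_codes(codes, method="max"):
    """Pool per-patch sparse codes (rows) into one feature vector per image.

    codes  : list of equal-length lists, one row of coefficients per patch.
    method : "max", "average" or "sqrt" (square root of the mean square).
    """
    columns = list(zip(*codes))  # one tuple per dictionary atom
    if method == "max":
        # Assumption: pool over absolute coefficient values.
        return [max(abs(v) for v in col) for col in columns]
    if method == "average":
        # Assumption: pool over absolute coefficient values.
        return [sum(abs(v) for v in col) / len(col) for col in columns]
    if method == "sqrt":
        return [math.sqrt(sum(v * v for v in col) / len(col)) for col in columns]
    raise ValueError(f"unknown pooling method: {method}")


def undersample(positives, negatives, rng):
    """Balance two classes by randomly dropping items from the larger one."""
    if len(positives) <= len(negatives):
        return list(positives), rng.sample(list(negatives), len(positives))
    return rng.sample(list(positives), len(negatives)), list(negatives)


def majority_vote(votes):
    """Return True when more than half of the binary votes are positive."""
    return 2 * sum(bool(v) for v in votes) > len(votes)
```

In such a scheme, one classifier would be trained per balanced subset returned by `undersample`, and `majority_vote` would combine their predictions for a keyword. The classifier itself is left out because the abstract does not name one.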

Contributors: Sun, Qian (Author) / Muckatira, Sherin (Author) / Yuan, Lei (Author) / Ji, Shuiwang (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2013-12-03

GRASP [Genomic Resource Access for Stoichioproteomics]: comparative explorations of the atomic content of 12 Drosophila proteomes

Description

Background
“Stoichioproteomics” relates the elemental composition of proteins and proteomes to variation in the physiological and ecological environment. To help harness and explore the wealth of hypotheses made possible under this framework, we introduce GRASP (http://www.graspdb.net), a public bioinformatic knowledgebase containing information on the frequencies of 20 amino acids and atomic composition of their side chains. GRASP integrates comparative protein composition data with annotation data from multiple public databases. Currently, GRASP includes information on proteins of 12 sequenced Drosophila (fruit fly) proteomes, which will be expanded to include increasingly diverse organisms over time. In this paper we illustrate the potential of GRASP for testing stoichioproteomic hypotheses by conducting an exploratory investigation into the composition of 12 Drosophila proteomes, testing the prediction that protein atomic content is associated with species ecology and with protein expression levels.
Results
Elements varied predictably along multivariate axes. Species were broadly similar, with the D. willistoni proteome a clear outlier. As expected, individual protein atomic content within proteomes was influenced by protein function and amino acid biochemistry. Evolution in elemental composition across the phylogeny followed less predictable patterns, but was associated with broad ecological variation in diet. Using expression data available for D. melanogaster, we found evidence consistent with selection for efficient usage of elements within the proteome: as expected, nitrogen content was reduced in highly expressed proteins in most tissues, most strongly in the gut, where nutrients are assimilated, and least strongly in the germline.
Conclusions
The patterns identified here using GRASP provide a foundation on which to base future research into the evolution of atomic composition in Drosophila and other taxa.

Contributors: Gilbert, James D. J. (Author) / Acquisti, Claudia (Author) / Martinson, Holly M. (Author) / Elser, James (Author) / Kumar, Sudhir (Author) / Fagan, William F. (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2013-09-04

Feasibility of three wearable sensors for 24 hour monitoring in middle-aged women

Description

Background
The purpose of this study is to determine the feasibility of three widely used wearable sensors in research settings for 24 h monitoring of sleep, sedentary, and active behaviors in middle-aged women.
Methods
Participants were 21 inactive, overweight (M Body Mass Index (BMI) = 29.27 ± 7.43) women, 30 to 64 years (M = 45.31 ± 9.67). Women were instructed to wear each sensor on the non-dominant hip (ActiGraph GT3X+), wrist (GENEActiv), or upper arm (BodyMedia SenseWear Mini) for 24 h/day and record daily wake and bed times for one week over the course of three consecutive weeks. Women received feedback about their daily physical activity and sleep behaviors. Feasibility (i.e., acceptability and demand) was measured using surveys, interviews, and wear time.
Results
Women felt the GENEActiv (94.7 %) and SenseWear Mini (90.0 %) were easier to wear and preferred the placement (68.4, 80 % respectively) as compared to the ActiGraph (42.9, 47.6 % respectively). Mean wear time on valid days was similar across sensors (ActiGraph: M = 918.8 ± 115.0 min; GENEActiv: M = 949.3 ± 86.6; SenseWear: M = 928.0 ± 101.8) and well above other studies using wake time only protocols. Informational feedback was the biggest motivator, while appearance, comfort, and inconvenience were the biggest barriers to wearing sensors. Wear time was valid on 93.9 % (ActiGraph), 100 % (GENEActiv), and 95.2 % (SenseWear) of eligible days. 61.9, 95.2, and 71.4 % of participants had seven valid days of data for the ActiGraph, GENEActiv, and SenseWear, respectively.
Conclusion
Twenty-four hour monitoring over seven consecutive days is a feasible approach in middle-aged women. Researchers should consider participant acceptability and demand, in addition to validity and reliability, when choosing a wearable sensor. More research is needed across populations and study designs.

Contributors: Huberty, Jennifer (Author) / Ehlers, Diane (Author) / Kurka, Jonathan (Author) / Ainsworth, Barbara (Author) / Buman, Matthew (Author) / College of Health Solutions (Contributor) / School of Nutrition and Health Promotion (Contributor)

Created: 2015-07-30

A composite genome approach to identify phylogenetically informative data from next-generation sequencing

Description

Background
Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results
In simulations, SISRS identified large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
Conclusions
SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.

Contributors: Schwartz, Rachel (Author) / Harkins, Kelly (Author) / Stone, Anne (Author) / Cartwright, Reed (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Human Evolution and Social Change (Contributor) / School of Life Sciences (Contributor)

Created: 2015-06-11

Evolutionary Diagnosis of Non-Synonymous Variants Involved in Differential Drug Response

Description

Background:
Many pharmaceutical drugs are known to be ineffective or have negative side effects in a substantial proportion of patients. Genomic advances are revealing that some non-synonymous single nucleotide variants (nsSNVs) may cause differences in drug efficacy and side effects. Therefore, it is desirable to evaluate nsSNVs of interest in their ability to modulate the drug response.

Results:
We found that the available data on the link between drug response and nsSNV is rather modest. There were only 31 distinct drug response-altering (DR-altering) and 43 distinct drug response-neutral (DR-neutral) nsSNVs in the whole Pharmacogenomics Knowledge Base (PharmGKB). However, even with this modest dataset, it was clear that existing bioinformatics tools have difficulties in correctly predicting the known DR-altering and DR-neutral nsSNVs. They exhibited an overall accuracy of less than 50%, which was not better than random diagnosis. We found that the underlying problem is the markedly different evolutionary properties between positions harboring nsSNVs linked to drug responses and those observed for inherited diseases. To solve this problem, we developed a new diagnosis method, Drug-EvoD, which was trained on the evolutionary properties of nsSNVs associated with drug responses in a sparse learning framework. Drug-EvoD achieves a TPR of 84% and a TNR of 53%, with a balanced accuracy of 69%, which improves upon other methods significantly.

Conclusions:
The new tool will enable researchers to computationally identify nsSNVs that may affect drug responses. However, much larger training and testing datasets are needed to develop more reliable and accurate tools.
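
The balanced accuracy quoted in the Results follows directly from the reported true positive rate (TPR) and true negative rate (TNR). A minimal Python sketch of that arithmetic is given here as an illustration. The function name `balanced_accuracy` is hypothetical, and the inputs are the rounded rates from the abstract rather than the underlying counts.

```python
def balanced_accuracy(tpr, tnr):
    """Average of the true positive rate and the true negative rate."""
    for rate in (tpr, tnr):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie between 0 and 1")
    return (tpr + tnr) / 2.0


# Rounded rates reported for Drug-EvoD: TPR 84 %, TNR 53 %.
drug_evod = balanced_accuracy(0.84, 0.53)  # about 0.685, reported as 69 %

# A classifier that labels variants at random has an expected TPR and TNR
# that sum to 1, so its expected balanced accuracy is 0.5.
random_baseline = balanced_accuracy(0.5, 0.5)
```

Balanced accuracy is a natural summary here because the two classes differ in size (31 DR-altering versus 43 DR-neutral nsSNVs), so plain accuracy would weight the larger class more heavily.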

Contributors: Gerek, Nevin Z. (Author) / Liu, Li (Author) / Gerold, Kristyn (Author) / Biparva, Pegah (Author) / Thomas, Eric D. (Author) / Kumar, Sudhir (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor)

Created: 2015-01-15

Open Geospatial Analytics with PySAL

Description

This article reviews the range of delivery platforms that have been developed for the PySAL open source Python library for spatial analysis. This includes traditional desktop software (with a graphical user interface, command line or embedded in a computational notebook), open spatial analytics middleware, and web, cloud and distributed open geospatial analytics for decision support. A common thread throughout the discussion is the emphasis on openness, interoperability, and provenance management in a scientific workflow. The code base of the PySAL library provides the common computing framework underlying all delivery mechanisms.

Contributors: Rey, Sergio (Author) / Anselin, Luc (Author) / Li, Xun (Author) / Pahle, Robert (Author) / Laura, Jason (Author) / Li, Wenwen (Author) / Koschinsky, Julia (Author) / College of Liberal Arts and Sciences (Contributor) / School of Geographical Sciences and Urban Planning (Contributor) / Computational Spatial Science (Contributor)

Created: 2015-06-01

Combined Influences of Model Choice, Data Quality, and Data Quantity When Estimating Population Trends

Description

Estimating and projecting population trends using population viability analysis (PVA) are central to identifying species at risk of extinction and to informing conservation management strategies. Models for PVA generally fall within two categories, scalar (count-based) or matrix (demographic). Model structure, process error, measurement error, and time series length all have known impacts in population risk assessments, but their combined impact has not been thoroughly investigated. We tested the ability of scalar and matrix PVA models to predict percent decline over a ten-year interval, selected to coincide with the IUCN Red List criterion A3, using data simulated for a hypothetical, short-lived organism with a simple life-history and for a threatened snail, Tasmaphena lamproides. PVA performance was assessed across different time series lengths, population growth rates, and levels of process and measurement error. We found that the magnitude of effects of measurement error, process error, and time series length, and interactions between these, depended on context. We found that high process and measurement error reduced the reliability of both models in predicting percent decline. Both sources of error contributed strongly to biased predictions, with process error tending to contribute to the spread of predictions more than measurement error. Increasing time series length improved precision and reduced bias of predicted population trends, but gains substantially diminished for time series lengths greater than 10-15 years. The simple parameterization scheme we employed contributed strongly to bias in matrix model predictions when both process and measurement error were high, causing scalar models to exhibit similar or greater precision and lower bias than matrix models.
Our study provides evidence that, for short-lived species with structured but simple life histories, short time series and simple models can be sufficient for reasonably reliable conservation decision-making, and may be preferable for population projections when unbiased estimates of vital rates cannot be obtained.
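
To make the quantity being predicted concrete, the following Python sketch shows how a generic scalar (count-based) model might turn a time series of counts into a projected ten-year percent decline. It is an illustration under stated assumptions, not the parameterization used in the study. The function names are hypothetical, the growth rate is estimated as the mean log ratio of successive counts, and the projection is deterministic, ignoring both process and measurement error.

```python
import math


def mean_log_growth_rate(counts):
    """Mean of log(N[t+1] / N[t]) over a series of positive annual counts."""
    if len(counts) < 2 or any(c <= 0 for c in counts):
        raise ValueError("need at least two strictly positive counts")
    log_ratios = [math.log(b / a) for a, b in zip(counts, counts[1:])]
    return sum(log_ratios) / len(log_ratios)


def projected_percent_decline(counts, years=10):
    """Percent decline after `years`, projecting the mean log growth rate.

    A negative result means the projection is an increase, not a decline.
    """
    mu = mean_log_growth_rate(counts)
    return 100.0 * (1.0 - math.exp(mu * years))


# Hypothetical counts shrinking by 5 % per year for six annual surveys.
example_counts = [1000 * 0.95 ** t for t in range(6)]
example_decline = projected_percent_decline(example_counts, years=10)
```

Because the mean log ratio depends only on the first and last counts, a short or noisy series gives an unstable estimate, which is one intuition for why time series length and measurement error matter in the comparisons described above.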

Contributors: Rueda-Cediel, Pamela (Author) / Anderson, Kurt E. (Author) / Regan, Tracey J. (Author) / Franklin, Janet (Author) / Regan, Helen M. (Author) / College of Liberal Arts and Sciences (Contributor) / School of Geographical Sciences and Urban Planning (Contributor)

Created: 2015-07-15

Changes in a West Indian Bird Community Since the Late Pleistocene

Description

Aim
To establish a chronology for late Quaternary avian extinction, extirpation and persistence in the Bahamas, thereby testing the relative roles of climate change and human impact as causes of extinction.
Location
Great Abaco Island (Abaco), Bahamas, West Indies.
Methods
We analysed the resident bird community as sampled by Pleistocene (> 11.7 ka) and Holocene (< 11.7 ka) fossils. Each species was classified as extinct (lost globally), extirpated (gone from Abaco but persists elsewhere), or extant (still resident on Abaco). We compared patterns of extinction, extirpation and persistence to independent estimates of climate and sea level for glacial (late Pleistocene) and interglacial (Holocene) times.
Results
Of 45 bird species identified in Pleistocene fossils, 25 (56%) no longer occur on Abaco (21 extirpated, 4 extinct). Of 37 species recorded in Holocene deposits, 15 (14 extirpated, 1 extinct; total 41%) no longer exist on Abaco. Of the 30 extant species, 12 were recovered as both Pleistocene and Holocene fossils, as were 9 of the 30 extirpated or extinct species. Most of the extinct or extirpated species that were only recorded from Pleistocene contexts are characteristic of open habitats (pine woodlands or grasslands); several of the extirpated species are currently found only where winters are cooler than in the modern or Pleistocene Bahamas. In contrast, most of the extinct or extirpated species recorded from Holocene contexts are habitat generalists.
Main conclusions
The fossil evidence suggests two main times of late Quaternary avian extirpation and extinction in the Bahamas. The first was during the Pleistocene–Holocene transition (PHT; 15–9 ka) and was fuelled by climate change and associated changes in sea level and island area. The second took place during the late Holocene (< 4 ka, perhaps primarily < 1 ka) and can be attributed to human impact. Although some species lost during the PHT are currently found where climates are cooler and drier than in the Bahamas today, a taxonomically and ecologically diverse set of species persisted through that major climate change but did not survive the past millennium of human presence.

Contributors: Steadman, David W. (Author) / Franklin, Janet (Author) / College of Liberal Arts and Sciences (Contributor) / School of Geographical Sciences and Urban Planning (Contributor)

Created: 2015-03-01

Introduction: The Continued Importance of Smallholders Today

Description

Evolving Earth observation and change detection techniques enable the automatic identification of Land Use and Land Cover Change (LULCC) over a large extent from massive amounts of remote sensing data. At the same time, this poses a major challenge for the effective organization, representation and modeling of such information. This study proposes and implements an integrated computational framework to support the modeling, semantic and spatial reasoning of change information with regard to space, time and topology. We first proposed a conceptual model to formally represent the spatiotemporal variation of change data, which is essential knowledge to support various environmental and social studies, such as deforestation and urbanization studies. Then, a spatial ontology was created to encode these semantic spatiotemporal data in a machine-understandable format. Based on the knowledge defined in the ontology and related reasoning rules, a semantic platform was developed to support the semantic query and change trajectory reasoning of areas with LULCC. This semantic platform is innovative, as it integrates semantic and spatial reasoning into a coherent computational and operational software framework to support automated semantic analysis of time series data that can go beyond LULC datasets. In addition, this system scales well as the amount of data increases, as validated by a number of experimental results. This work contributes significantly to both the geospatial Semantic Web and GIScience communities in terms of the establishment of the (web-based) semantic platform for collaborative question answering and decision-making.

Contributors: Vadjunec, Jacqueline M. (Author) / Radel, Claudia (Author) / Turner II, B. L. (Author) / College of Liberal Arts and Sciences (Contributor) / School of Geographical Sciences and Urban Planning (Contributor) / Julie Ann Wrigley Global Institute of Sustainability (Contributor) / School of Sustainability (Contributor)

Created: 2016-10-25

A Geospatial Cyberinfrastructure for Urban Economic Analysis and Spatial Decision-Making

Description

Urban economic modeling and effective spatial planning are critical tools towards achieving urban sustainability. However, in practice, many technical obstacles, such as information islands, poor documentation of data and lack of software platforms to facilitate virtual collaboration, are challenging the effectiveness of decision-making processes. In this paper, we report on our efforts to design and develop a geospatial cyberinfrastructure (GCI) for urban economic analysis and simulation. This GCI provides an operational graphic user interface, built upon a service-oriented architecture to allow (1) widespread sharing and seamless integration of distributed geospatial data; (2) an effective way to address the uncertainty and positional errors encountered in fusing data from diverse sources; (3) the decomposition of complex planning questions into atomic spatial analysis tasks and the generation of a web service chain to tackle such complex problems; and (4) capturing and representing provenance of geospatial data to trace its flow in the modeling task. The Greater Los Angeles Region serves as the test bed. We expect this work to contribute to effective spatial policy analysis and decision-making through the adoption of advanced GCI and to broaden the application coverage of GCI to include urban economic simulations.

Contributors: Li, Wenwen (Author) / Li, Linna (Author) / Goodchild, Michael F. (Author) / Anselin, Luc (Author) / College of Liberal Arts and Sciences (Contributor) / School of Geographical Sciences and Urban Planning (Contributor) / Computational Spatial Science (Contributor)

Created: 2013-05-21

ASU Regents' Professors Open Access Works
