A mesh generation and machine learning framework for Drosophila gene expression pattern image analysis

Description

Background
Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by a unique combination of expressed gene products resulting from spatiotemporal gene regulation. A fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism Drosophila melanogaster (the fruit fly). Enhancing existing qualitative methods with quantitative analysis based on the computational tools we present in this paper would provide promising ways to address key scientific questions.
Results
We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for the embryonic shape variations, we develop a mesh generation method to deform a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/.
Conclusions
Our mesh generation and machine learning methods and tools improve upon the flexibility, ease-of-use and accuracy of existing methods.
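
The co-clustering idea in the Results above (grouping genes and mesh elements at the same time) can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not specify; it is a simple alternating heuristic chosen only for illustration. The names `co_cluster` and `mean_by_cluster` and the toy `expression` matrix are hypothetical.

```python
def mean_by_cluster(values, labels, k):
    """Average `values` within each of the k label groups (0.0 for an empty group)."""
    sums, counts = [0.0] * k, [0] * k
    for value, label in zip(values, labels):
        sums[label] += value
        counts[label] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def co_cluster(matrix, k, max_iter=20):
    """Alternately relabel rows (genes) and columns (mesh elements).

    Illustrative heuristic only: each row joins the column cluster in which
    it is most strongly expressed on average, and vice versa.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_labels = [j % k for j in range(n_cols)]  # arbitrary deterministic start
    row_labels = [0] * n_rows
    for _ in range(max_iter):
        new_rows = []
        for row in matrix:
            means = mean_by_cluster(row, col_labels, k)
            new_rows.append(means.index(max(means)))
        new_cols = []
        for j in range(n_cols):
            column = [matrix[i][j] for i in range(n_rows)]
            means = mean_by_cluster(column, new_rows, k)
            new_cols.append(means.index(max(means)))
        if new_rows == row_labels and new_cols == col_labels:
            break  # labels stopped changing
        row_labels, col_labels = new_rows, new_cols
    return row_labels, col_labels


# Hypothetical data: rows are genes, columns are mesh elements,
# entries are expression intensities sampled on the common mesh.
expression = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
gene_labels, element_labels = co_cluster(expression, k=2)
```

Because every embryo is mapped onto the same mesh, each column refers to the same embryonic location across genes, which is what makes clustering the columns meaningful.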

Contributors: Zhang, Wenlu (Author) / Feng, Daming (Author) / Li, Rongjian (Author) / Chernikov, Andrey (Author) / Chrisochoides, Nikos (Author) / Osgood, Christopher (Author) / Konikoff, Charlotte (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ji, Shuiwang (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2013-12-28

Learning Sparse Representations for Fruit-Fly Gene Expression Pattern Image Annotation and Retrieval

Description

Background
Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords.
Results
In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes.
Conclusions
We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features leads to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yield better results.

Contributors: Yuan, Lei (Author) / Woodard, Alexander (Author) / Ji, Shuiwang (Author) / Jiang, Yuan (Author) / Zhou, Zhi-Hua (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / Ira A. Fulton Schools of Engineering (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)

Created: 2012-05-23

Coronavirus Envelope Protein Transmembrane Domain: Impact of Positive Charges on Virus-like Particle Assembly

Description

Coronaviruses are a significant group of viruses that cause enteric and respiratory infections in a variety of animals, including humans. Outbreaks of Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) in the past 15 years have increased research into coronaviruses to gain an understanding of their structure and function so that therapies and vaccines may one day be produced. These viruses have four main structural proteins: the spike, nucleocapsid, envelope, and membrane proteins. The envelope (E) protein is an integral membrane protein in the viral envelope that acts as a viroporin for transport of cations and plays an important role in pathogenesis and viral assembly. E contains a hydrophobic transmembrane domain with polar residues that is conserved across coronavirus species and may be significant to its function. This experiment examines the possible role in assembly of one polar residue, the glutamine at position 15, in the Mouse Hepatitis Virus (MHV) E protein. The glutamine 15 residue was mutated to the positively charged residues lysine or arginine. Plasmids with these mutations were co-expressed with the membrane protein (M) gene to produce virus-like particles (VLPs). VLPs are produced when E and M are co-expressed and model assembly of the coronavirus envelope, but they are not infectious because they do not contain the viral genome. Observing their production with the mutated E protein gives insight into the role the glutamine residue plays in assembly. The experiment showed that changing glutamine 15 to a positively charged residue does not appear to significantly affect the assembly of the VLPs, indicating that this specific residue may not have a large impact on viral assembly.

Contributors: Haller, Sarah S. (Author) / Hogue, Brenda (Thesis director) / Liu, Wei (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor) / Biodesign Institute (Contributor)

Created: 2017-05

An Analysis of Craft Labor Productivity

Description

Productivity in the construction industry is an essential measure of production efficiency and economic progress, quantified by craft laborers' time spent directly adding value to a project. To better understand craft labor productivity as an aspect of lean construction, an activity analysis was conducted at the Arizona State University Palo Verde Main engineering dormitory construction site in December of 2016. The objective of this analysis was to gather data on the efficiency of craft labor workers, draw conclusions about the effects of time of day and other site-specific factors on labor productivity, and suggest improvements to implement in the construction process. Analysis suggests that supporting tasks, such as traveling or materials handling, constitute the majority of craft laborers' efforts on the job site, with the highest percentages occurring at the beginning and end of the work day. Direct work and delays were approximately equal at about 20% each hour, with the highest peak occurring at lunchtime between 10:00 am and 11:00 am. The top suggestion to improve construction productivity would be to perform an extensive site utilization analysis, given the confined nature of this job site. Despite the limited ability of an activity analysis to provide a complete perspective on all the factors that can affect craft labor productivity, and the small number of days of data acquisition, this analysis provides a basic overview of productivity at the Palo Verde Main construction site. Through this research, construction managers can more effectively generate site plans and schedules to increase labor productivity.
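
An activity analysis of this kind reduces to tallying observations by category and hour. The sketch below shows one way to compute hourly percentage shares; the category names (`"direct"`, `"support"`, `"delay"`), the function `hourly_shares`, and all observation counts are hypothetical and are not data from the study.

```python
from collections import Counter, defaultdict


def hourly_shares(observations):
    """Return {hour: {category: percent of that hour's observations}}."""
    counts = defaultdict(Counter)
    for hour, category in observations:
        counts[hour][category] += 1
    shares = {}
    for hour, counter in counts.items():
        total = sum(counter.values())
        shares[hour] = {cat: 100.0 * n / total for cat, n in counter.items()}
    return shares


# Hypothetical tallies: (hour of day, observed activity category).
observations = (
    [(7, "support")] * 6 + [(7, "direct")] * 2 + [(7, "delay")] * 2
    + [(10, "support")] * 5 + [(10, "direct")] * 3 + [(10, "delay")] * 2
)
shares = hourly_shares(observations)
for hour in sorted(shares):
    print(hour, shares[hour])
```

Comparing the shares hour by hour is what lets an analyst see when supporting tasks or delays crowd out direct work.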

Contributors: Ford, Emily Lucile (Author) / Grau, David (Thesis director) / Chong, Oswald (Committee member) / Civil, Environmental and Sustainable Engineering Programs (Contributor) / School of International Letters and Cultures (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

Data and Predictive Analytics for Energy Use

Description

Overall energy consumption in the United States has not been reduced even with the advancement of technology over the past decades. Deficiencies exist between design and actual energy performance. Energy Infrastructure Systems (EIS) are impacted when the amount of energy production cannot be accurately and efficiently forecasted. Inaccurate engineering assumptions can result from a lack of understanding of how energy systems operate in real-world applications. Energy systems are complex, and without a known structural system model their behavior is difficult to predict. Currently, there is a lack of data mining techniques for reverse engineering, which are needed to develop efficient structural system models. In this project, a new type of reverse engineering algorithm has been applied to a year's worth of energy data collected from an ASU research building called MacroTechnology Works to identify the structural system model. Developing and understanding structural system models is the first step in creating accurate predictive analytics for energy production. The associative network of the building's data will be highlighted to accurately depict the structural model. This structural model will enhance energy infrastructure systems' energy efficiency, reduce energy waste, and narrow the gaps between energy infrastructure design, planning, operation and management (DPOM).

Contributors: Camarena, Raquel Jimenez (Author) / Chong, Oswald (Thesis director) / Ye, Nong (Committee member) / Industrial, Systems (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

Open-Source Feature Selection Tool for Medical Imaging Diagnosis

Description

Open source image analytics and data mining software is widely available but can be overly complicated and unintuitive for physicians and medical researchers to use. The ASU-Mayo Clinic Imaging Informatics Lab has developed an in-house pipeline to process medical images, extract imaging features, and develop multi-parametric models to assist disease staging and diagnosis. The tools have been used extensively in a number of medical studies, including studies of brain tumors, breast cancer, liver cancer, Alzheimer's disease, and migraine. Recognizing the need among users in the medical field for a simplified interface and streamlined functionality, this project aims to democratize this pipeline so that it is more readily available to health practitioners and third-party developers.

Contributors: Baer, Lisa Zhou (Author) / Wu, Teresa (Thesis director) / Wang, Yalin (Committee member) / Computer Science and Engineering Program (Contributor) / W. P. Carey School of Business (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

The LEED Rating System and the International Green Construction Code: A Comparative Analysis of Green Building Design Approaches

Description

Building construction, design and maintenance is a sector of engineering where improved efficiency will have immense impacts on resource consumption and environmental health. This research closely examines the Leadership in Energy and Environmental Design (LEED) rating system and the International Green Construction Code (IgCC). The IgCC is a model code, written with the same structure as many building codes. It is a standard that can be enforced if a city's government decides to adopt it. When the IgCC is enforced, a building either meets all of the requirements set forth in the document or fails to meet the code standards. The LEED rating system, on the other hand, is not a building code. LEED certified buildings are built according to the standards of their local jurisdiction, and in addition building owners can choose to pursue LEED certification. It is a rating system that awards points based on the sustainable measures achieved by a building. A comparison of these green building systems highlights their accomplishments in terms of reduced electricity usage, usage of low-impact materials, indoor environmental quality and other innovative features. It was determined that in general the IgCC is a more holistic, stringent approach to green building, while the LEED rating system offers a wider variety of green building options. In addition, building data from LEED certified buildings was compiled and analyzed to understand important trends. Both of these methods are progressing towards low-impact, efficient infrastructure, and a side-by-side comparison, as done in this research, sheds light on the strengths and weaknesses of each method, allowing for future improvements.

Contributors: Campbell, Kaleigh Ruth (Author) / Chong, Oswald (Thesis director) / Parrish, Kristen (Committee member) / Civil, Environmental and Sustainable Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Building Management System Integration: Energy Data Analytics

Description

This paper describes research done to quantify two relationships: between external air temperature and energy consumption, and between internal air temperature and energy consumption. The study was conducted on a LEED Gold certified building, College Avenue Commons, located on Arizona State University's Tempe campus. It includes background on previous studies in the area, some that agree with the research hypotheses and some that take a different path. Real-time data was collected hourly for energy consumption and external air temperature. Internal air temperature was collected intermittently by the undergraduate researcher, Charles Banke. Regression analysis was used to test the two research hypotheses. The authors found no correlation between external air temperature and energy consumption, nor did they find a relationship between internal air temperature and energy consumption. This paper also includes recommendations for future work to improve the study.
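
The kind of regression check described above can be sketched as an ordinary least-squares fit with a correlation coefficient. The abstract does not state which regression model was used, so a simple one-variable linear fit is assumed here; `linear_fit` and the sample readings are hypothetical and are not the study's data.

```python
import math


def linear_fit(x, y):
    """Least-squares slope and intercept of y on x, plus Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical hourly readings: outdoor temperature and building energy use.
temps_c = [18.0, 22.0, 26.0, 30.0, 34.0]
energy_kwh = [410.0, 395.0, 420.0, 400.0, 415.0]
slope, intercept, r = linear_fit(temps_c, energy_kwh)
r_squared = r * r  # share of variance in energy use explained by temperature
```

An `r_squared` close to zero, as with these made-up readings, is the pattern a "no correlation" finding corresponds to; a value near one would indicate that temperature explains most of the variation.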

Contributors: Banke, Charles Michael (Author) / Chong, Oswald (Thesis director) / Parrish, Kristen (Committee member) / Mechanical and Aerospace Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Saliency cut: an automatic approach for video object segmentation based on saliency energy minimization

Description

Video object segmentation (VOS) is an important task in computer vision with many applications, e.g., video editing, object tracking, and object-based encoding. Unlike image object segmentation, video object segmentation must consider both spatial and temporal coherence for the object. Despite extensive previous work, the problem remains challenging. Usually, the foreground object in a video draws more attention from humans, i.e., it is salient. In this thesis we tackle the problem from the aspect of saliency, where saliency means a certain subset of visual information selected by a visual system (human or machine). We present a novel unsupervised method for video object segmentation that considers both low-level vision cues and high-level motion cues. In our model, video object segmentation is formulated as a unified energy minimization problem and solved in polynomial time by employing the min-cut algorithm. Specifically, our energy function comprises a unary term and a pair-wise interaction term, where the unary term measures region saliency and the interaction term smooths the mutual effects between object saliency and motion saliency. Object saliency is computed in the spatial domain from each discrete frame using multi-scale context features, e.g., color histogram, gradient, and graph-based manifold ranking. Meanwhile, motion saliency is calculated in the temporal domain by extracting phase information from the video. In the experimental section of this thesis, our proposed method is evaluated on several benchmark datasets. On the MSRA 1000 dataset, the results demonstrate that our spatial object saliency detection is superior to state-of-the-art methods. Moreover, our temporal motion saliency detector achieves better performance than existing motion detection approaches on the UCF sports action analysis dataset and the Weizmann dataset. Finally, we show the empirical results and quantitative evaluation of our approach on two benchmark video object segmentation datasets.
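
To make the energy-minimization step concrete, the sketch below labels a handful of regions as object or background by solving an s-t minimum cut. It is a generic illustration, not the thesis's model: the unary cost is assumed to be the saliency itself (for background) or one minus the saliency (for object), the pairwise term is assumed to be a constant penalty `smoothness` between neighboring regions with different labels, and max-flow is computed with a plain Edmonds-Karp loop. All names and values are hypothetical.

```python
from collections import deque


def energy(labels, saliency, edges, smoothness):
    """Unary cost per region plus a penalty for each neighbor pair that disagrees."""
    unary = sum((1.0 - s) if x == 1 else s for x, s in zip(labels, saliency))
    pairwise = sum(smoothness for i, j in edges if labels[i] != labels[j])
    return unary + pairwise


def min_cut_labels(saliency, edges, smoothness):
    """Label each region 1 (object) or 0 (background) via an s-t minimum cut."""
    n = len(saliency)
    source, sink = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, s in enumerate(saliency):
        cap[source][i] = s        # cut (paid) if region i ends up background
        cap[i][sink] = 1.0 - s    # cut (paid) if region i ends up object
    for i, j in edges:
        cap[i][j] += smoothness   # cut if i is object and j is background
        cap[j][i] += smoothness   # cut if j is object and i is background

    def bfs_parents():
        """Breadth-first search over edges with remaining capacity."""
        parent = [-1] * (n + 2)
        parent[source] = source
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs_parents()
        if parent[sink] == -1:
            break  # no augmenting path left: the flow is maximal
        bottleneck, v = float("inf"), sink
        while v != source:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
    # Regions still reachable from the source lie on the object side of the cut.
    return [1 if parent[i] != -1 else 0 for i in range(n)]


# Hypothetical saliency scores for five regions arranged in a chain.
saliency = [0.9, 0.4, 0.8, 0.1, 0.2]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels = min_cut_labels(saliency, edges, smoothness=0.3)
threshold_only = [1 if s > 0.5 else 0 for s in saliency]
```

With these values, thresholding saliency alone would leave a background hole at the second region, while the pairwise penalty makes it cheaper to label that region as object together with its two salient neighbors.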

Contributors: Wang, Yilin (Author) / Li, Baoxin (Thesis advisor) / Wang, Yalin (Committee member) / Cleveau, David (Committee member) / Arizona State University (Publisher)

Created: 2013

Machine learning methods for high-dimensional imbalanced biomedical data

Description

Learning from high-dimensional biomedical data has attracted much attention recently. High-dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect model performance. In this thesis, I focus on developing learning methods for high-dimensional imbalanced biomedical data. In the first part, a sparse canonical correlation analysis (CCA) method is presented. Penalty terms are used to control the sparsity of the projection matrices of CCA. The sparse CCA method is then applied to find patterns among biomedical data sets and labels, or to find patterns among different data sources. In the second part, I discuss several learning problems for imbalanced biomedical data. Traditional learning systems are often biased when the data are imbalanced, so traditional evaluation measures such as accuracy may be inappropriate in such cases. I therefore discuss several alternative criteria for evaluating learning performance. For imbalanced binary classification problems, I use the undersampling-based classifiers ensemble (UEM) strategy to obtain accurate models for both classes of samples. A small sphere and large margin (SSLM) approach is also presented to detect rare abnormal samples from a large number of subjects. In addition, I apply multiple feature selection and clustering methods to deal with high-dimensional data and data with highly correlated features. Experiments on high-dimensional imbalanced biomedical data are presented that illustrate the effectiveness and efficiency of my methods.
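
The point about accuracy being misleading on imbalanced data can be shown with a few lines of code. The abstract does not name the alternative criteria it uses, so balanced accuracy is assumed here purely as an example. The `balanced_subsets` helper sketches only the data-splitting idea behind an undersampling ensemble (each model sees all minority samples plus an equally sized random draw from the majority); it is not the thesis's UEM implementation, and training the classifiers is left out. All names and data are hypothetical.

```python
import random


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the rare class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity, so each class counts equally."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return (sensitivity + specificity) / 2


def balanced_subsets(majority, minority, n_models, seed=0):
    """One balanced training set per ensemble member, by undersampling the majority."""
    rng = random.Random(seed)
    return [rng.sample(majority, len(minority)) + list(minority)
            for _ in range(n_models)]


# Hypothetical test set: 95 healthy subjects (0) and 5 abnormal ones (1),
# scored by a classifier that always predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
print(accuracy(y_true, y_pred), balanced_accuracy(y_true, y_pred))
```

The always-negative classifier looks strong on plain accuracy while detecting none of the abnormal subjects, which balanced accuracy exposes.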

Contributors: Yang, Tao (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2013
