Matching Items (3)
Description
Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, and medical image processing. Through l1-norm-based regularization, structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real-world applications that can benefit from the group-structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting a unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representations of the expression images. In the third application, I present a new computational approach to annotate the developmental stage of Drosophila embryos in gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores help illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes.
Contributors: Yuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Vitellogenin (vg) is a precursor protein of egg yolk in honey bees, but it is also known to have immunological functions. The purpose of this experiment was to determine the effect of vg on the viral load of deformed wing virus (DWV) in worker honey bees (Apis mellifera). I hypothesized that a reduction in vg expression would lead to an increase in the viral load. I collected 180 worker bees and split them into four groups: half the bees were subjected to a vg gene knockdown by injections of double-stranded vg RNA, and the rest were injected with green fluorescent protein (gfp) double-stranded RNA. Half of each group was thereafter injected with DWV, and half was given a sham injection. The rate of mortality in all four groups was higher than expected, leaving only 17 bees in total. I dissected these bees' fat bodies and extracted their RNA to test for vg and DWV. PCR results showed that, among the small group of remaining bees, the levels of vg were not statistically different. Furthermore, both groups of virus-injected bees showed similar viral loads. Because of the high mortality rate among the bees and the lack of differing levels of vg transcript between experimental and control groups, I could not draw conclusions from these results. The high mortality could have been caused by several factors: temperature-induced stress, repeated stress from the two injections, and stress from viral infection. In addition, it is possible that the vg dsRNA batch I used was faulty. This thesis illustrates that information cannot safely be extracted when loss of sampling units results in small datasets that do not represent the original sampling population.
Contributors: Crable, Emma Lewis (Author) / Amdam, Gro (Thesis director) / Wang, Ying (Committee member) / Dahan, Romain (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2017-12
Description
Understanding the complexity of the temporal and spatial characteristics of gene expression over brain development is one of the crucial research topics in neuroscience. An accurate description of the locations and expression status of the relevant genes requires extensive experimental resources. The Allen Developing Mouse Brain Atlas provides a large number of in situ hybridization (ISH) images of gene expression over seven different mouse brain developmental stages. Studying mouse brain models helps us understand gene expression in human brains. This atlas covers thousands of genes, which are currently annotated manually by biologists. Due to the high labor cost of manual annotation, an efficient approach to automated gene expression annotation on mouse brain images has become necessary. In this thesis, a novel and efficient approach based on a machine learning framework is proposed. Features are extracted from raw brain images, and both binary and multi-class classification models are built with supervised learning methods. To generate features, one of the most widely adopted methods in current research is the bag-of-words (BoW) algorithm. However, neither the efficiency nor the accuracy of BoW is outstanding when dealing with large-scale data. Thus, an augmented sparse coding method, called Stochastic Coordinate Coding, is adopted to generate high-level features in this thesis. In addition, a new multi-label classification model is proposed, in which a label hierarchy is built based on the given brain ontology structure. Experiments have been conducted on the atlas, and the results show that this approach is efficient and classifies the images with relatively high accuracy.
Contributors: Zhao, Xinlin (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Thesis advisor) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2016