Matching Items (9)
Description
Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, and medical image processing. Through l-1 norm based regularization, structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real-world applications that can benefit from the group structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting a unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with spatial information, sparse learning techniques can be used to construct effective representations of the expression images. In the third application, I present a new computational approach to annotate the developmental stage of Drosophila embryos in gene expression images. In addition, it provides a stage score that enables finer annotation of each embryo, so that embryos are divided into early and late periods of development within standard stage demarcations. Stage scores help illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes.
Contributors: Yuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Postnatal skeletal muscle repair is dependent on the tight regulation of an adult stem cell population known as satellite cells. In response to injury, these quiescent cells are activated, proliferate, and express skeletal muscle-specific genes. The majority of satellite cells will fuse to damaged fibers or form new muscle fibers, while a subset will return to a quiescent state, where they are available for future rounds of repair. Robust muscle repair is dependent on the signals that regulate the mutually exclusive decisions of differentiation and self-renewal. A likely candidate for regulating this process is NUMB, an inhibitor of the Notch signaling pathway that has been shown to localize asymmetrically in daughter cells undergoing cell fate decisions. To study the role of this protein in muscle repair, an inducible knockout of Numb was made in mice. Numb-deficient muscle had a defective repair response to acute induced damage, characterized by smaller myofibers, increased collagen deposition, and infiltration of fibrotic cells. Satellite cells isolated from Numb-deficient mice showed decreased proliferation rates. Subsequent analyses of gene expression demonstrated that these cells had aberrantly up-regulated Myostatin (Mstn), an inhibitor of myoblast proliferation. Further, this defect could be rescued with Mstn-specific siRNAs. These data indicate that NUMB is necessary for postnatal muscle repair and early proliferative expansion of satellite cells. To examine the processes controlling satellite cell fate decisions in an evolutionary context, primary satellite cell lines were generated from Anolis carolinensis. This green anole lizard is evolutionarily the closest animal to mammals that forms de novo muscle tissue while undergoing tail regeneration. The mechanism of regeneration in anoles and the sources of stem cells for skeletal muscle, cartilage, and nerves are poorly understood. Thus, satellite cells were isolated from A. carolinensis and analyzed for their plasticity. Anole satellite cells show increased plasticity compared to mouse, as determined by expression of key markers specific for bone and cartilage without administration of exogenous morphogens. These novel data suggest that satellite cells might contribute to more than muscle in tail regeneration of A. carolinensis.
Contributors: George, Rajani M (Author) / Wilson-Rawls, Jeanne (Thesis advisor) / Rawls, Alan (Committee member) / Whitfield, Kerr (Committee member) / Kusumi, Kenro (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
While a number of vertebrates, including fishes, salamanders, frogs, and lizards, display regenerative capacity, the process is not necessarily the same. It has been proposed that regeneration, while evolutionarily conserved, has diverged during evolution. However, the extent to which the mechanisms of regeneration have changed between taxa remains elusive. In the salamander limb, cells dedifferentiate to a more plastic state and aggregate in the distal portion of the appendage to form a blastema, which is responsible for outgrowth and tissue development. In contrast, no such mechanism has been identified in lizards, and it is unclear to what extent evolutionary divergence between amniotes and anamniotes has altered this mechanism. Anolis carolinensis lizards are capable of regenerating their tails after stress-induced autotomy or self-amputation. In this investigation, the distribution of proliferating cells in early A. carolinensis tail regeneration was visualized by immunohistochemistry to examine the location and quantity of proliferating cells. An aggregate of proliferating cells at the distal region of the regenerate is considered indicative of blastema formation. Proliferating cell nuclear antigen (PCNA) and minichromosome maintenance complex component 2 (MCM2) were used as proliferation markers. Positive cells were counted for each tail (n=9 and n=8, respectively). The percentages of proliferating cells at the tip and base of the regenerating tail were compared with a one-way ANOVA. Neither marker showed a significant difference (P=0.585 and P=0.603, respectively), indicating the absence of a blastema-like structure. These results suggest an alternative mechanism of regeneration in lizards and potentially other amniotes.
Contributors: Tokuyama, Minami Adrianne (Author) / Kusumi, Kenro (Thesis director) / Wilson-Rawls, Jeanne (Committee member) / Menke, Douglas (Committee member) / Barrett, The Honors College (Contributor) / Department of Chemistry and Biochemistry (Contributor) / School of Life Sciences (Contributor)
Created: 2014-05
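The tip-versus-base comparison in the abstract above relies on a one-way ANOVA. As a rough illustration only, the F statistic can be sketched as follows; the function name and the sample percentages are hypothetical and are not taken from the thesis, and the p-value (which needs the F distribution) is deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical percentages of marker-positive cells (made-up numbers).
tip = [41.0, 38.5, 44.2, 40.1]
base = [39.7, 42.3, 37.9, 43.0]
f_stat, df_b, df_w = one_way_anova_f([tip, base])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With only two groups, this F statistic equals the square of the pooled two-sample t statistic, so the test is equivalent to an unpaired t-test.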
Description
Damage to the central nervous system due to spinal cord or traumatic brain injury, as well as degenerative musculoskeletal disorders such as arthritis, drastically impacts quality of life. Regeneration of complex structures is quite limited in mammals, though other vertebrates possess this ability. Lizards are the organisms most closely related to humans that can regenerate de novo skeletal muscle, hyaline cartilage, spinal cord, vasculature, and skin. Progress in studying the cellular and molecular mechanisms of lizard regeneration has previously been limited by a lack of genomic resources. Building on the release of the genome of the green anole, Anolis carolinensis, we developed a second-generation, robust RNA-Seq-based genome annotation and performed the first transcriptomic analysis of tail regeneration in this species. To investigate gene expression in regenerating tissue, we performed whole transcriptome and microRNA transcriptome analysis of regenerating tail tip and base and associated tissues, identifying key genetic targets in the regenerative process. These studies have identified components of a genetic program for regeneration in the lizard that includes both developmental and adult repair mechanisms shared with mammals, indicating value in the translation of these findings to future regenerative therapies.
Contributors: Hutchins, Elizabeth (Author) / Kusumi, Kenro (Thesis advisor) / Rawls, Jeffrey A. (Committee member) / Denardo, Dale F. (Committee member) / Huentelman, Matthew J. (Committee member) / Arizona State University (Publisher)
Created: 2015
Description

Agassiz’s desert tortoise (Gopherus agassizii) is a long-lived species native to the Mojave Desert and is listed as threatened under the US Endangered Species Act. To aid conservation efforts for preserving the genetic diversity of this species, we generated a whole genome reference sequence with an annotation based on deep transcriptome sequences of adult skeletal muscle, lung, brain, and blood. The draft genome assembly for G. agassizii has a scaffold N50 length of 252 kbp and a total length of 2.4 Gbp. Genome annotation reveals 20,172 protein-coding genes in the G. agassizii assembly, and that gene structure is more similar to chicken than other turtles. We provide a series of comparative analyses demonstrating (1) that turtles are among the slowest-evolving genome-enabled reptiles, (2) amino acid changes in genes controlling desert tortoise traits such as shell development, longevity and osmoregulation, and (3) fixed variants across the Gopherus species complex in genes related to desert adaptations, including circadian rhythm and innate immune response. This G. agassizii genome reference and annotation is the first such resource for any tortoise, and will serve as a foundation for future analysis of the genetic basis of adaptations to the desert environment, allow for investigation into genomic factors affecting tortoise health, disease and longevity, and serve as a valuable resource for additional studies in this species complex.

Data Availability: All genomic and transcriptomic sequence files are available from the NIH-NCBI BioProject database (accession numbers PRJNA352725, PRJNA352726, and PRJNA281763). All genome assembly, transcriptome assembly, predicted protein, transcript, genome annotation, repeatmasker, phylogenetic trees, .vcf and GO enrichment files are available on Harvard Dataverse (doi:10.7910/DVN/EH2S9K).

Contributors: Tollis, Marc (Author) / DeNardo, Dale F (Author) / Cornelius, John A (Author) / Dolby, Greer A (Author) / Edwards, Taylor (Author) / Henen, Brian T. (Author) / Karl, Alice E. (Author) / Murphy, Robert W. (Author) / Kusumi, Kenro (Author)
Created: 2017-05-31
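The abstract above reports a scaffold N50 of 252 kbp. For readers unfamiliar with the statistic, N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch, assuming a plain list of scaffold lengths (the function name and toy lengths are hypothetical, not the tortoise assembly):

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example with made-up scaffold lengths.
print(scaffold_n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 depends only on the length distribution, it says nothing about assembly correctness; it is usually reported alongside total length, as the abstract does.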
Description
Rapid advancements in genomic technologies have increased our understanding of rare human disease. It is now feasible to generate multiple types of biological data, including genetic variation from the genome or exome, expression from the transcriptome, methylation patterns from the epigenome, protein complexity from the proteome, and metabolite information from the metabolome. "Omics" tools provide a comprehensive view into biological mechanisms that impact disease trait and risk. Despite the available data types and the ability to collect them simultaneously from patients, researchers still rely on analyzing each independently. Combining information from multiple biological data types can reduce missing information, increase confidence in single-data findings, and provide a more complete view of genotype-phenotype correlations. Although rare disease genetics has been greatly improved by exome sequencing, a substantial portion of clinical patients remain undiagnosed. Multiple frameworks for integrative analysis of genomic and transcriptomic data are presented, with a focus on identifying functional genetic variations in patients with undiagnosed, rare childhood conditions. Direct quantitation of the X inactivation ratio was developed from genomic and transcriptomic data, using allele-specific expression and segregation analysis to determine the magnitude and inheritance mode of X inactivation. This approach was applied in two families, revealing non-random X inactivation in female patients. Expression-based analysis of X inactivation showed high correlation with the standard clinical assay. These findings improved understanding of the molecular mechanisms underlying X-linked disorders. In addition, multivariate outlier analysis of gene- and exon-level RNA-seq data using Mahalanobis distance, with the distance scores integrated with genomic data, found genotype-phenotype correlations during variant prioritization in 25 families. Mahalanobis distance scores revealed variants with large transcriptional impact in patients. In this dataset, frameshift variants were more likely to result in outlier expression signatures than other types of functional variants. Integration of outlier estimates with genetic variants corroborated previously identified, presumed causal variants and highlighted a new candidate in a previously undiagnosed case. Integrative genomic approaches in easily attainable tissue will facilitate the search for biomarkers that impact disease trait, uncover pharmacogenomic targets, provide novel insight into the molecular underpinnings of uncharacterized conditions, and help improve analytical approaches that use large datasets.
Contributors: Szelinger, Szabolcs (Author) / Craig, David W. (Thesis advisor) / Kusumi, Kenro (Thesis advisor) / Narayan, Vinodh (Committee member) / Rosenberg, Michael S. (Committee member) / Huentelman, Matthew J (Committee member) / Arizona State University (Publisher)
Created: 2015
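The abstract above scores expression outliers with Mahalanobis distance. The thesis pipeline itself is not described in this listing, so the following is only an illustrative sketch for the two-feature case, where the covariance matrix can be inverted by hand. The function name and the idea of pairing two expression features per sample are assumptions made for the example.

```python
from statistics import mean


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the sample mean.

    Uses the sample covariance (n - 1 denominator) of the same points.
    """
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    # Closed-form inverse of the 2x2 covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        d2 = dx * dx * ixx + 2 * dx * dy * ixy + dy * dy * iyy
        distances.append(d2 ** 0.5)
    return distances


# Made-up values: five samples follow a trend, the last one departs from it.
samples = [(1, 1), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8), (3, 6)]
scores = mahalanobis_2d(samples)
print(max(range(len(scores)), key=scores.__getitem__))
```

Unlike Euclidean distance, the score accounts for correlation between features, so a sample can be flagged even when each of its values is individually unremarkable. Real data with many features would need a general matrix inverse and, typically, a robust covariance estimate.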
Description
In the U.S., breast cancer (BC) incidence among African American (AA) and Caucasian (CA) women is similar, yet AA women have a significantly higher mortality rate. In addition, AA women often present with tumors at a younger age and with a higher tumor grade/stage, and are more likely to be diagnosed with the highly aggressive triple-negative breast cancer (TNBC) subtype. Even within the TNBC subtype, AA women have a worse clinical outcome than CA women. Although multiple socio-economic and lifestyle factors may contribute to these observed health disparities, it is essential that the underlying biological differences between CA and AA TNBC are identified. In this study, gene expression profiling was performed on archived FFPE samples obtained from CA and AA women diagnosed with early-stage TNBC. Initial analysis revealed a pattern of differential expression in the AA cohort compared to CA. Further molecular characterization showed that the AA cohort segregated into three TNBC molecular subtypes: Basal-like (BL2), Immunomodulatory (IM), and Mesenchymal (M). Gene expression analyses identified 190 differentially expressed genes between the AA and CA cohorts. Pathway enrichment analysis demonstrated that differentially expressed genes were over-represented in cytoskeletal remodeling, cell adhesion, tight junctions, and immune response in the AA TNBC cohort. Furthermore, genes in the Wnt/β-catenin pathway were over-expressed. These results were validated using RT-qPCR on an independent cohort of FFPE samples from AA and CA women with early-stage TNBC, which identified Caveolin-1 (CAV1) as being significantly expressed in the AA TNBC cohort. Furthermore, CAV1 was shown to be highly expressed in a cell line panel of TNBC, in particular in lines of the mesenchymal and basal-like molecular subtypes. Finally, silencing of CAV1 expression by siRNA resulted in a significant decrease in proliferation in each of the TNBC cell lines. These observations suggest that CAV1 expression may contribute to the more aggressive phenotype observed in AA women diagnosed with TNBC.
Contributors: Getz, Julie (Author) / Baumbach-Reardon, Lisa L (Thesis advisor) / Lake, Douglas F (Thesis advisor) / Bussey, Kimberly (Committee member) / Kusumi, Kenro (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Understanding the temporal and spatial characteristics of gene expression over brain development is a crucial research topic in neuroscience. An accurate description of the locations and expression status of the relevant genes requires extensive experimental resources. The Allen Developing Mouse Brain Atlas provides a large number of in situ hybridization (ISH) images of gene expression over seven mouse brain developmental stages. Studying mouse brain models helps us understand gene expression in human brains. The atlas covers thousands of genes, which are currently annotated manually by biologists. Because manual annotation is labor-intensive and costly, an efficient approach to automated gene expression annotation on mouse brain images is needed. In this thesis, a novel, efficient approach based on a machine learning framework is proposed. Features are extracted from raw brain images, and both binary and multi-class classification models are built with supervised learning methods. One of the most widely adopted methods for generating features in current research is the bag-of-words (BoW) algorithm. However, neither the efficiency nor the accuracy of BoW is outstanding when dealing with large-scale data. Thus, an augmented sparse coding method, called Stochastic Coordinate Coding, is adopted to generate high-level features in this thesis. In addition, a new multi-label classification model is proposed, in which a label hierarchy is built from the given brain ontology structure. Experiments conducted on the atlas show that this approach is efficient and classifies the images with relatively high accuracy.
Contributors: Zhao, Xinlin (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Thesis advisor) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Drosophila melanogaster, an important model organism, is used to explore the mechanisms that govern cell differentiation and embryonic development. Understanding these mechanisms will help reveal the effects of genes in other species, including humans. Digital camera techniques now make high-quality Drosophila gene expression imaging possible, and advances in biology mean that gene expression images revealing spatiotemporal patterns are generated at a high-throughput pace. Thus, an automated and efficient system that can analyze gene expression will become a necessary tool for investigating gene functions, interactions, and developmental processes. One investigation method is to compare the expression patterns of different developmental stages. Currently, however, expression patterns are manually annotated with rough stage ranges, and the annotation work requires professional knowledge from experienced biologists. Hence, transferring this biological domain knowledge into a system that can annotate the patterns automatically poses a challenging problem for computer scientists. In this thesis, the problem of stage annotation for Drosophila embryos is modeled in a machine learning framework. Three sparse learning algorithms and one ensemble algorithm are used to attack the problem. The sparse algorithms are Lasso, group Lasso, and sparse group Lasso. The ensemble algorithm is based on a voting method. The proposed algorithms annotate patterns to individual stages, rather than stage ranges, with high accuracy; in addition, the decimal stage annotation algorithm presents a novel way to annotate patterns to decimal stages. Some analysis of algorithm performance is also provided, with corresponding explanations. Finally, with the proposed system, all the lateral view BDGP and FlyFish images are annotated, and several interesting applications of the decimal stage value are revealed.
Contributors: Pan, Cheng (Author) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Farin, Gerald (Committee member) / Arizona State University (Publisher)
Created: 2012
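The three sparse models named in the abstract above (Lasso, group Lasso, sparse group Lasso) differ mainly in the shrinkage step applied to the coefficients. The sketch below shows only those shrinkage (proximal) operators for non-overlapping groups. It is not the solver used in the thesis, and all function names, the group layout, and the penalty values are assumptions chosen for illustration.

```python
import math


def soft_threshold(x, lam):
    """Lasso shrinkage: pull every coefficient toward zero by lam."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in x]


def group_shrink(x, lam):
    """Group Lasso shrinkage: scale a whole group, or zero it out entirely."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= lam:
        return [0.0] * len(x)
    scale = 1.0 - lam / norm
    return [scale * v for v in x]


def sparse_group_prox(x, groups, lam1, lam2):
    """Sparse group Lasso shrinkage for non-overlapping groups.

    Applies element-wise soft-thresholding (lam1) and then group-wise
    shrinkage (lam2) to each index group.
    """
    z = soft_threshold(x, lam1)
    out = list(z)
    for idx in groups:
        shrunk = group_shrink([z[i] for i in idx], lam2)
        for i, v in zip(idx, shrunk):
            out[i] = v
    return out


# Hypothetical coefficient vector with two feature groups.
coeffs = [4.0, 5.0, 0.5, 0.2]
print(sparse_group_prox(coeffs, groups=[[0, 1], [2, 3]], lam1=1.0, lam2=1.0))
```

The group step removes entire groups of weak features at once, while the element-wise step still allows individual zeros inside a surviving group. Overlapping groups require a different treatment than this simple composition.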