Matching Items (76)
Description
Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scale up the optimization in parallel. Parallel solvers run on multiple cores in a shared-memory system or a distributed environment to speed up the computation, but their practical use is limited by the huge dimension of the feature space and by synchronization problems.

In this dissertation, I pursue this direction, focusing on scaling up the optimization of sparse learning for supervised and unsupervised learning problems. For supervised learning, I first propose an asynchronous parallel solver to optimize the large-scale sparse learning model in a multithreading environment. Moreover, I propose a distributed framework to conduct the learning process when the dataset is stored across different machines. The proposed model is then extended to the study of genetic risk factors for Alzheimer's Disease (AD) across different research institutions, integrating a group feature selection framework to rank the top risk SNPs for AD. For the unsupervised learning problem, I propose a highly efficient solver, termed Stochastic Coordinate Coding (SCC), that scales up the optimization of dictionary learning and sparse coding problems. In medical imaging research, a patient's longitudinal features from different time points are often best studied together. To further improve the dictionary learning model, I therefore propose a multi-task dictionary learning method that learns the different tasks simultaneously and uses shared and individual dictionaries to encode both consistent and changing imaging features.
Contributors: Li, Qingyang (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Wang, Yalin (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)
Created: 2017
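
As a rough illustration of the $\ell_1$-regularized loss minimization named in the record above, the following sketch solves a small least-squares problem with an $\ell_1$ penalty by plain sequential cyclic coordinate descent. It is not the dissertation's asynchronous or distributed solver; the function names, the squared loss, and the list-of-rows data layout are assumptions made for this example only.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |.|: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of n rows (each a list of d floats); y is a list of n floats.
    Hypothetical helper written for illustration, not taken from the source.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature cannot reduce the loss
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual consistent with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w
```

With two orthogonal features, for instance `lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]], [2.0, 0.1, 2.0, 0.1], lam=0.2)`, the penalty shrinks the first coefficient and sets the weakly supported second coefficient exactly to zero, which is the sparsity effect the $\ell_1$ term is used for.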
Description
The rapid development in acquiring multimodal neuroimaging data provides opportunities to systematically characterize human brain structures and functions. For example, in brain magnetic resonance imaging (MRI), a typical non-invasive imaging technique, different acquisition sequences (modalities) lead to different descriptions of brain functional activities or anatomical biomarkers. Nowadays, in addition to the traditional voxel-level analysis of images, there is a trend to process and investigate cross-modality relationships at a high-dimensional level of image representation, e.g., surfaces and networks.

In this study, I aim to achieve multimodal brain image fusion by referring to some intrinsic properties of data, e.g. geometry of embedding structures where the commonly used image features reside. Since the image features investigated in this study share an identical embedding space, i.e. either defined on a brain surface or brain atlas, where a graph structure is easy to define, it is straightforward to consider the mathematically meaningful properties of the shared structures from the geometry perspective.

I first introduce the background of multimodal fusion of brain image data and the insight that geometric properties can potentially link different modalities. Then, several proposed computational frameworks, using either solid and efficient geometric algorithms or current geometric deep learning models, are discussed in full. I show how these frameworks deal with distinct geometric properties, and their applications in real healthcare scenarios, e.g., enhanced detection of fetal brain diseases or abnormal brain development.
Contributors: Zhang, Wen (Author) / Wang, Yalin (Thesis advisor) / Liu, Huan (Committee member) / Li, Baoxin (Committee member) / Braden, B. Blair (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
This thesis introduces new techniques for clustering distributional data according to their geometric similarities. This work builds upon the optimal transportation (OT) problem, which seeks the global minimum cost for matching distributional data, and leverages the connection between OT and power diagrams to solve different clustering problems. The OT formulation is based on the variational principle to differentiate hard cluster assignments, which was missing in the literature. This thesis shows multiple techniques to regularize and generalize OT to cope with various tasks including clustering, aligning, and interpolating distributional data. It also discusses the connections of the new formulation to other OT and clustering formulations to better understand their gaps and the means to close them. Finally, this thesis demonstrates the advantages of the proposed OT techniques in solving machine learning problems and their downstream applications in computer graphics, computer vision, and image processing.
Contributors: Mi, Liang (Author) / Wang, Yalin (Thesis advisor) / Chen, Kewei (Committee member) / Karam, Lina (Committee member) / Li, Baoxin (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Image super-resolution (SR) is a low-level image processing task with many applications, such as medical imaging, satellite image processing, and video enhancement. Given a low-resolution image, it aims to reconstruct a high-resolution image. The problem is ill-posed since more than one high-resolution image can correspond to the same low-resolution image. To address this problem, a number of machine learning-based approaches have been proposed.

In this dissertation, I present my work on single image super-resolution (SISR) and accelerated magnetic resonance imaging (MRI) (a.k.a. super-resolution on MR images), followed by an investigation of transfer learning for accelerated MRI reconstruction. For SISR, a dictionary-based approach and two reconstruction-based approaches are presented. Specifically, a convex dictionary learning (CDL) algorithm is proposed that constrains the dictionary atoms to be nonnegative linear combinations of the training data, which is a natural, desired property. The two reconstruction-based methods make use of (i) joint regularization, where a group-residual-based regularization (GRR) and a ridge-regression-based regularization (3R) are combined; and (ii) collaborative representation and non-local self-similarity. After that, two deep learning approaches are proposed, aiming at reconstructing high-quality images from accelerated MRI acquisition. Residual Dense Blocks (RDB) and feedback connections are introduced in the proposed models. In the last chapter, the feasibility of transfer learning for accelerated MRI reconstruction is discussed.
Contributors: Ding, Pak Lun Kevin (Author) / Li, Baoxin (Thesis advisor) / Wu, Teresa (Committee member) / Wang, Yalin (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Recently, a well-designed and well-trained neural network can yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. But progress has been limited for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical limitation in the rapid deployment of neural networks. Besides, the current research scales poorly to a large number of unseen concepts and is passively spoon-fed with data and supervision.

To overcome the above data scarcity and generalization issues, in my dissertation, I first propose two unsupervised conventional machine learning algorithms, hyperbolic stochastic coding and multi-resemble multi-target low-rank coding, to solve the incomplete data and missing label problem. I further introduce a deep multi-domain adaptation network to leverage the power of deep learning by transferring the rich knowledge from a large labeled source dataset. I also devise a novel time-sequence dynamically hierarchical network that adaptively simplifies the network to cope with scarce data.

To learn a large number of unseen concepts, lifelong machine learning enjoys many advantages, including abstracting knowledge from prior learning and using the experience to help future learning, regardless of how much data is currently available. Incorporating this capability and making it versatile, I propose deep multi-task weight consolidation to accumulate knowledge continuously and significantly reduce data requirements in a variety of domains. Inspired by the recent breakthroughs in automatically learning suitable neural network architectures (AutoML), I develop a nonexpansive AutoML framework to train an online model without the abundance of labeled data. This work automatically expands the network to increase model capability when necessary, then compresses the model to maintain the model efficiency.

In my ongoing work, I propose an alternative method of supervised learning that does not require direct labels. It can use various forms of supervision from an image/object as a target value for supervising the target tasks without labels, and it turns out to be surprisingly effective. The proposed method requires only few-shot labeled data to train, and it can learn the information it needs in a self-supervised manner and generalize to datasets not seen during training.
Contributors: Zhang, Jie (Author) / Wang, Yalin (Thesis advisor) / Liu, Huan (Committee member) / Stonnington, Cynthia (Committee member) / Liang, Jianming (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2020
Description

The apolipoprotein E (APOE) e4 allele is the most prevalent genetic risk factor for Alzheimer's disease (AD). Hippocampal volumes are generally smaller in AD patients carrying the e4 allele compared to e4 noncarriers. Here we examined the effect of APOE e4 on hippocampal morphometry in a large imaging database, the Alzheimer's Disease Neuroimaging Initiative (ADNI). We automatically segmented and constructed hippocampal surfaces from the baseline MR images of 725 subjects with known APOE genotype information including 167 with AD, 354 with mild cognitive impairment (MCI), and 204 normal controls. High-order correspondences between hippocampal surfaces were enforced across subjects with a novel inverse consistent surface fluid registration method. Multivariate statistics consisting of multivariate tensor-based morphometry (mTBM) and radial distance were computed for surface deformation analysis. Using Hotelling's $T^2$ test, we found significant morphological deformation in APOE e4 carriers relative to noncarriers in the entire cohort as well as in the nondemented (pooled MCI and control) subjects, affecting the left hippocampus more than the right, and this effect was more pronounced in e4 homozygotes than heterozygotes. Our findings are consistent with previous studies that showed e4 carriers exhibit accelerated hippocampal atrophy; we extend these findings to a novel measure of hippocampal morphometry. Hippocampal morphometry has significant potential as an imaging biomarker of early stage AD.

Contributors: Shi, Jie (Author) / Lepore, Natasha (Author) / Gutman, Boris A. (Author) / Thompson, Paul M. (Author) / Baxter, Leslie C. (Author) / Caselli, Richard J. (Author) / Wang, Yalin (Author) / Ira A. Fulton Schools of Engineering (Contributor)
Created: 2014-08-01
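
For readers unfamiliar with the two-sample Hotelling's T2 test mentioned in the record above, the sketch below computes the statistic for two groups of two-dimensional observations using the pooled sample covariance. It is a simplified illustration only: the study's per-vertex surface features are of higher dimension, no p-value or permutation testing is included, and the function name and the restriction to 2-D data are assumptions of this example.

```python
def hotelling_t2_2d(group_a, group_b):
    """Two-sample Hotelling's T^2 statistic for 2-D observations.

    Each group is a list of (x, y) pairs. Hypothetical helper for illustration;
    it uses the pooled sample covariance and a closed-form 2x2 inverse.
    """
    def mean(group):
        n = len(group)
        return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)

    def scatter(group, m):
        # Sums of squares and cross-products around the group mean.
        sxx = sum((p[0] - m[0]) ** 2 for p in group)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in group)
        syy = sum((p[1] - m[1]) ** 2 for p in group)
        return sxx, sxy, syy

    na, nb = len(group_a), len(group_b)
    ma, mb = mean(group_a), mean(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = na + nb - 2
    # Pooled covariance entries.
    sxx, sxy, syy = ((sa[k] + sb[k]) / dof for k in range(3))
    det = sxx * syy - sxy ** 2
    if det <= 0.0:
        raise ValueError("pooled covariance is singular")
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Quadratic form d^T S^{-1} d written out with the 2x2 inverse.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return (na * nb / (na + nb)) * quad
```

A larger value indicates a larger difference between the group means relative to the pooled within-group spread; two identical groups give a statistic of zero.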
Description

Mild Cognitive Impairment (MCI) is a transitional stage between normal aging and dementia, and people with MCI are at high risk of progression to dementia. MCI is attracting increasing attention, as it offers an opportunity to target the disease process during an early symptomatic stage. Structural magnetic resonance imaging (MRI) measures have been the mainstay of Alzheimer's disease (AD) imaging research; however, ventricular morphometry analysis remains challenging because of its complicated topological structure. Here we describe a novel ventricular morphometry system based on the hyperbolic Ricci flow method and tensor-based morphometry (TBM) statistics. Unlike prior ventricular surface parameterization methods, hyperbolic conformal parameterization is angle-preserving and does not have any singularities. Our system generates a one-to-one diffeomorphic mapping between ventricular surfaces with consistent boundary matching conditions. The TBM statistics encode a great deal of surface deformation information that could be inaccessible or overlooked by other methods. We applied our system to the baseline MRI scans of a set of MCI subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI: 71 MCI converters vs. 62 MCI stable). Although the combined ventricular area and volume features did not differ between the two groups, our fine-grained surface analysis revealed significant differences in the ventricular regions close to the temporal lobe and posterior cingulate, structures that are affected early in AD. Significant correlations were also detected between ventricular morphometry, neuropsychological measures, and a previously described imaging index based on fluorodeoxyglucose positron emission tomography (FDG-PET) scans. This novel ventricular morphometry method may offer a new and more sensitive approach to study preclinical and early symptomatic stage AD.

Contributors: Shi, Jie (Author) / Stonnington, Cynthia M. (Author) / Thompson, Paul M. (Author) / Chen, Kewei (Author) / Gutman, Boris (Author) / Reschke, Cole (Author) / Baxter, Leslie C. (Author) / Reiman, Eric M. (Author) / Caselli, Richard J. (Author) / Wang, Yalin (Author) / Ira A. Fulton Schools of Engineering (Contributor)
Created: 2015-01-01
Description

Although emerging evidence indicates that deep-sea water contains an untapped reservoir of high metabolic and genetic diversity, this realm has not been studied well compared with surface sea water. The study provided the first integrated metagenomic and metatranscriptomic analysis of the microbial communities in deep-sea water of the North Pacific Ocean. DNA/RNA amplifications and simultaneous metagenomic and metatranscriptomic analyses were employed to discover information concerning deep-sea microbial communities from four different deep-sea sites ranging from the mesopelagic to pelagic ocean. Within the prokaryotic community, bacteria are overwhelmingly dominant (~90%) over archaea in both metagenomic and metatranscriptomic data pools. The emergence of archaeal phyla Crenarchaeota, Euryarchaeota, Thaumarchaeota, bacterial phyla Actinobacteria, Firmicutes, sub-phyla Betaproteobacteria, Deltaproteobacteria, and Gammaproteobacteria, and the decrease of bacterial phyla Bacteroidetes and Alphaproteobacteria are the main composition changes of prokaryotic communities in the deep-sea water, when compared with the reference Global Ocean Sampling Expedition (GOS) surface water. Photosynthetic Cyanobacteria exist in all four metagenomic libraries and two metatranscriptomic libraries. In the eukaryotic community, decreased abundance of fungi and algae in the deep sea was observed. The RNA/DNA ratio was employed as an index of the metabolic activity of microbes in the deep sea. Functional analysis indicated that deep-sea microbes are leading a defensive lifestyle.

Contributors: Wu, Jieying (Author) / Gao, Weimin (Author) / Johnson, Roger (Author) / Zhang, Weiwen (Author) / Meldrum, Deirdre (Author) / Biodesign Institute (Contributor)
Created: 2013-10-11
Description

Background: The use of culture-independent nucleic acid techniques, such as ribosomal RNA gene cloning library analysis, has unveiled the tremendous microbial diversity that exists in natural environments. In sharp contrast to this great achievement is the current difficulty in cultivating the majority of bacterial species or phylotypes revealed by molecular approaches. Although recent new technologies such as metagenomics and metatranscriptomics can provide more functionality information about the microbial communities, it is still important to develop the capacity to isolate and cultivate individual microbial species or strains in order to gain a better understanding of microbial physiology and to apply isolates for various biotechnological applications.

Results: We have developed a new system to cultivate bacteria in an array of droplets. The key component of the system is the microbe observation and cultivation array (MOCA), which consists of a Petri dish that contains an array of droplets as cultivation chambers. MOCA exploits the dominance of surface tension in small amounts of liquid to spontaneously trap cells in well-defined droplets on hydrophilic patterns. During cultivation, the growth of the bacterial cells across the droplet array can be monitored using an automated microscope, which can produce a real-time record of the growth. When bacterial cells grow to a visible microcolony level in the system, they can be transferred using a micropipette for further cultivation or analysis.

Conclusions: MOCA is a flexible system that is easy to set up, and provides the sensitivity to monitor growth of single bacterial cells. It is a cost-efficient technical platform for bioassay screening and for cultivation and isolation of bacteria from natural environments.

Contributors: Gao, Weimin (Author) / Navarroli, Dena (Author) / Naimark, Jared (Author) / Zhang, Weiwen (Author) / Chao, Shih-hui (Author) / Meldrum, Deirdre (Author) / Biodesign Institute (Contributor)
Created: 2013-01-09
Description

Background: Heterogeneity within cell populations is relevant to the onset and progression of disease, as well as development and maintenance of homeostasis. Analysis and understanding of the roles of heterogeneity in biological systems require methods and technologies that are capable of single cell resolution. Single cell gene expression analysis by RT-qPCR is an established technique for identifying transcriptomic heterogeneity in cellular populations, but it generally requires specialized equipment or tedious manipulations for cell isolation.

Results: We describe the optimization of a simple, inexpensive and rapid pipeline which includes isolation and culture of live single cells as well as fluorescence microscopy and gene expression analysis of the same single cells by RT-qPCR. We characterize the efficiency of single cell isolation and demonstrate our method by identifying single GFP-expressing cells from a mixed population of GFP-positive and negative cells by correlating fluorescence microscopy and RT-qPCR.

Conclusions: Single cell gene expression analysis by RT-qPCR is a convenient means for investigating cellular heterogeneity, but is most useful when correlating observations with additional measurements. We demonstrate a convenient and simple pipeline for multiplexing single cell RT-qPCR with fluorescence microscopy which is adaptable to other molecular analyses.

Contributors: Yaron, Jordan (Author) / Ziegler, Colleen (Author) / Tran, Thai (Author) / Glenn, Honor (Author) / Meldrum, Deirdre (Author) / Biodesign Institute (Contributor)
Created: 2014-05-08