Theses and Dissertations
Description
While techniques for reading DNA in some capacity have been available for decades,
the ability to accurately edit genomes at scale has remained elusive. Novel techniques
have been introduced recently to aid in the writing of DNA sequences. While writing
DNA is more accessible, it still remains expensive, justifying the increased interest in
in silico predictions of cell behavior. In order to accurately predict the behavior of
cells it is necessary to extensively model the cell environment, including gene-to-gene
interactions as completely as possible.
Significant algorithmic advances have been made for identifying these interactions,
but despite these improvements current techniques fail to infer some edges, and
fail to capture some complexities in the network. Much of this limitation is due to
heavily underdetermined problems, whereby tens of thousands of variables are to be
inferred using datasets with the power to resolve only a small fraction of the variables.
Additionally, failure to correctly resolve gene isoforms using short reads contributes
significantly to noise in gene quantification measures.
This dissertation introduces novel mathematical models, machine learning techniques,
and biological techniques to solve the problems described above. Mathematical
models are proposed for simulation of gene network motifs and for raw read simulation.
Machine learning techniques are shown for DNA sequence matching and DNA
sequence correction.
Results provide novel insights into the low-level functionality of gene networks. Also
shown is the ability to use normalization techniques to aggregate data for gene network
inference, leading to larger data sets while minimizing increases in inter-experimental
noise. Results also demonstrate that the high error rates of third-generation
sequencing differ significantly from previous error profiles, and that these errors can be
modeled, simulated, and rectified. Finally, techniques are provided for correcting these
sequencing errors while preserving the benefits of third-generation sequencing.
Contributors: Faucon, Philippe Christophe (Author) / Liu, Huan (Thesis advisor) / Wang, Xiao (Committee member) / Crook, Sharon M (Committee member) / Wang, Yalin (Committee member) / Sarjoughian, Hessam S. (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
The rapid growth of social media in recent years provides a large amount of user-generated visual objects, e.g., images and videos. Advanced semantic understanding approaches on such visual objects are desired to better serve applications such as human-machine interaction, image retrieval, etc. Semantic visual attributes have been proposed and utilized in multiple visual computing tasks to bridge the so-called "semantic gap" between extractable low-level feature representations and high-level semantic understanding of the visual objects.
Despite years of research, there are still unsolved problems in semantic attribute learning. First, real-world applications usually involve hundreds of attributes, so acquiring a sufficient amount of labeled data for model learning requires great effort. Second, existing attribute learning work for visual objects focuses primarily on images, with semantic analysis of videos left largely unexplored.
In this dissertation I conduct innovative research and propose novel approaches to tackling the aforementioned problems. In particular, I propose robust and accurate learning frameworks on both attribute ranking and prediction by exploring the correlation among multiple attributes and utilizing various types of label information. Furthermore, I propose a video-based skill coaching framework by extending attribute learning to the video domain for robust motion skill analysis. Experiments on various types of applications and datasets and comparisons with multiple state-of-the-art baseline approaches confirm that my proposed approaches can achieve significant performance improvements for the general attribute learning problem.
Contributors: Chen, Lin (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Wang, Yalin (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created: 2016