This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations and theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.

Description
A major goal of synthetic biology is to recapitulate emergent properties of life. Despite a significant body of work, a longstanding question that remains to be answered is how such a complex system arose. In this dissertation, synthetic nucleic acid molecules with alternative sugar-phosphate backbones were investigated as potential ancestors of DNA and RNA. Threose nucleic acid (TNA) is capable of forming stable helical structures with complementary strands of itself and RNA. This provides a plausible mechanism for genetic information transfer between TNA and RNA; therefore, TNA has been proposed as a potential RNA progenitor. Using molecular evolution, functional sequences were isolated from a pool of random TNA molecules. This points to a possible chemical framework capable of crosstalk between TNA and RNA, and it shows that heredity and evolution are not limited to the natural genetic system based on ribofuranosyl nucleic acids. Another alternative genetic system, glycerol nucleic acid (GNA), undergoes intrasystem pairing with superior thermal stability compared to that of DNA. Inspired by this property, I demonstrated a minimal nanostructure composed of both left- and right-handed mirror-image GNA. This work suggests that GNA could serve as a promising orthogonal material in structural DNA nanotechnology.
ContributorsZhang, Su (Author) / Chaput, John C (Thesis advisor) / Ghirlanda, Giovanna (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)
Created2011
Description
Modern, advanced statistical tools from data mining and machine learning have become commonplace in molecular biology in large part because of the “big data” demands of various kinds of “-omics” (e.g., genomics, transcriptomics, metabolomics, etc.). However, in other fields of biology where empirical data sets are conventionally smaller, more traditional statistical methods of inference are still very effective and widely used. Nevertheless, with the decrease in cost of high-performance computing, these fields are starting to employ simulation models to generate insights into questions that have been elusive in the laboratory and field. Although these computational models allow for exquisite control over large numbers of parameters, they also generate data at a qualitatively different scale than most experts in these fields are accustomed to. Thus, more sophisticated methods from big-data statistics have an opportunity to better facilitate the often-forgotten area of bioinformatics that might be called “in-silicomics”.

As a case study, this thesis develops methods for the analysis of large amounts of data generated from a simulated ecosystem designed to understand how mammalian biomechanics interact with environmental complexity to modulate the outcomes of predator–prey interactions. These simulations investigate which biomechanical parameters relating to the agility of animals in predator–prey pairs best predict pursuit outcomes. Traditional modelling techniques such as forward, backward, and stepwise variable selection are initially used to study these data, but the number of parameters and potentially relevant interaction effects render these methods impractical. Consequently, newer modelling techniques such as LASSO regularization are used and compared to the traditional techniques in terms of accuracy and computational complexity. Finally, the splitting rules and instances in the leaves of classification trees provide the basis for future simulation with an economical number of additional runs. In general, this thesis shows the increased utility of these sophisticated statistical techniques with simulated ecological data compared to the approaches traditionally used in these fields. These techniques, combined with methods from industrial Design of Experiments, will help ecologists extract novel insights from simulations that combine habitat complexity, population structure, and biomechanics.
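The contrast drawn above between stepwise variable selection and LASSO regularization can be sketched as follows. This is a minimal illustration on simulated data, not the thesis's actual pursuit-outcome dataset; all variable counts and coefficients are hypothetical.

```python
# Sketch: LASSO with cross-validated penalty shrinks irrelevant coefficients
# to exactly zero, avoiding the combinatorial cost of forward/backward
# stepwise search over many candidate parameters and interactions.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 50                        # many candidate predictors
X = rng.normal(size=(n, p))
true_coef = np.zeros(p)
true_coef[:3] = [2.0, -1.5, 1.0]      # only a few truly matter
y = X @ true_coef + rng.normal(scale=0.5, size=n)

lasso = LassoCV(cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(np.abs(lasso.coef_) > 1e-6)
print("selected predictors:", selected)
```

A stepwise search over the same 50 predictors would need to refit the model at every candidate addition or deletion, which is the computational-complexity disadvantage noted above.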
ContributorsSeto, Christian (Author) / Pavlic, Theodore (Thesis advisor) / Li, Jing (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)
Created2018
Description
There are many biological questions that require single-cell analysis of gene sequences, including analysis of clonally distributed dimeric immunoreceptors on lymphocytes (T cells and B cells) and/or the accumulation of driver/accessory mutations in polyclonal tumors. Lysis of bulk cell populations results in mixing of gene sequences, making it impossible to know which pairs of gene sequences originated from any particular cell and obfuscating analysis of rare sequences within large populations. Although current single-cell sorting technologies can be used to address some of these questions, such approaches are expensive, require specialized equipment, and lack the necessary high-throughput capacity for comprehensive analysis. Water-in-oil emulsion approaches for single cell sorting have been developed but droplet-based single-cell lysis and analysis have proven inefficient and yield high rates of false pairings. Ideally, molecular approaches for linking gene sequences from individual cells could be coupled with next-generation high-throughput sequencing to overcome these obstacles, but conventional approaches for linking gene sequences, such as by transfection with bridging oligonucleotides, result in activation of cellular nucleases that destroy the template, precluding this strategy. Recent advances in the synthesis and fabrication of modular deoxyribonucleic acid (DNA) origami nanostructures have resulted in new possibilities for addressing many current and long-standing scientific and technical challenges in biology and medicine. One exciting application of DNA nanotechnology is the intracellular capture, barcode linkage, and subsequent sequence analysis of multiple messenger RNA (mRNA) targets from individual cells within heterogeneous cell populations. 
DNA nanostructures can be transfected into individual cells to capture and protect mRNA for specific expressed genes, and incorporation of origami-specific bowtie-barcodes into the origami nanostructure facilitates pairing and analysis of mRNA from individual cells by high-throughput next-generation sequencing. This approach is highly modular and can be adapted to virtually any two (and possibly more) gene target sequences, and therefore has a wide range of potential applications for analysis of diverse cell populations such as understanding the relationship between different immune cell populations, development of novel immunotherapeutic antibodies, or improving the diagnosis or treatment for a wide variety of cancers.
ContributorsSchoettle, Louis (Author) / Blattman, Joseph N (Thesis advisor) / Yan, Hao (Committee member) / Chang, Yung (Committee member) / Lindsay, Stuart (Committee member) / Arizona State University (Publisher)
Created2017
Description
Transfer learning is a sub-field of statistical modeling and machine learning. It refers to methods that integrate the knowledge of other domains (called source domains) and the data of the target domain in a mathematically rigorous and intelligent way, to develop a better model for the target domain than a model using the data of the target domain alone. While transfer learning is a promising approach in various application domains, my dissertation research focuses on the particular application in health care, including telemonitoring of Parkinson’s Disease (PD) and radiomics for glioblastoma.

The first topic is a Mixed Effects Transfer Learning (METL) model that can flexibly incorporate mixed effects and a general-form covariance matrix to better account for similarity and heterogeneity across subjects. I further develop computationally efficient procedures to handle unknown parameters and large covariance structures. Domain relations, such as domain similarity and domain covariance structure, are automatically quantified in the estimation steps. I demonstrate METL in an application of smartphone-based telemonitoring of PD.

The second topic focuses on an MRI-based transfer learning algorithm for non-invasive surgical guidance of glioblastoma patients. Limited biopsy samples per patient make it challenging to build a patient-specific model for glioblastoma. A transfer learning framework helps to leverage other patients' knowledge to build a better predictive model. When modeling a target patient, however, not every other patient's information is helpful, so deciding which subset of patients to transfer information from is an important task in building an accurate predictive model. I define the subset of "transferrable" patients as those who have a positive rCBV-cell density correlation, because a positive correlation is supported by imaging theory and the respective literature.
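The selection rule described above, keeping only source patients with a positive rCBV-cell density correlation, can be sketched as follows. The data, function names, and threshold here are hypothetical illustrations, not the dissertation's actual implementation.

```python
# Sketch (hypothetical data and names): filter source patients by the sign
# of the Pearson correlation between their rCBV measurements and their
# biopsy-measured cell densities.
import numpy as np

def transferrable_patients(patients, min_corr=0.0):
    """Return IDs of patients whose rCBV-cell density correlation is positive."""
    keep = []
    for pid, (rcbv, density) in patients.items():
        r = np.corrcoef(rcbv, density)[0, 1]   # Pearson correlation
        if r > min_corr:
            keep.append(pid)
    return keep

# Toy example: patient A correlates positively, patient B negatively.
patients = {
    "A": (np.array([1.0, 2.0, 3.0, 4.0]), np.array([10.0, 12.0, 15.0, 18.0])),
    "B": (np.array([1.0, 2.0, 3.0, 4.0]), np.array([18.0, 15.0, 12.0, 10.0])),
}
print(transferrable_patients(patients))  # → ['A']
```

Only patients passing this filter would contribute knowledge when modeling the target patient.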

The last topic is a Privacy-Preserving Positive Transfer Learning (P3TL) model. Although negative transfer has been recognized as an important issue by the transfer learning research community, there is a lack of theoretical studies evaluating the risk of negative transfer for a transfer learning method and identifying what causes it. My work addresses this issue. Driven by the theoretical insights, I extend Bayesian Parameter Transfer (BPT) to a new method, P3TL. The unique features of P3TL include intelligent selection of patients to transfer from, in order to avoid negative transfer, while maintaining patient privacy. These features make P3TL an excellent model for telemonitoring of PD using an At-Home Testing Device.
ContributorsYoon, Hyunsoo (Author) / Li, Jing (Thesis advisor) / Wu, Teresa (Committee member) / Yan, Hao (Committee member) / Hu, Leland S. (Committee member) / Arizona State University (Publisher)
Created2018
Description
Aging-related damage and failure in structures, such as fatigue cracking, corrosion, and delamination, are critical for structural integrity. Most engineering structures contain embedded defects such as voids, cracks, and inclusions introduced during manufacturing. The properties and locations of embedded defects are generally unknown and hard to detect in complex engineering structures. Therefore, early detection of damage is beneficial for the prognosis and risk management of aging infrastructure systems.

Non-destructive testing (NDT) and structural health monitoring (SHM) are widely used for this purpose. Different types of NDT techniques have been proposed for damage detection, such as optical imaging, ultrasound waves, thermography, eddy currents, and microwaves. The focus of this study is on wave-based detection methods, which fall into two major categories: feature-based damage detection and model-assisted damage detection. Both approaches have their own pros and cons. Feature-based damage detection is usually very fast and does not require solving the physical model. The key idea is dimension reduction of the signals to achieve efficient damage detection. The disadvantage is that the loss of information due to feature extraction can induce significant uncertainties and reduce the resolution; the resolution of the feature-based approach depends heavily on the density of sensing paths. Model-assisted damage detection has the opposite trade-offs: it is capable of high-resolution imaging with a limited number of sensing paths, since the entire signal histories are used for damage identification, but it is time-consuming due to the requirement for an inverse wave propagation solution, especially for large 3D structures.

The motivation of the proposed method is to develop an efficient and accurate model-based damage imaging technique that works with limited data. The special focus is on the efficiency of the damage imaging algorithm, as this is the major bottleneck of the model-assisted approach. The computational efficiency is achieved by two complementary components. First, a fast forward wave propagation solver is developed, which is verified against the classical finite element method (FEM) solution and is 10-20 times faster. Next, an efficient inverse wave propagation algorithm is proposed. Classical gradient-based optimization algorithms usually require the finite difference method for gradient calculation, which is prohibitively expensive for large numbers of degrees of freedom. An adjoint method-based optimization algorithm is proposed, which avoids the repetitive finite difference calculations for every imaging variable. Thus, superior computational efficiency can be achieved by combining these two methods for damage imaging. A coupled piezoelectric (PZT) damage imaging model is proposed to include the interaction between the PZT and the host structure. Following the formulation of the framework, experimental validation is performed on isotropic and anisotropic materials with defects such as cracks, delamination, and voids. The results show that the proposed method can detect and reconstruct multiple damage sites simultaneously and efficiently, making it promising for application to complex, large-scale engineering structures.
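The efficiency argument above, that an adjoint gradient replaces one finite-difference forward solve per imaging variable, can be illustrated on a toy linear inverse problem. This is a generic sketch, not the dissertation's wave solver; the operator and data are random stand-ins.

```python
# Sketch: for a least-squares misfit J(m) = 0.5||A m - d||^2, the adjoint
# gradient costs one residual plus one adjoint (transpose) application,
# while finite differences need one forward evaluation per variable.
import numpy as np

rng = np.random.default_rng(1)
p = 8                                  # number of imaging variables
A = rng.normal(size=(20, p))           # stand-in for the forward operator
d = rng.normal(size=20)                # observed data
m = rng.normal(size=p)                 # current model estimate

def misfit(m):
    r = A @ m - d
    return 0.5 * r @ r

# Adjoint gradient: grad J = A^T (A m - d), independent of p forward solves.
grad_adjoint = A.T @ (A @ m - d)

# Finite-difference gradient: p perturbed forward evaluations.
eps = 1e-6
grad_fd = np.array([(misfit(m + eps * e) - misfit(m)) / eps
                    for e in np.eye(p)])

print(np.max(np.abs(grad_adjoint - grad_fd)))  # agree up to FD truncation error
```

For a 3D wave-imaging problem, each "forward evaluation" is a full wave propagation solve, which is why avoiding the per-variable finite differences dominates the overall cost.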
ContributorsChang, Qinan (Author) / Liu, Yongming (Thesis advisor) / Mignolet, Marc (Committee member) / Chattopadhyay, Aditi (Committee member) / Yan, Hao (Committee member) / Ren, Yi (Committee member) / Arizona State University (Publisher)
Created2019
Description
DNA nanotechnology has been a rapidly growing research field in recent decades, and there have been extensive efforts to construct various types of highly programmable and robust DNA nanostructures. Because DNA nanostructures can be used to organize biochemical molecules with precisely controlled spatial resolution, we used them here as scaffolds for biological applications. Targeted cell-cell interaction was reconstituted through a DNA-scaffolded multivalent bispecific aptamer, which may hold promise for tumor therapeutics. In addition, a synthetic vaccine was constructed using a DNA nanostructure as a platform to assemble a model antigen and an immunoadjuvant together, and a strong antibody response was demonstrated in vivo, highlighting the potential of DNA nanostructures to serve as a new platform for vaccine construction. A DNA-scaffolded hapten vaccine was therefore further constructed and tested for its antibody response. Taken together, my research demonstrates the potential of DNA nanostructures to serve as a general platform for immunological applications.
ContributorsLiu, Xiaowei (Author) / Liu, Yan (Thesis advisor) / Chang, Yung (Thesis advisor) / Yan, Hao (Committee member) / Allen, James (Committee member) / Zhang, Peiming (Committee member) / Arizona State University (Publisher)
Created2012
Description
In eukaryotes, DNA is packed in a highly condensed and hierarchically organized structure called chromatin, in which 147 base pairs of DNA wrap tightly around the histone octamer, consisting of one histone 3-histone 4 (H3-H4) tetramer and two histone 2A-histone 2B (H2A-H2B) dimers, in almost two left-handed turns. Almost all DNA-dependent cellular processes, such as DNA duplication, transcription, DNA repair, and recombination, take place in the chromatin form. Given the critical importance of appropriate chromatin condensation, this thesis focused on the folding behavior of nucleosome arrays reconstituted using different templates with various controllable factors such as histone tail modification, linker DNA length, and DNA binding proteins. First, the folding behaviors of wild-type (WT) nucleosome arrays and arrays reconstituted with histone H4 acetylated at lysine 16 (H4K16(Ac)) were studied. In contrast to the sedimentation result, atomic force microscopy (AFM) measurements revealed no apparent difference between the compact WT and H4K16(Ac) nucleosome arrays. Instead, an optimal loading of nucleosomes along the template was found to be necessary for the Mg2+-induced nucleosome array compaction. This finding led to a further study on the role of linker DNA in nucleosome compaction. A method of constructing DNA templates with varied linker DNA lengths was developed, and uniformly and randomly spaced nucleosome arrays with average linker DNA lengths of 30 bp and 60 bp were constructed. After comprehensive analyses of the nucleosome arrays' structure on a mica surface, the lengths of the linker DNA were found to play an important role in controlling the structural geometries of nucleosome arrays in both their extended and compact forms. In addition, a higher concentration of the DNA binding domain of telomere repeat factor 2 (TRF2) was found to stimulate the compaction of the telomeric nucleosome array.
Finally, AFM was successfully applied to investigate nucleosome positioning on the Mouse Mammary Tumor Virus (MMTV) promoter region, and two highly positioned regions, corresponding to nucleosomes A and B, were identified by this method.
ContributorsFu, Qiang (Author) / Lindsay, Stuart M (Thesis advisor) / Yan, Hao (Committee member) / Ghirlanda, Giovanna (Committee member) / Arizona State University (Publisher)
Created2010
Description
Telomerase is a special reverse transcriptase that extends the linear chromosome termini in eukaryotes. Telomerase is also a unique ribonucleoprotein complex which is composed of the protein component called Telomerase Reverse Transcriptase (TERT) and a telomerase RNA component (TR). The enzyme from most vertebrate species is able to utilize a short template sequence within TR to synthesize a long stretch of telomeric DNA, an ability termed "repeat addition processivity". By using human telomerase reconstituted both in vitro (Rabbit Reticulocyte Lysate) and in vivo (293FT cells), I have demonstrated that a conserved motif in the reverse transcriptase domain of the telomerase protein is crucial for telomerase repeat addition processivity and rate. Furthermore, I have designed a "template-free" telomerase to show that RNA/DNA duplex binding is a critical step for telomere repeat synthesis. In an attempt to expand the understanding of vertebrate telomerase, I have studied RNA-protein interactions of telomerase from teleost fish. The teleost fish telomerase RNA (TR) is by far the smallest vertebrate TR identified, providing a valuable model for structural research.
ContributorsXie, Mingyi (Author) / Chen, Julian J.L. (Thesis advisor) / Yan, Hao (Committee member) / Wachter, Rebekka M. (Committee member) / Arizona State University (Publisher)
Created2010
Description
Uncertainty quantification is critical for engineering design and analysis. Determining appropriate ways of dealing with uncertainties has been a constant challenge in engineering. Statistical methods provide a powerful aid to describe and understand uncertainties. This work focuses on applying Bayesian methods and machine learning to uncertainty quantification and prognostics, with the mechanical properties of materials, both static and fatigue, as the main engineering application. This work can be summarized in the following items. First, maintaining the safety of vintage pipelines requires accurately estimating their strength. The objective is to predict the reliability-based strength using nondestructive multimodality surface information. Bayesian model averaging (BMA) is implemented for fusing multimodality non-destructive testing results for gas pipeline strength estimation, and several incremental improvements are proposed in the algorithm implementation. Second, the objective is to develop a statistical uncertainty quantification method for fatigue stress-life (S-N) curves with sparse data. Hierarchical Bayesian data augmentation (HBDA) is proposed to integrate hierarchical Bayesian modeling (HBM) and Bayesian data augmentation (BDA) to deal with sparse-data problems for fatigue S-N curves. The third objective is to develop a physics-guided machine learning model to overcome the limitations of parametric regression models and classical machine learning models for fatigue data analysis. A Probabilistic Physics-guided Neural Network (PPgNN) is proposed for probabilistic fatigue S-N curve estimation; this model is further developed for missing-data and arbitrary-output-distribution problems. Fourth, multi-fidelity modeling combines the advantages of low- and high-fidelity models to achieve a required accuracy at a reasonable computation cost.
The fourth objective is to develop a neural network approach for multi-fidelity modeling by learning the correlation between low- and high-fidelity models. Finally, conclusions are drawn, and future work is outlined based on the current study.
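The Bayesian model averaging step described above, fusing several candidate models by their posterior probabilities, can be sketched as follows. This uses the common BIC approximation to the model evidence on hypothetical fit statistics; it is an illustration of the general technique, not the dissertation's implementation.

```python
# Sketch (BIC-approximated weights, hypothetical data): Bayesian model
# averaging combines predictions from candidate models, each weighted by
# its approximate posterior model probability.
import numpy as np

def bma_weights(log_likelihoods, n_params, n_obs):
    """Posterior model weights from the BIC approximation to the evidence."""
    bic = -2 * np.asarray(log_likelihoods) + np.asarray(n_params) * np.log(n_obs)
    w = np.exp(-0.5 * (bic - bic.min()))   # relative evidence, numerically stable
    return w / w.sum()

# Three candidate strength models with toy fit statistics.
weights = bma_weights(log_likelihoods=[-100.0, -98.0, -105.0],
                      n_params=[2, 4, 3], n_obs=50)
preds = np.array([52.0, 55.0, 50.0])       # each model's strength prediction
print("weights:", weights)
print("BMA-fused prediction:", weights @ preds)
```

The fused prediction leans toward the models with stronger evidence while still propagating model uncertainty, which is the motivation for using BMA over picking a single best model.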
ContributorsChen, Jie (Author) / Liu, Yongming (Thesis advisor) / Chattopadhyay, Aditi (Committee member) / Mignolet, Marc (Committee member) / Ren, Yi (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)
Created2022
Description
For multiple reasons, the consumption of fresh fruits and vegetables in the United States has progressively increased, resulting in increased domestic production and importation of these products. The associated logistics is complex due to the perishability of these products, and most current logistics systems rely on marketing and supply chain practices that result in high levels of food waste and limited product diversity. For instance, given their lack of critical mass, small growers are conspicuously absent from mainstream distribution channels. One way to obtain this critical mass is through associative schemes such as co-ops; however, the success of traditional associative schemes has been mixed at best. This dissertation develops decision support tools to facilitate the formation of coalitions of small growers in complementary production regions that act as a single supplier. It demonstrates the benefits and efficiency that could be achieved by these coalitions, presents a methodology to efficiently distribute the value of a newly identified market opportunity among the growers participating in the coalition, and develops a negotiation framework between a buyer (or buyers) and the agent representing the coalition that results in a prototype contract.

There are four main areas of research contributions in this dissertation. The first is the development of optimization tools to allocate a market opportunity to potential production regions while considering consumer preferences for special denomination labels such as "local", "organic", etc. The second contribution is the development of a stochastic optimization and revenue-distribution framework for the formation of coalitions of growers to maximize the captured value of a market opportunity. The framework considers the growers' individual preferences and production characteristics (yields, resources, etc.) to develop supply contracts that entice their participation in the coalition.
The third area is the development of a negotiation mechanism to design contracts between buyers and groups of growers, considering profit expectations and the variability of future demand. The final contribution is the integration of these models and tools into a framework capable of transforming new market opportunities into implementable production plans and contractual agreements between the different supply chain participants.
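One classical way to split a coalition's captured value among members by marginal contribution is the Shapley value; the sketch below illustrates that general idea on a toy critical-mass scenario. The dissertation's actual revenue-distribution methodology may differ, and the grower names and contract value here are hypothetical.

```python
# Sketch (illustrative only): Shapley-value allocation gives each member
# its average marginal contribution over all orders of joining the coalition.
from itertools import permutations

def shapley(players, value):
    """Average marginal contribution of each player over all join orders."""
    shares = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            shares[p] += value(frozenset(coalition)) - before
    return {p: s / len(orders) for p, s in shares.items()}

# Toy example: two small growers individually win nothing, but together they
# reach the critical mass needed to capture a $90k contract.
def v(coalition):
    return 90.0 if coalition == frozenset({"grower1", "grower2"}) else 0.0

print(shapley(["grower1", "grower2"], v))  # → {'grower1': 45.0, 'grower2': 45.0}
```

Because neither grower can capture the opportunity alone, the value created jointly is split evenly, which is the kind of participation incentive a coalition's supply contracts need to encode.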
ContributorsUlloa, Rodrigo (Author) / Villalobos, Jesus (Thesis advisor) / Fowler, John (Committee member) / Mac Cawley, Alejandro (Committee member) / Yan, Hao (Committee member) / Phelan, Patrick (Committee member) / Arizona State University (Publisher)
Created2022