Directed evolution of gp120 binding mutants of the lectin Cyanovirin-N

Description

Cyanovirin-N (CV-N) is a naturally occurring lectin originally isolated from the cyanobacterium Nostoc ellipsosporum. This 11 kDa lectin is 101 amino acids long with two binding sites, one at each end of the protein. CV-N specifically binds to terminal Manα1-2Manα motifs on the branched, high mannose Man9 and Man8 glycosylations found on enveloped viruses including Ebola, Influenza, and HIV. wt-CVN has micromolar binding to soluble Manα1-2Manα and also inhibits HIV entry at low nanomolar concentrations. CV-N's high affinity and specificity for Manα1-2Manα make it an excellent lectin to study for its glycan-specific properties. The long-term aim of this project is to make a variety of mutant CV-Ns that specifically bind other glycan targets. Such a set of lectins may be used as screening reagents to identify biomarkers and other glycan motifs of interest. As proof of concept, a T7 phage display library was constructed using P51G-m4-CVN genes mutated at positions 41, 44, 52, 53, 56, 74, and 76 in binding Domain B. Five CV-N mutants were selected from the library and expressed in BL21(DE3) E. coli. Two of the mutants, SSDGLQQ-P51Gm4-CVN and AAGRLSK-P51Gm4-CVN, were sufficiently stable for characterization and were examined by CD, Tm, ELISA, and glycan array. Both proteins have CD minima at approximately 213 nm, indicating largely β-sheet structure, and have Tm values greater than 40°C. ELISAs against gp120 and RNase B demonstrate both proteins' ability to bind high mannose glycans. To determine the binding specificity of each protein more precisely, AAGRLSK-P51Gm4-CVN, SSDGLQQ-P51Gm4-CVN, wt-CVN, and P51G-m4-CVN were sent to the Consortium for Functional Glycomics (CFG) for glycan array analysis. AAGRLSK-P51Gm4-CVN, wt-CVN, and P51G-m4-CVN have identical specificities for high mannose glycans containing terminal Manα1-2Manα. SSDGLQQ-P51Gm4-CVN binds to terminal GlcNAcα1-4Gal motifs and a subgroup of high mannose glycans bound by P51G-m4-CVN. SSDGLQQ-wt-CVN was produced to restore anti-HIV activity and has a high nanomolar EC50 value compared to wt-CVN's low nanomolar activity. Overall, these experiments show that CV-N Domain B can be mutated and either retain specificity identical to wt-CVN or acquire new glycan specificities. This first-generation information can be used to produce glycan-specific lectins for a variety of applications.

Contributors: Ruben, Melissa (Author) / Ghirlanda, Giovanna (Thesis advisor) / Allen, James (Committee member) / Wachter, Rebekka (Committee member) / Arizona State University (Publisher)

Created: 2013

Optimal experimental design for accelerated life testing and design evaluation

Description

Product reliability has become a top concern of manufacturers, and customers prefer products that perform well over long periods. Because most products can last for years or even decades, accelerated life testing (ALT) is used to estimate product lifetime. Much research has been done in the ALT area, and optimal design for ALT is a major topic. This dissertation consists of three main studies. First, a methodology for finding optimal designs for ALT with right censoring and interval censoring is developed; it employs the proportional hazard (PH) model and the generalized linear model (GLM) to simplify the computational process. A sensitivity study is also given to show the effects of the parameters on the designs. Second, an extended version of I-optimal design for ALT is discussed, and a dual-objective design criterion is then defined and illustrated with several examples. In addition, several graphical tools are developed to evaluate different candidate designs. Finally, for cases in which more than one model is available, different model checking designs are discussed.

Contributors: Yang, Tao (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Borror, Connie (Committee member) / Rigdon, Steve (Committee member) / Arizona State University (Publisher)

Created: 2013

Spatio-temporal data mining to detect changes and clusters in trajectories

Description

With the rapid development of mobile sensing technologies such as GPS, RFID, and smartphone sensors, capturing position data in the form of trajectories has become easy. Moving object trajectory analysis is a growing area of interest owing to its applications in domains such as marketing, security, and traffic monitoring and management. To better understand movement behaviors from raw mobility data, this doctoral work provides analytic models for analyzing trajectory data. As a first contribution, a model is developed to detect changes in trajectories over time. If the taxis moving in a city are viewed as sensors that provide real-time information about the city's traffic, a change in their trajectories over time can reveal that the road network has changed. To detect changes, trajectories are modeled with a Hidden Markov Model (HMM). A modified training algorithm for HMM parameter estimation, called m-BaumWelch, is used to develop likelihood estimates under assumed changes, and these estimates are used to detect changes in trajectory data over time. Data from vehicles are used to test the change detection method. Second, sequential pattern mining is used to develop a model to detect changes in frequent patterns occurring in trajectory data. The aim is to answer two questions: Are the frequent patterns still frequent in the new data? If they are frequent, has the time interval distribution in the pattern changed? Two approaches to change detection are considered: a frequency-based approach and a distribution-based approach. The methods are illustrated with vehicle trajectory data. Finally, a model is developed for clustering and outlier detection in semantic trajectories. A challenge in clustering semantic trajectories is that both numeric and categorical attributes are present. Another problem is that trajectories can be of different lengths and can have missing values. A tree-based ensemble is used to address these problems. The approach is extended to outlier detection in semantic trajectories.

Contributors: Kondaveeti, Anirudh (Author) / Runger, George C. (Thesis advisor) / Mirchandani, Pitu (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created: 2012

Characterization of SMN and gemin2: insights into spinal muscular atrophy

Description

Spinal muscular atrophy (SMA) is a neurodegenerative disease that results in the loss of lower body muscle function. SMA is the second leading genetic cause of death in infants and arises from the loss of the Survival of Motor Neuron (SMN) protein. SMN is produced by two genes, smn1 and smn2, that are identical with the exception of a C to T conversion in exon 7 of the smn2 gene. SMA patients lacking the smn1 gene rely on smn2 for production of SMN. Due to an alternative splicing event, smn2 primarily encodes a non-functional SMN lacking exon 7 (SMN D7) as well as a low amount of functional full-length SMN (SMN WT). SMN WT is ubiquitously expressed in all cell types, and it remains unclear how low levels of SMN WT in motor neurons lead to motor neuron degradation and SMA. SMN and its associated proteins, Gemin2-8 and Unrip, make up a large dynamic complex that functions to assemble ribonucleoproteins. The aim of this project was to characterize the interactions of the core SMN-Gemin2 complex and to identify differences between SMN WT and SMN D7. SMN and Gemin2 proteins were expressed, purified, and characterized via size exclusion chromatography. A stable N-terminally deleted Gemin2 protein (N45-G2) was characterized. The SMN WT expression system was optimized, resulting in a 10-fold increase in protein expression. Lastly, the oligomeric states of SMN and of SMN bound to Gemin2 were determined. SMN WT formed a mixture of oligomeric states, while SMN D7 did not. Both SMN WT and SMN D7 bound to Gemin2 in a one-to-one ratio, forming a heterodimer and several higher-order oligomeric states. The SMN WT-Gemin2 complex favored high molecular weight oligomers, whereas the SMN D7-Gemin2 complex formed low molecular weight oligomers. These results indicate that the SMA mutant protein, SMN D7, was still able to associate with Gemin2 but was not able to form higher-order oligomeric complexes. The observed multiple oligomerization states of SMN and of SMN bound to Gemin2 may play a crucial role in regulating one or several functions of the SMN protein. The inability of SMN D7 to form higher-order oligomers may inhibit or alter those functions, leading to the SMA disease phenotype.

Contributors: Niday, Tracy (Author) / Allen, James P. (Thesis advisor) / Wachter, Rebekka (Committee member) / Ghirlanda, Giovanna (Committee member) / Arizona State University (Publisher)

Created: 2012

Learning from asymmetric models and matched pairs

Description

With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within the data. Thus, knowledge discovery by machine learning techniques is necessary if we want to better understand information from data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these topics. We also study variable selection for matched data sets and propose a solution for cases in which the matched data are non-linear. The research is divided into three parts. The first part addresses the problem of asymmetric loss. A proposed asymmetric support vector machine (aSVM) is used to predict specific classes with high accuracy. aSVM was shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets where variables are only predictive for a subset of the predictor classes. Asymmetric Random Forest (ARF) was proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. Matched Random Forest (MRF) was proposed to find variables that are able to distinguish case and control without the restrictions that exist in linear models. MRF detects variables that are able to distinguish case and control even in the presence of interaction and qualitative variables.

Contributors: Koh, Derek (Author) / Runger, George C. (Thesis advisor) / Wu, Tong (Committee member) / Pan, Rong (Committee member) / Cesta, John (Committee member) / Arizona State University (Publisher)

Created: 2013

Protein folding & dynamics using multi-scale computational methods

Description

This thesis explores a wide array of topics related to the protein folding problem, ranging from the folding mechanism, ab initio structure prediction, and protein design to the mechanism of protein functional evolution, using multi-scale approaches. To investigate the role of native topology in the folding mechanism, the native topology is dissected into non-local and local contacts. The number of non-local contacts and the non-local contact order are both negatively correlated with folding rates, suggesting that non-local contacts dominate the barrier-crossing process. However, the local contact order shows a positive correlation with folding rates, indicating the role of a diffusive search in the denatured basin. Additionally, the folding rate distributions of the E. coli and yeast proteomes are predicted from native topology. The distributions are fitted well by a diffusion-drift population model and are also directly compared with experimentally measured half-lives. The results indicate that proteome folding kinetics is limited by protein half-life. The crucial role of local contacts in protein folding is further explored by simulations of WW domains using the Zipping and Assembly Method. The correct formation of the N-terminal β-turn turns out to be important for the folding of WW domains. A classification model based on the contact probabilities of five critical local contacts is constructed to predict the foldability of WW domains with 81% accuracy. By introducing mutations to stabilize those critical local contacts, a new protein design approach is developed to re-design unfoldable WW domains and make them foldable. After folding, proteins exhibit inherent conformational dynamics to be functional. Using molecular dynamics simulations in conjunction with Perturbation Response Scanning, it is demonstrated that the divergence of functions can occur through the modification of conformational dynamics within an existing fold for β-lactamases and GFP-like proteins: i) the modern TEM-1 lactamase shows a comparatively rigid active-site region, likely reflecting adaptation for efficient degradation of a specific substrate, while the resurrected ancient lactamases indicate enhanced active-site flexibility, which likely allows for the binding and subsequent degradation of different antibiotic molecules; ii) the chromophore and attached peptides of the photoconversion-competent GFP-like protein exhibit higher flexibility than those of the photoconversion-incompetent one, consistent with the evolution of photoconversion capacity.
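
The abstract above separates native contacts into local and non-local sets and relates their contact orders to folding rates. As an illustration only, the sketch below computes the widely used relative contact order (the mean sequence separation of contacting residue pairs divided by chain length) and splits contacts by sequence separation. The cutoff of 12 residues and the function names are assumptions made for this example, not values or definitions taken from the thesis.

```python
def relative_contact_order(contacts, n_residues):
    """Mean sequence separation of contacts, normalized by chain length.

    contacts: list of (i, j) residue-index pairs that are in contact.
    n_residues: total number of residues in the chain.
    """
    if not contacts or n_residues <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)


def split_contacts(contacts, cutoff=12):
    """Split contacts into (local, non_local) by sequence separation.

    The cutoff is a hypothetical choice for illustration only.
    """
    local = [(i, j) for i, j in contacts if abs(i - j) <= cutoff]
    non_local = [(i, j) for i, j in contacts if abs(i - j) > cutoff]
    return local, non_local


if __name__ == "__main__":
    # Toy contact map for a 20-residue chain (made-up data).
    toy_contacts = [(1, 5), (2, 20), (3, 4)]
    local, non_local = split_contacts(toy_contacts)
    print("local:", local)
    print("non-local:", non_local)
    print("relative contact order:", relative_contact_order(toy_contacts, 20))
```

Computing the same quantity separately on the local and non-local lists gives one plausible way to obtain per-class contact orders, though the thesis may define them differently.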

Contributors: Zou, Taisong (Author) / Ozkan, Sefika B (Thesis advisor) / Thorpe, Michael F (Committee member) / Woodbury, Neal W (Committee member) / Vaiana, Sara M (Committee member) / Ghirlanda, Giovanna (Committee member) / Arizona State University (Publisher)

Created: 2014

Interactions driving the collapse of islet amyloid polypeptide: implications for amyloid aggregation

Description

Human islet amyloid polypeptide (hIAPP), also known as amylin, is a 37-residue intrinsically disordered hormone involved in glucose regulation and gastric emptying. The aggregation of hIAPP into amyloid fibrils is believed to play a causal role in type 2 diabetes. To date, not much is known about the monomeric state of hIAPP or how it undergoes an irreversible transformation from disordered peptide to insoluble aggregate. IAPP contains a highly conserved disulfide bond that restricts hIAPP(1-8) into a short ring-like structure: N_loop. Removal or chemical reduction of N_loop not only prevents cell response upon binding to the CGRP receptor, but also alters the mass per length distribution of hIAPP fibers and the kinetics of fibril formation. The mechanism by which N_loop affects hIAPP aggregation is not yet understood, but is important for rationalizing kinetics and developing potential inhibitors. By measuring end-to-end contact formation rates, Vaiana et al. showed that N_loop induces collapsed states in IAPP monomers, implying attractive interactions between N_loop and other regions of the disordered polypeptide chain. We show that in addition to being involved in intra-protein interactions, the N_loop is involved in inter-protein interactions, which lead to the formation of extremely long and stable β-turn fibers. These non-amyloid fibers are present in the 10 μM concentration range, under the same solution conditions in which hIAPP forms amyloid fibers. We discuss the effect of peptide cyclization on both intra- and inter-protein interactions, and its possible implications for aggregation. Our findings indicate a potential role of N_loop-N_loop interactions in hIAPP aggregation, which has not previously been explored. Though our findings suggest that N_loop plays an important role in the pathway of amyloid formation, other naturally occurring IAPP variants that contain this structural feature are incapable of forming amyloids. For example, hIAPP readily forms amyloid fibrils in vitro, whereas the rat variant (rIAPP), differing by six amino acids, does not. In addition to being highly soluble, rIAPP is an effective inhibitor of hIAPP fibril formation. Both of these properties have been attributed to rIAPP's three proline residues: A25P, S28P and S29P. Single proline mutants of hIAPP have also been shown to kinetically inhibit hIAPP fibril formation. Because of their intrinsic dihedral angle preferences, prolines are expected to affect conformational ensembles of intrinsically disordered proteins. The specific effect of proline substitutions on IAPP structure and dynamics has not yet been explored, as the detection of such properties is experimentally challenging due to the low molecular weight, fast reconfiguration times, and very low solubility of IAPP peptides. High-resolution techniques able to measure tertiary contact formation are needed to address this issue. We employ a nanosecond laser spectroscopy technique to measure end-to-end contact formation rates in IAPP mutants. We explore the proline substitutions in IAPP and quantify their effects in terms of intrinsic chain stiffness. We find that the three proline mutations found in rIAPP increase chain stiffness. Interestingly, we also find that residue R18 plays an important role in rIAPP's unique chain stiffness and, together with the proline residues, is a determinant of its non-amyloidogenic properties. We discuss the implications of our findings for the role of prolines in IDPs.

Contributors: Cope, Stephanie M (Author) / Vaiana, Sara M (Thesis advisor) / Ghirlanda, Giovanna (Committee member) / Ros, Robert (Committee member) / Lindsay, Stuart M (Committee member) / Ozkan, Sefika B (Committee member) / Arizona State University (Publisher)

Created: 2013

Exploring the regulation of the telomerase reaction cycle through unique protein, DNA, and RNA interactions

Description

Telomerase is a unique reverse transcriptase that has evolved specifically to extend the single-stranded DNA at the 3' ends of chromosomes. To achieve this, telomerase uses a small section of its integral RNA subunit (TR) to reiteratively copy a short, canonically 6-nt, sequence in a processive manner, using a complex and currently poorly understood mechanism of template translocation to stop nucleotide addition, regenerate its template, and then synthesize a new repeat. In this study, several novel interactions between the telomerase protein and RNA components and the DNA substrate are identified and characterized; together, these interactions allow active telomerase repeat addition. First, this study shows that the sequence of the RNA/DNA duplex holds a unique, single-nucleotide signal that pauses DNA synthesis at the end of the canonical template sequence. Further characterization of this sequence-dependent pause signal reveals that the template sequence alone can produce telomerase products with the characteristic 6-nt pattern, but also works cooperatively with another RNA structural element for proper template boundary definition. Finally, mutational analysis is used on several regions of the protein and RNA components of telomerase to identify crucial determinants of telomerase assembly and processive repeat synthesis. Together, these results shed new light on how telomerase coordinates its complex catalytic cycle.

Contributors: Brown, Andrew F (Author) / Chen, Julian J. L. (Thesis advisor) / Jones, Anne (Committee member) / Ghirlanda, Giovanna (Committee member) / Arizona State University (Publisher)

Created: 2014

Simulation-based Bayesian optimal accelerated life test design and model discrimination

Description

Accelerated life testing (ALT) is the process of subjecting a product to stress conditions (temperature, voltage, pressure, etc.) in excess of its normal operating levels to accelerate failures. Product failure typically results from multiple stresses acting on it simultaneously. Multi-stress factor ALTs are challenging because the number of stress factor-level combinations, and therefore the number of experiments, grows with the number of factors. Chapter 2 provides an approach for designing ALT plans with multiple stresses using Latin hypercube designs, which reduces the simulation cost without loss of statistical efficiency. A comparison to full grid and large-sample approximation methods illustrates the approach's computational cost savings and its flexibility in determining optimal stress settings with fewer assumptions and more intuitive unit allocations.

Implicit in the design criteria of current ALT designs is the assumption that the form of the acceleration model is correct. This is an unrealistic assumption in many real-world problems. Chapter 3 provides an approach to optimal ALT design for model discrimination. We utilize the Hellinger distance measure between predictive distributions. The optimal ALT plan at three stress levels was determined, and its performance was compared to a good compromise plan, the best traditional plan, and the well-known 4:2:1 compromise test plan. In the case of linear versus quadratic ALT models, the proposed method increased the test plan's ability to distinguish among competing models and provided better guidance as to which model is appropriate for the experiment.
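
The Hellinger distance named above has a simple closed form for discrete distributions: H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which ranges from 0 (identical) to 1 (disjoint support). The sketch below is a minimal illustration of that formula on made-up probability vectors; it is not the thesis's simulation-based procedure for predictive distributions, and the function name and example numbers are assumptions.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    p, q: sequences of non-negative probabilities over the same outcomes,
    each assumed to sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    squared_diff = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_diff / 2.0)


if __name__ == "__main__":
    # Hypothetical binned predictive distributions under two competing models.
    model_a = [0.10, 0.20, 0.40, 0.30]
    model_b = [0.25, 0.25, 0.25, 0.25]
    print("Hellinger distance:", hellinger_distance(model_a, model_b))
```

A larger distance between the predictive distributions of two candidate models means a test plan separates them more clearly, which is the intuition behind using it as a discrimination criterion.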

Chapter 4 extends the approach of Chapter 3 to sequential ALT model discrimination. An initial experiment is conducted to provide the maximum possible information with respect to model discrimination. The follow-on experiment is planned by leveraging the most current information to allow for Bayesian model comparison through posterior model probability ratios. Results showed that the performance of the plan is adversely impacted by the amount of censoring in the data. In the case of linear vs. quadratic model forms at three levels of constant stress, sequential testing can improve the model recovery rate by approximately 8% when the data are complete, but no apparent advantage in adopting sequential testing was found for right-censored data when censoring exceeds a certain amount.

Contributors: Nasir, Ehab (Author) / Pan, Rong (Thesis advisor) / Runger, George C. (Committee member) / Gel, Esma (Committee member) / Kao, Ming-Hung (Committee member) / Montgomery, Douglas C. (Committee member) / Arizona State University (Publisher)

Created: 2014

Protein post translational modifications in human diseases: bacterial glycosylation profiling by peptide microarray protein phosphorylation analysis in high risk neuroblastoma

Description

ABSTRACT

Post Translational Modifications (PTMs) are a series of chemical modifications with the capacity to expand the structural and functional repertoire of proteins. PTMs can regulate protein-protein interaction, localization, protein turn-over, the active state of the protein, and much more. This can dramatically affect cell processes as relevant as gene expression, cell-cell recognition, and cell signaling. Along these lines, this Ph.D. thesis examines the role of two of the most important PTMs: glycosylation and phosphorylation.

In chapters 2, 3 and 4, a 10,000 peptide microarray is used to analyze the glycan variations in a series of lipopolysaccharides (LPS) from Gram negative bacteria. This research was the first to demonstrate that, from a set of random-sequence peptides, it is possible to identify a small subset with the capacity to bind to the LPS of bacteria. These peptides bound to LPS not only on the solid surface of the array but also in solution, as demonstrated with surface plasmon resonance (SPR), isothermal titration calorimetry (ITC) and flow cytometry. Interestingly, some of the LPS binding peptides also exhibit antimicrobial activity, a property that is also analyzed in this work.

In chapters 5 and 6, the role of protein phosphorylation, another PTM, is analyzed in the context of human cancer. High risk neuroblastoma, a very aggressive pediatric cancer, was studied with emphasis on the phosphorylation of two selected oncoproteins: the transcription factor NMYC and the adaptor protein ShcC. Both proteins were isolated from high risk neuroblastoma cells, and a targeted-directed tandem mass spectrometry (LC-MS/MS) methodology was used to identify the phosphorylation sites in each protein. This method dramatically improved phosphorylation site detection and increased the number of sites detected by up to 250% in comparison with previous studies. Several of the newly identified sites are located in functional domains of the proteins, and some of them are homologous to known active sites in other proteins of the same family. The chapter concludes with a computational prediction of the kinases that potentially phosphorylate those sites and a series of assays showing that this phosphorylation occurred in vitro.

Contributors: Morales Betanzos, Carlos (Author) / LaBaer, Joshua (Thesis advisor) / Allen, James (Committee member) / Ghirlanda, Giovanna (Committee member) / Arizona State University (Publisher)

Created: 2014

ASU Electronic Theses and Dissertations
