In vitro selection of aptamers and protein

Description

Since Darwin popularized the theory of evolution in 1859, it has been refined and studied through the years. Starting in the 1990s, evolution at the molecular level has been used to discover functional molecules, and to study the origin of functional molecules in nature, by mimicking the natural selection process in the laboratory. Along this line, my Ph.D. dissertation focuses on the in vitro selection of two important biomolecules with binding properties: deoxyribonucleic acid (DNA) and protein. Chapter two focuses on in vitro selection of DNA. Aptamers are single-stranded nucleic acids that are generated from a random pool and fold into stable three-dimensional structures with ligand binding sites that are complementary in shape and charge to a desired target. While aptamers have been selected to bind a wide range of targets, it is generally thought that these molecules are incapable of discriminating between strongly alkaline proteins due to the attractive forces that govern oppositely charged polymers. By employing a negative selection step to eliminate aptamers that bind off-target proteins nonselectively through charge, an aptamer that binds histone H4 protein with high specificity (>100-fold) was generated. Chapter four focuses on another functional molecule: protein. It has long been believed that complex molecules with different functions originated from simple progenitor proteins, but very little is known about this process. Starting from a previously selected protein that binds and catalyzes ATP, which is the first and only protein that was evolved completely from a random pool and has a unique α/β-fold protein scaffold, I fused a random library to the C-terminus of this protein and evolved a multi-domain protein with decent properties. Also, in chapter three, a unique bivalent molecule was generated by conjugating peptides that bind different sites on the protein with nucleic acids. By using the ligand interactions by nucleotide conjugates technique, off-the-shelf peptides were transformed into high-affinity protein capture reagents that mimic the recognition properties of natural antibodies. The designer synthetic antibody amplifies the binding affinity of the individual peptides by ∼1000-fold to bind Grb2 with a Kd of 2 nM, and functions with high selectivity in conventional pull-down assays from HeLa cell lysates.

Contributors: Jiang, Bing (Author) / Chaput, John C (Thesis advisor) / Chen, Julian (Committee member) / Liu, Yan (Committee member) / Arizona State University (Publisher)

Created: 2013

Structured sparse learning and its applications to biomedical and biological data

Description

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, and medical image processing. Through ℓ1-norm-based regularization, structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real-world applications that can benefit from the group-structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting a unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representations of the expression images. In the third application, I present a new computational approach to annotate the developmental stage of Drosophila embryos in gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores help to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes.

Contributors: Yuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)

Created: 2013

Development of an artificial genetic system capable of Darwinian evolution

Description

The principle of Darwinian evolution has been applied in the laboratory to nucleic acid molecules since 1990, leading to the emergence of the in vitro evolution technique. The methodology of in vitro evolution surveys a large number of different molecules simultaneously for a pre-defined chemical property and enriches for molecules with that property. DNA and RNA sequences with versatile functions have been identified by in vitro selection experiments, but many basic questions remain to be answered about how these molecules achieve their functions. This dissertation first addresses a fundamental question regarding the molecular recognition properties of in vitro selected DNA sequences, namely whether negatively charged DNA sequences can be evolved to bind alkaline proteins with high specificity. We showed that DNA binders could be made, through carefully designed stringent in vitro selection, to discriminate between different alkaline proteins. The focus of this dissertation then shifts to in vitro evolution of an artificial genetic polymer called threose nucleic acid (TNA). TNA has been considered a potential RNA progenitor during the early evolution of life on Earth. However, further experimental evidence to support TNA as a primordial genetic material is lacking. In this dissertation we demonstrated the capacity of TNA to form stable tertiary structures with specific ligand-binding properties, which suggests a possible role for TNA as a pre-RNA genetic polymer. Additionally, we discussed the challenges in in vitro evolution of TNA enzymes and developed the necessary methodology for future TNA enzyme evolution.

Contributors: Yu, Hanyang (Author) / Chaput, John C (Thesis advisor) / Chen, Julian (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)

Created: 2013

Directed evolution of gp120 binding mutants of the lectin Cyanovirin-N

Description

Cyanovirin-N (CV-N) is a naturally occurring lectin originally isolated from the cyanobacterium Nostoc ellipsosporum. This 11 kDa lectin is 101 amino acids long with two binding sites, one at each end of the protein. CV-N specifically binds to terminal Manα1-2Manα motifs on the branched, high mannose Man9 and Man8 glycosylations found on enveloped viruses including Ebola, Influenza, and HIV. wt-CVN binds soluble Manα1-2Manα with micromolar affinity and inhibits HIV entry at low nanomolar concentrations. CV-N's high affinity and specificity for Manα1-2Manα make it an excellent lectin to study for its glycan-specific properties. The long-term aim of this project is to make a variety of mutant CV-Ns that specifically bind other glycan targets. Such a set of lectins may be used as screening reagents to identify biomarkers and other glycan motifs of interest. As proof of concept, a T7 phage display library was constructed using P51G-m4-CVN genes mutated at positions 41, 44, 52, 53, 56, 74, and 76 in binding Domain B. Five CV-N mutants were selected from the library and expressed in BL21(DE3) E. coli. Two of the mutants, SSDGLQQ-P51Gm4-CVN and AAGRLSK-P51Gm4-CVN, were sufficiently stable for characterization and were examined by CD, Tm, ELISA, and glycan array. Both proteins have CD minima at approximately 213 nm, indicating largely β-sheet structure, and have Tm values greater than 40°C. ELISAs against gp120 and RNase B demonstrate both proteins' ability to bind high mannose glycans. To determine the binding specificity of each protein more precisely, AAGRLSK-P51Gm4-CVN, SSDGLQQ-P51Gm4-CVN, wt-CVN, and P51G-m4-CVN were sent to the Consortium for Functional Glycomics (CFG) for glycan array analysis. AAGRLSK-P51Gm4-CVN, wt-CVN, and P51G-m4-CVN have identical specificities for high mannose glycans containing terminal Manα1-2Manα. SSDGLQQ-P51Gm4-CVN binds to terminal GlcNAcα1-4Gal motifs and a subgroup of the high mannose glycans bound by P51G-m4-CVN. SSDGLQQ-wt-CVN was produced to restore anti-HIV activity and has a high nanomolar EC50 value compared to wt-CVN's low nanomolar activity. Overall, these experiments show that CV-N Domain B can be mutated and either retain specificity identical to wt-CVN or acquire new glycan specificities. This first-generation information can be used to produce glycan-specific lectins for a variety of applications.

Contributors: Ruben, Melissa (Author) / Ghirlanda, Giovanna (Thesis advisor) / Allen, James (Committee member) / Wachter, Rebekka (Committee member) / Arizona State University (Publisher)

Created: 2013

Batch mode active learning for multimedia pattern recognition

Description

The rapid advance of technology and the widespread emergence of modern technological equipment have resulted in the generation of enormous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real-world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real-world multimedia pattern recognition applications. Four major contributions are proposed in this work: (i) a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question; (ii) a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions; (iii) batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality; and (iv) an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real-world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.

Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2013

Modification of electron transfer proteins in the Chlamydomonas reinhardtii chloroplast for alternative fuel development

Description

There is a critical need for the development of clean and efficient energy sources. Hydrogen is being explored as a viable alternative to fuels in current use, many of which have limited availability and detrimental byproducts. Biological photo-production of H2 could provide a potential energy source directly manufactured from water and sunlight. As a part of the photosynthetic electron transport chain (PETC) of the green alga Chlamydomonas reinhardtii, water is split via Photosystem II (PSII) and the electrons flow through a series of electron transfer cofactors in cytochrome b6f, plastocyanin and Photosystem I (PSI). The terminal electron acceptor of PSI is ferredoxin, from which electrons may be used to reduce NADP+ for metabolic purposes. Concomitant production of an H+ gradient allows production of energy for the cell. Under certain conditions and using the endogenous hydrogenase, excess protons and electrons from ferredoxin may be converted to molecular hydrogen. In this work it is demonstrated both that certain mutations near the quinone electron transfer cofactor in PSI can speed up electron transfer through the PETC, and that a native [FeFe]-hydrogenase can be expressed in the C. reinhardtii chloroplast. Taken together, these research findings form the foundation for the design of a PSI-hydrogenase fusion for the direct and continuous photo-production of hydrogen in vivo.

Contributors: Reifschneider, Kiera (Author) / Redding, Kevin (Thesis advisor) / Fromme, Petra (Committee member) / Jones, Anne (Committee member) / Arizona State University (Publisher)

Created: 2013

Study of ribosomes having modifications in the peptidyltransferase center using non-alpha-L-amino acids and synthesis and biological evaluation of topopyrones

Description

The ribosome is a ribozyme and central to the biosynthesis of proteins in all organisms. It has a strong bias against non-alpha-L-amino acids, such as alpha-D-amino acids and beta-amino acids. Additionally, the ribosome is only able to incorporate one amino acid in response to one codon. It has been demonstrated that reengineering of the peptidyltransferase center (PTC) of the ribosome enabled the incorporation of both alpha-D-amino acids and beta-amino acids into full-length protein. Described in Chapter 2 are five modified ribosomes having modifications in the peptidyltransferase center in the 23S rRNA. These modified ribosomes successfully incorporated five different beta-amino acids (2.1 - 2.5) into E. coli dihydrofolate reductase (DHFR). The second project (Chapter 3) focused on the study of the modified ribosomes facilitating the incorporation of the dipeptide glycylphenylalanine (3.25) and fluorescent dipeptidomimetic 3.26 into DHFR. These ribosomes also had modifications in the peptidyltransferase center in the 23S rRNA of the 50S ribosomal subunit. The modified DHFRs having beta-amino acids 2.3 and 2.5, dipeptide glycylphenylalanine (3.25) and dipeptidomimetic 3.26 were successfully characterized by MALDI-MS analysis of the peptide fragments produced by "in-gel" trypsin digestion of the modified proteins. The fluorescence spectra of the dipeptidomimetic 3.26 and of modified DHFR having fluorescent dipeptidomimetic 3.26 were also measured. The type I and II DNA topoisomerases have been firmly established as effective molecular targets for many antitumor drugs. A "classical" topoisomerase I or II poison acts by misaligning the free hydroxyl group of the sugar moiety of DNA and preventing the reverse transesterification reaction to religate DNA. There have been only two classes of compounds, saintopin and topopyrones, reported as dual topoisomerase I and II poisons. Chapter 4 describes the synthesis and biological evaluation of topopyrones. Compound 4.10, employed at 20 µM, was as efficient as 0.5 µM camptothecin, a potent topoisomerase I poison, in stabilizing the covalent binary complex (~30%). When compared with a known topoisomerase II poison, etoposide (at 0.5 µM), topopyrone 4.10 produced similar levels of stabilized DNA-enzyme binary complex (~34%) at 5 µM concentration.

Contributors: Maini, Rumit (Author) / Hecht, Sidney M. (Thesis advisor) / Gould, Ian (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)

Created: 2013

Exploring the impact of varying levels of augmented reality to teach probability and sampling with a mobile device

Description

Statistics is taught at every level of education, yet teachers often have to assume their students have no knowledge of statistics and start from scratch each time they set out to teach statistics. The motivation for this experimental study comes from interest in exploring educational applications of augmented reality (AR) delivered via mobile technology that could potentially provide rich, contextualized learning for understanding concepts related to statistics education. This study examined the effects of AR experiences for learning basic statistical concepts. Using a 3 x 2 research design, this study compared learning gains of 252 undergraduate and graduate students from a pre- and posttest given before and after interacting with one of three types of augmented reality experiences, a high AR experience (interacting with three dimensional images coupled with movement through a physical space), a low AR experience (interacting with three dimensional images without movement), or no AR experience (two dimensional images without movement). Two levels of collaboration (pairs and no pairs) were also included. Additionally, student perceptions toward collaboration opportunities and engagement were compared across the six treatment conditions. Other demographic information collected included the students' previous statistics experience, as well as their comfort level in using mobile devices. The moderating variables included prior knowledge (high, average, and low) as measured by the student's pretest score. Taking into account prior knowledge, students with low prior knowledge assigned to either high or low AR experience had statistically significant higher learning gains than those assigned to a no AR experience. On the other hand, the results showed no statistical significance between students assigned to work individually versus in pairs. 
Students assigned to both high and low AR experiences perceived a statistically significantly higher level of engagement than their no AR counterparts. Students with low prior knowledge benefited the most from the high AR condition in learning gains. Overall, the AR application was effective at providing a hands-on experience working with statistical data. Further research on AR and its relationship to spatial cognition, situated learning, higher-order skill development, performance support, and other classroom applications for learning is still needed.

Contributors: Conley, Quincy (Author) / Atkinson, Robert K (Thesis advisor) / Nguyen, Frank (Committee member) / Nelson, Brian C (Committee member) / Arizona State University (Publisher)

Created: 2013

Novel strategies for producing proteins with non-proteinogenic amino acids

Description

The biological and chemical diversity of protein structure and function can be greatly expanded by position-specific incorporation of non-natural amino acids bearing a variety of functional groups. Non-cognate amino acids can be incorporated into proteins at specific sites by using orthogonal aminoacyl-tRNA synthetase/tRNA pairs in conjunction with nonsense, rare, or 4-bp codons. There has been considerable progress in developing new types of amino acids, in identifying novel methods of tRNA aminoacylation, and in expanding the genetic code to direct their position. Chemical aminoacylation of tRNAs is accomplished by acylation and ligation of a dinucleotide (pdCpA) to the 3'-terminus of truncated tRNA. This strategy allows the incorporation of a wide range of natural and unnatural amino acids into pre-determined sites, thereby facilitating the study of structure-function relationships in proteins and allowing the investigation of their biological, biochemical and biophysical properties. Described in Chapter 1 is the current methodology for synthesizing aminoacylated suppressor tRNAs. Aminoacylated suppressor tRNACUAs are typically prepared by linking pre-aminoacylated dinucleotides (aminoacyl-pdCpAs) to 74 nucleotide (nt) truncated tRNAs (tRNA-COH) via a T4 RNA ligase mediated reaction. Alternatively, there is another route outlined in Chapter 1 that utilizes a different pre-aminoacylated dinucleotide, AppA. This dinucleotide has been shown to be a suitable substrate for T4 RNA ligase mediated coupling with abbreviated tRNA-COHs for production of 76 nt aminoacyl-tRNACUAs. The synthesized suppressor tRNAs have been shown to participate in protein synthesis in vitro, in an S30 (E. coli) coupled transcription-translation system in which there is a UAG codon in the mRNA at the position corresponding to Val10. 
Chapter 2 describes the synthesis of two non-proteinogenic amino acids, L-thiothreonine and L-allo-thiothreonine, and their incorporation into predetermined positions of a catalytically competent dihydrofolate reductase (DHFR) analogue lacking cysteine. Here, the elaborated proteins were site-specifically derivatized with a fluorophore at the thiothreonine residue. The synthesis and incorporation of phosphorotyrosine derivatives into DHFR are illustrated in Chapter 3. Three different phosphorylated tyrosine derivatives were prepared: bis-nitrobenzylphosphoro-L-tyrosine, nitrobenzylphosphoro-L-tyrosine, and phosphoro-L-tyrosine. Their ability to participate in a protein synthesis system was also evaluated.

Contributors: Nangreave, Ryan Christopher (Author) / Hecht, Sidney M. (Thesis advisor) / Yan, Hao (Committee member) / Gould, Ian (Committee member) / Arizona State University (Publisher)

Created: 2013

Advancing biomedical named entity recognition with multivariate feature selection and semantically motivated features

Description

Automating aspects of biocuration through biomedical information extraction could significantly impact biomedical research by enabling greater biocuration throughput and improving the feasibility of a wider scope. An important step in biomedical information extraction systems is named entity recognition (NER), where mentions of entities such as proteins and diseases are located within natural-language text and their semantic type is determined. This step is critical for later tasks in an information extraction pipeline, including normalization and relationship extraction. BANNER is a benchmark biomedical NER system using linear-chain conditional random fields and the rich feature set approach. A case study with BANNER locating genes and proteins in biomedical literature is described. The first corpus for disease NER adequate for use as training data is introduced, and employed in a case study of disease NER. The first corpus locating adverse drug reactions (ADRs) in user posts to a health-related social website is also described, and a system to locate and identify ADRs in social media text is created and evaluated. The rich feature set approach to creating NER feature sets is argued to be subject to diminishing returns, implying that additional improvements may require more sophisticated methods for creating the feature set. This motivates the first application of multivariate feature selection with filters and false discovery rate analysis to biomedical NER, resulting in a feature set at least 3 orders of magnitude smaller than the set created by the rich feature set approach. Finally, two novel approaches to NER by modeling the semantics of token sequences are introduced. The first method focuses on the sequence content by using language models to determine whether a sequence resembles entries in a lexicon of entity names or text from an unlabeled corpus more closely. 
The second method models the distributional semantics of token sequences, determining the similarity between a potential mention and the token sequences from the training data by analyzing the contexts where each sequence appears in a large unlabeled corpus. The second method is shown to improve the performance of BANNER on multiple data sets.

Contributors: Leaman, James Robert (Author) / Gonzalez, Graciela (Thesis advisor) / Baral, Chitta (Thesis advisor) / Cohen, Kevin B (Committee member) / Liu, Huan (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2013