Search Content

Targeted proteomics studies: design, development and translation of mass spectrometric immunoassays for diabetes and kidney disease

Description

In an effort to begin validating the large number of discovered candidate biomarkers, proteomics is beginning to shift from shotgun proteomic experiments towards targeted proteomic approaches that provide solutions to automation and economic concerns. Such approaches to validate biomarkers necessitate the mass spectrometric analysis of hundreds to thousands of human…

In an effort to begin validating the large number of discovered candidate biomarkers, proteomics is beginning to shift from shotgun proteomic experiments towards targeted proteomic approaches that provide solutions to automation and economic concerns. Such approaches to validate biomarkers necessitate the mass spectrometric analysis of hundreds to thousands of human samples. As this takes place, a serendipitous opportunity has become evident. By the virtue that as one narrows the focus towards "single" protein targets (instead of entire proteomes) using pan-antibody-based enrichment techniques, a discovery science has emerged, so to speak. This is due to the largely unknown context in which "single" proteins exist in blood (i.e. polymorphisms, transcript variants, and posttranslational modifications) and hence, targeted proteomics has applications for established biomarkers. Furthermore, besides protein heterogeneity accounting for interferences with conventional immunometric platforms, it is becoming evident that this formerly hidden dimension of structural information also contains rich-pathobiological information. Consequently, targeted proteomics studies that aim to ascertain a protein's genuine presentation within disease- stratified populations and serve as a stepping-stone within a biomarker translational pipeline are of clinical interest. Roughly 128 million Americans are pre-diabetic, diabetic, and/or have kidney disease and public and private spending for treating these diseases is in the hundreds of billions of dollars. In an effort to create new solutions for the early detection and management of these conditions, described herein is the design, development, and translation of mass spectrometric immunoassays targeted towards diabetes and kidney disease. Population proteomics experiments were performed for the following clinically relevant proteins: insulin, C-peptide, RANTES, and parathyroid hormone. At least thirty-eight protein isoforms were detected. Besides the numerous disease correlations confronted within the disease-stratified cohorts, certain isoforms also appeared to be causally related to the underlying pathophysiology and/or have therapeutic implications. Technical advancements include multiplexed isoform quantification as well a "dual- extraction" methodology for eliminating non-specific proteins while simultaneously validating isoforms. Industrial efforts towards widespread clinical adoption are also described. Consequently, this work lays a foundation for the translation of mass spectrometric immunoassays into the clinical arena and simultaneously presents the most recent advancements concerning the mass spectrometric immunoassay approach.

ContributorsOran, Paul (Author) / Nelson, Randall (Thesis advisor) / Hayes, Mark (Thesis advisor) / Ros, Alexandra (Committee member) / Williams, Peter (Committee member) / Arizona State University (Publisher)

Created2011

Statistical signal processing of ESI-TOF-MS for biomarker discovery

Description

Signal processing techniques have been used extensively in many engineering problems and in recent years its application has extended to non-traditional research fields such as biological systems. Many of these applications require extraction of a signal or parameter of interest from degraded measurements. One such application is mass spectrometry immunoassay…

Signal processing techniques have been used extensively in many engineering problems and in recent years its application has extended to non-traditional research fields such as biological systems. Many of these applications require extraction of a signal or parameter of interest from degraded measurements. One such application is mass spectrometry immunoassay (MSIA) which has been one of the primary methods of biomarker discovery techniques. MSIA analyzes protein molecules as potential biomarkers using time of flight mass spectrometry (TOF-MS). Peak detection in TOF-MS is important for biomarker analysis and many other MS related application. Though many peak detection algorithms exist, most of them are based on heuristics models. One of the ways of detecting signal peaks is by deploying stochastic models of the signal and noise observations. Likelihood ratio test (LRT) detector, based on the Neyman-Pearson (NP) lemma, is an uniformly most powerful test to decision making in the form of a hypothesis test. The primary goal of this dissertation is to develop signal and noise models for the electrospray ionization (ESI) TOF-MS data. A new method is proposed for developing the signal model by employing first principles calculations based on device physics and molecular properties. The noise model is developed by analyzing MS data from careful experiments in the ESI mass spectrometer. A non-flat baseline in MS data is common. The reasons behind the formation of this baseline has not been fully comprehended. A new signal model explaining the presence of baseline is proposed, though detailed experiments are needed to further substantiate the model assumptions. Signal detection schemes based on these signal and noise models are proposed. A maximum likelihood (ML) method is introduced for estimating the signal peak amplitudes. The performance of the detection methods and ML estimation are evaluated with Monte Carlo simulation which shows promising results. An application of these methods is proposed for fractional abundance calculation for biomarker analysis, which is mathematically robust and fundamentally different than the current algorithms. Biomarker panels for type 2 diabetes and cardiovascular disease are analyzed using existing MS analysis algorithms. Finally, a support vector machine based multi-classification algorithm is developed for evaluating the biomarkers' effectiveness in discriminating type 2 diabetes and cardiovascular diseases and is shown to perform better than a linear discriminant analysis based classifier.

ContributorsBuddi, Sai (Author) / Taylor, Thomas (Thesis advisor) / Cochran, Douglas (Thesis advisor) / Nelson, Randall (Committee member) / Duman, Tolga (Committee member) / Arizona State University (Publisher)

Created2012

Structural characterization of potential cancer biomarker proteins

Description

Cancer claims hundreds of thousands of lives every year in US alone. Finding ways for early detection of cancer onset is crucial for better management and treatment of cancer. Thus, biomarkers especially protein biomarkers, being the functional units which reflect dynamic physiological changes, need to be discovered. Though important, there…

Cancer claims hundreds of thousands of lives every year in US alone. Finding ways for early detection of cancer onset is crucial for better management and treatment of cancer. Thus, biomarkers especially protein biomarkers, being the functional units which reflect dynamic physiological changes, need to be discovered. Though important, there are only a few approved protein cancer biomarkers till date. To accelerate this process, fast, comprehensive and affordable assays are required which can be applied to large population studies. For this, these assays should be able to comprehensively characterize and explore the molecular diversity of nominally "single" proteins across populations. This information is usually unavailable with commonly used immunoassays such as ELISA (enzyme linked immunosorbent assay) which either ignore protein microheterogeneity, or are confounded by it. To this end, mass spectrometric immuno assays (MSIA) for three different human plasma proteins have been developed. These proteins viz. IGF-1, hemopexin and tetranectin have been found in reported literature to show correlations with many diseases along with several carcinomas. Developed assays were used to extract entire proteins from plasma samples and subsequently analyzed on mass spectrometric platforms. Matrix assisted laser desorption ionization (MALDI) and electrospray ionization (ESI) mass spectrometric techniques where used due to their availability and suitability for the analysis. This resulted in visibility of different structural forms of these proteins showing their structural micro-heterogeneity which is invisible to commonly used immunoassays. These assays are fast, comprehensive and can be applied in large sample studies to analyze proteins for biomarker discovery.

ContributorsRai, Samita (Author) / Nelson, Randall (Thesis advisor) / Hayes, Mark (Thesis advisor) / Borges, Chad (Committee member) / Ros, Alexandra (Committee member) / Arizona State University (Publisher)

Created2012

Differential Gene Expression in Type II Diabetes

Description

This research project investigated known and novel differential genetic variants and their associated molecular pathways involved in Type II diabetes mellitus for the purpose of improving diagnosis and treatment methods. The goal of this investigation was to 1) identify the genetic variants and SNPs in Type II diabetes to develo…

This research project investigated known and novel differential genetic variants and their associated molecular pathways involved in Type II diabetes mellitus for the purpose of improving diagnosis and treatment methods. The goal of this investigation was to 1) identify the genetic variants and SNPs in Type II diabetes to develop a gene regulatory pathway, and 2) utilize this pathway to determine suitable drug therapeutics for prevention and treatment. Using a Gene Set Enrichment Analysis (GSEA), a set of 1000 gene identifiers from a Mayo Clinic database was analyzed to determine the most significant genetic variants related to insulin signaling pathways involved in Type II Diabetes. The following genes were identified: NRAS, KRAS, PIK3CA, PDE3B, TSC1, AKT3, SOS1, NEU1, PRKAA2, AMPK, and ACC. In an extensive literature review and cross-analysis with Kegg and Reactome pathway databases, novel SNPs located on these gene variants were identified and used to determine suitable drug therapeutics for treatment. Overall, understanding how genetic mutations affect target gene function related to Type II Diabetes disease pathology is crucial to the development of effective diagnosis and treatment. This project provides new insight into the molecular basis of the Type II Diabetes, serving to help untangle the regulatory complexity of the disease and aid in the advancement of diagnosis and treatment. Keywords: Type II Diabetes mellitus, Gene Set Enrichment Analysis, genetic variants, KEGG Insulin Pathway, gene-regulatory pathway

ContributorsBucklin, Lindsay (Co-author) / Davis, Vanessa (Co-author) / Holechek, Susan (Thesis director) / Wang, Junwen (Committee member) / Nyarige, Verah (Committee member) / School of Human Evolution & Social Change (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created2019-05

Differential Gene Expression in Type II Diabetes

Description

This research project investigated known and novel differential genetic variants and their associated molecular pathways involved in Type II diabetes mellitus for the purpose of improving diagnosis and treatment methods. The goal of this investigation was to 1) identify the genetic variants and SNPs in Type II diabetes to develo…

This research project investigated known and novel differential genetic variants and their associated molecular pathways involved in Type II diabetes mellitus for the purpose of improving diagnosis and treatment methods. The goal of this investigation was to 1) identify the genetic variants and SNPs in Type II diabetes to develop a gene regulatory pathway, and 2) utilize this pathway to determine suitable drug therapeutics for prevention and treatment. Using a Gene Set Enrichment Analysis (GSEA), a set of 1000 gene identifiers from a Mayo Clinic database was analyzed to determine the most significant genetic variants related to insulin signaling pathways involved in Type II Diabetes. The following genes were identified: NRAS, KRAS, PIK3CA, PDE3B, TSC1, AKT3, SOS1, NEU1, PRKAA2, AMPK, and ACC. In an extensive literature review and cross-analysis with Kegg and Reactome pathway databases, novel SNPs located on these gene variants were identified and used to determine suitable drug therapeutics for treatment. Overall, understanding how genetic mutations affect target gene function related to Type II Diabetes disease pathology is crucial to the development of effective diagnosis and treatment. This project provides new insight into the molecular basis of the Type II Diabetes, serving to help untangle the regulatory complexity of the disease and aid in the advancement of diagnosis and treatment.

ContributorsDavis, Vanessa Brooke (Co-author) / Bucklin, Lindsay (Co-author) / Holechek, Susan (Thesis director) / Wang, Junwen (Committee member) / School of Molecular Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created2019-05

Novel Bioinformatics Methods for Co-expression Analysis of Single Cell RNA Sequencing and Circular RNA Sequencing Time Series Data

Description

High throughput transcriptome data analysis like Single-cell Ribonucleic Acid sequencing (scRNA-seq) and Circular Ribonucleic Acid (circRNA) data have made significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is core in identifying time point(s) where drastic changes in gene transcription are associated with homeostatic to non-homeostatic cellular…

High throughput transcriptome data analysis like Single-cell Ribonucleic Acid sequencing (scRNA-seq) and Circular Ribonucleic Acid (circRNA) data have made significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is core in identifying time point(s) where drastic changes in gene transcription are associated with homeostatic to non-homeostatic cellular transition (tipping points). In Chapter 2 of this dissertation, I present a novel cell-type specific and co-expression-based tipping point detection method to identify target gene (TG) versus transcription factor (TF) pairs whose differential co-expression across time points drive biological changes in different cell types and the time point when these changes are observed. This method was applied to scRNA-seq data sets from a SARS-CoV-2 study (18 time points), a human cerebellum development study (9 time points), and a lung injury study (18 time points). Similarly, leveraging transcriptome data across treatment time points, I developed methodologies to identify treatment-induced and cell-type specific differentially co-expressed pairs (DCEPs). In part one of Chapter 3, I presented a pipeline that used a series of statistical tests to detect DCEPs. This method was applied to scRNA-seq data of patients with non-small cell lung cancer (NSCLC) sequenced across cancer treatment times. However, this pipeline does not account for correlations among multiple single cells from the same sample and correlations among multiple samples from the same patient. In Part 2 of Chapter 3, I presented a solution to this problem using a mixed-effect model. In Chapter 4, I present a summary of my work that focused on the cross-species analysis of circRNA transcriptome time series data. I compared circRNA profiles in neonatal pig and mouse hearts, identified orthologous circRNAs, and discussed regulation mechanisms of cardiomyocyte proliferation and myocardial regeneration conserved between mouse and pig at different time points.

ContributorsNyarige, Verah Mocheche (Author) / Liu, Li (Thesis advisor) / Wang, Junwen (Thesis advisor) / Dinu, Valentin (Committee member) / Arizona State University (Publisher)

Created2022

Mining Associations between MRI Morphometry Measurements and Beta-Amyloid/tau Burden

Description

Beta-Amyloid(Aβ) plaques and tau protein tangles in the brain are now widely recognized as the defining hallmarks of Alzheimer’s disease (AD), followed by structural atrophy detectable on brain magnetic resonance imaging (MRI) scans. However, current methods to detect Aβ/tau pathology are either invasive (lumbar puncture) or quite costly and not…

Beta-Amyloid(Aβ) plaques and tau protein tangles in the brain are now widely recognized as the defining hallmarks of Alzheimer’s disease (AD), followed by structural atrophy detectable on brain magnetic resonance imaging (MRI) scans. However, current methods to detect Aβ/tau pathology are either invasive (lumbar puncture) or quite costly and not widely available (positron emission tomography (PET)). And one of the particular neurodegenerative regions is the hippocampus to which the influence of Aβ/tau on has been one of the research projects focuses in the AD pathophysiological progress. In this dissertation, I proposed three novel machine learning and statistical models to examine subtle aspects of the hippocampal morphometry from MRI that are associated with Aβ /tau burden in the brain, measured using PET images. The first model is a novel unsupervised feature reduction model to generate a low-dimensional representation of hippocampal morphometry for each individual subject, which has superior performance in predicting Aβ/tau burden in the brain. The second one is an efficient federated group lasso model to identify the hippocampal subregions where atrophy is strongly associated with abnormal Aβ/Tau. The last one is a federated model for imaging genetics, which can identify genetic and transcriptomic influences on hippocampal morphometry. Finally, I stated the results of these three models that have been published or submitted to peer-reviewed conferences and journals.

ContributorsWu, Jianfeng (Author) / Wang, Yalin (Thesis advisor) / Li, Baoxin (Committee member) / Liang, Jianming (Committee member) / Wang, Junwen (Committee member) / Wu, Teresa (Committee member) / Arizona State University (Publisher)

Created2022

Imaging mass spectrometry of biomolecules using massive cluster impact

Description

Massive glycerol cluster ions with many charges (~ 106 Da, ~ ±100 charges) have been generated by electrospray to bombard biomolecules and biological sample surfaces. The low impact energy per nucleon facilitates intact sputtering and ionization of biomolecules which can be subsequently imaged. Various lipids, peptides and proteins have been…

Massive glycerol cluster ions with many charges (~ 106 Da, ~ ±100 charges) have been generated by electrospray to bombard biomolecules and biological sample surfaces. The low impact energy per nucleon facilitates intact sputtering and ionization of biomolecules which can be subsequently imaged. Various lipids, peptides and proteins have been studied. The primary cluster ion source has been coupled with an ion-microscope imaging mass spectrometer (TRIFT-1, Physical Electronics). A lateral resolution of ~3µm has been demonstrated, which is acceptable for sub-cellular imaging of animal cells (e.g. single cancer cell imaging in early diagnosis). Since the available amount of target molecules per pixel is limited in biological samples, the measurement of useful ion yields (ratio of detected molecular ion counts to the sample molecules sputtered) is important to determine whether enough ion counts per pixel can be obtained. The useful ion yields of several lipids and peptides are in the 1-3×10-5 range. A 3×3 µm2lipid bilayer can produce ~260 counts/pixel for a meaningful 3×3 µm2 pixel ion image. This method can probably be used in cell imaging in the future, when there is a change in the lipid contents of the cell membrane (e.g. cancer cells vs. normal cells).

ContributorsZhang, Jitao (Author) / Williams, Peter (Thesis advisor) / Hayes, Mark (Committee member) / Nelson, Randall (Committee member) / Arizona State University (Publisher)

Created2015

Hyphenation of a microfluidic platform with MALDI-TOF mass spectrometry for single cell analysis

Description

Cell heterogeneity is widely present in the biological world and exists even in an isogenic population. Resolving the protein heterogeneity at the single cell level is of enormous biological and clinical relevance. However, single cell protein analysis has proven to be challenging due to extremely low amount of protein in…

Cell heterogeneity is widely present in the biological world and exists even in an isogenic population. Resolving the protein heterogeneity at the single cell level is of enormous biological and clinical relevance. However, single cell protein analysis has proven to be challenging due to extremely low amount of protein in a single cell and the huge complexity of proteome. This requires appropriate sampling and sensitive detection techniques. Here, a new approach, microfluidics combined with MALDI-TOF mass spectrometry was brought forward, for the analysis of proteins in single cells. The detection sensitivity of peptides as low as 300 molecules and of proteins as low as 10^6 molecules has been demonstrated. Furthermore, an immunoassay was successfully integrated in the microfluidic device for capturing the proteins of interest and further identifying them by subsequent enzymatic digestion. Moreover, an improved microfluidic platform was designed with separate chambers and valves, allowing the absolute quantification by employing iTRAQ tags or an isotopically labeled peptide. The study was further extended to analyze a protein in MCF-7 cell lysate. The approach capable of identifying and quantifying protein molecules in MCF-7 cells is promising for future proteomic studies at the single cell level.

ContributorsYang, Mian (Author) / Ros, Alexandra (Thesis advisor) / Hayes, Mark (Committee member) / Nelson, Randall (Committee member) / Arizona State University (Publisher)

Created2016

Statistical Methods for Analysis of Genomic Data with Applications in Oncology

Description

This dissertation presents three novel algorithms with real-world applications to genomic oncology. While the methodologies presented here were all developed to overcome various challenges associated with the adoption of high throughput genomic data in clinical oncology, they can be used in other domains as well. First, a network informed feature…

This dissertation presents three novel algorithms with real-world applications to genomic oncology. While the methodologies presented here were all developed to overcome various challenges associated with the adoption of high throughput genomic data in clinical oncology, they can be used in other domains as well. First, a network informed feature ranking algorithm is presented, which shows a significant increase in ability to select true predictive features from simulated data sets when compared to other state of the art graphical feature ranking methods. The methodology also shows an increased ability to predict pathological complete response to preoperative chemotherapy from genomic sequencing data of breast cancer patients utilizing domain knowledge from protein-protein interaction networks. Second, an algorithm that overcomes population biases inherent in the use of a human reference genome developed primarily from European populations is presented to classify microsatellite instability (MSI) status from next-generation-sequencing (NGS) data. The methodology significantly increases the accuracy of MSI status prediction in African and African American ancestries. Finally, a single variable model is presented to capture the bimodality inherent in genomic data stemming from heterogeneous diseases. This model shows improvements over other parametric models in the measurements of receiver-operator characteristic (ROC) curves for bimodal data. The model is used to estimate ROC curves for heterogeneous biomarkers in a dataset containing breast cancer and cancer-free specimen.

ContributorsSaul, Michelle (Author) / Dinu, Valentin (Thesis advisor) / Liu, Li (Committee member) / Wang, Junwen (Committee member) / Arizona State University (Publisher)

Created2021