Matching Items (124)
Description
With the rising data output and falling costs of Next Generation Sequencing technologies, research into data compression is crucial to maintaining storage efficiency and controlling costs. High-throughput sequencers such as the HiSeqX Ten can produce up to 1.8 terabases of data per run, and such large storage demands are even more important to consider for institutions that rely on their own servers rather than large data centers (cloud storage) [1]. Compression algorithms aim to reduce the amount of space taken up by large genomic datasets by encoding the most frequently occurring symbols with the shortest bit codewords and by changing the order of the data to make it easier to encode. Depending on the probability distribution of the symbols in the dataset or the structure of the data, choosing the wrong algorithm could result in a compressed file larger than the original or a poorly compressed file that wastes time and space [2]. To test efficiency among compression algorithms for each file type, 37 open-source compression algorithms were used to compress six types of genomic datasets (FASTA, VCF, BCF, GFF, GTF, and SAM) and were evaluated on compression speed, decompression speed, compression ratio, and file size using the benchmark tool lzbench. Compressors that outperformed the popular bioinformatics compressor Gzip (zlib -6) were evaluated against one another by ratio and speed for each file type and across the geometric means of all file types. Compressors that exhibited fast compression and decompression speeds were also evaluated by transmission time through variable-speed internet pipes in scenarios where the file was compressed only once or compressed multiple times.
Contributors: Howell, Abigail (Author) / Cartwright, Reed (Thesis director) / Wilson Sayres, Melissa (Committee member) / Taylor, Jay (Committee member) / Barrett, The Honors College (Contributor)
Created: 2017-05
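The evaluation criteria named in the description above (compression ratio, a geometric mean across file types, and transmission time when a file is compressed once or on every transfer) can be illustrated with a minimal Python sketch. This is not the thesis's code. The function names, the simple additive time model, and the assumption that speeds are expressed in MB/s are all illustrative choices.

```python
from math import prod


def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size (higher is better)."""
    return original_bytes / compressed_bytes


def geometric_mean(values):
    """Geometric mean, e.g. of one compressor's ratios across file types."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def transfer_seconds(original_bytes, ratio, compress_mb_s, decompress_mb_s,
                     link_mb_s, transfers=1, compress_once=True):
    """Hypothetical end-to-end time model (an assumption, not from the thesis).

    Each transfer pays for sending the compressed file and decompressing it.
    Compression is paid once, or once per transfer if compress_once is False.
    """
    size_mb = original_bytes / 1e6
    compress_time = size_mb / compress_mb_s
    decompress_time = size_mb / decompress_mb_s
    send_time = (size_mb / ratio) / link_mb_s
    compressions = 1 if compress_once else transfers
    return compressions * compress_time + transfers * (send_time + decompress_time)
```

Under this model, a slow compressor with a high ratio can still be the better choice on a slow link when the file is compressed once and sent many times, because the compression cost is amortized over the transfers.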
Description
Almost every form of cancer deregulates the expression and activity of anabolic glycosyltransferase (GT) enzymes, which incorporate particular monosaccharides in a donor- and acceptor- as well as linkage- and anomer-specific manner to assemble complex and diverse glycans that significantly affect numerous cellular events, including tumorigenesis and metastasis. Because glycosylation is not template-driven, GT deregulation yields heterogeneous arrays of aberrant intact glycan products, some in undetectable quantities in clinical bio-fluids (e.g., blood plasma). Numerous glycan features (e.g., 6 sialylation, β-1,6-branching, and core fucosylation) stem from approximately 25 glycan “nodes”: unique linkage-specific monosaccharides at particular glycan branch points that collectively confer distinguishing features upon glycan products. For each node, changes in normalized abundance (Figure 1) may serve as a nearly 1:1 surrogate measure of activity for culpable GTs and may correlate with particular stages of carcinogenesis. Complementary to traditional top-down glycomics, the novel bottom-up technique applied herein condenses each glycan node and feature into a single analytical signal, quantified by two GC-MS instruments: GCT (time-of-flight analyzer) and GCMSD (transmission quadrupole analyzers). Bottom-up analysis of stage 3 and 4 breast cancer cases revealed better overall precision for GCMSD yet comparable clinical performance of both GC-MS instruments and identified two downregulated glycan nodes as excellent breast cancer biomarker candidates: t-Gal and 4,6-GlcNAc (ROC AUC ≈ 0.80, p < 0.05). Resulting from the activity of multiple GTs, t-Gal had the highest ROC AUC (0.88) and lowest ROC p-value (0.001) among all analyzed nodes. Representing core fucosylation, glycan node 4,6-GlcNAc is a nearly 1:1 molecular surrogate for the activity of α-(1,6)-fucosyltransferase, a potential target for cancer therapy.
To validate these results, future projects can analyze larger sample sets, find correlations between breast cancer stage and changes in t-Gal and 4,6-GlcNAc levels, gauge the specificity of these nodes for breast cancer and their potential role in other cancer types, and develop clinical tests for reliable breast cancer diagnosis and treatment monitoring based on t-Gal and 4,6-GlcNAc.
Contributors: Zaare, Sahba (Author) / Borges, Chad (Thesis director) / LaBaer, Joshua (Committee member) / School of Molecular Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Many bacteria actively import environmental DNA and incorporate it into their genomes. This behavior, referred to as transformation, has been described in many species from diverse taxonomic backgrounds. Transformation is expected to carry some selective advantages similar to those postulated for meiotic sex in eukaryotes. However, the accumulation of loss-of-function alleles at transformation loci and an increased mutational load from recombining with DNA from dead cells create additional costs to transformation. These costs have been shown to outweigh many of the benefits of recombination under a variety of likely parameters. We investigate an additional proposed benefit of sexual recombination, the Red Queen hypothesis, as it relates to bacterial transformation. Here we describe a computational model showing that host-pathogen coevolution may provide a large selective benefit to transformation and allow transforming cells to invade an environment dominated by otherwise equal non-transformers. Furthermore, we observe that host-pathogen dynamics cause the selection pressure on transformation to vary extensively in time, explaining the tight regulation and wide variety of rates observed in naturally competent bacteria. Host-pathogen dynamics may explain the evolution and maintenance of natural competence despite its associated costs.
Contributors: Palmer, Nathan David (Author) / Cartwright, Reed (Thesis director) / Wang, Xuan (Committee member) / Sievert, Chris (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Aberrant glycosylation has been shown to be linked to specific cancers, and using this idea, it was proposed that the levels of glycans in the blood could predict stage I adenocarcinoma. To track this glycosylation, glycans were broken down into glycan nodes via methylation analysis. This analysis utilized information from N-, O-, and lipid-linked glycans detected by gas chromatography-mass spectrometry. The resulting glycan node ratios represent the initial quantitative data that were used in this experiment.
For this experiment, two sets of 50 µl blood plasma samples were provided by NYU Medical School. These samples were analyzed by Dr. Borges’s lab to obtain normalized biomarker levels from patients with stage I adenocarcinoma and from control patients matched for age, smoking status, and gender. ROC curves were constructed under individual and paired conditions, and the AUC was calculated in Wolfram Mathematica 10.2. Methods such as increasing the size of the training set, using hard vs. soft margins, and processing biomarkers together and individually were used to increase the AUC. Using a soft margin proved most useful for this data set compared with the initial hard margin, raising the AUC from 0.6013 to 0.6585. Among the biomarkers, the 6-Glc/6-Man and 3,6-Gal glycan node ratios performed best, with an AUC of 0.7687, a sensitivity of 0.7684, and a specificity of 0.6051. While this accuracy is not sufficient for a primary diagnostic tool for stage I adenocarcinoma, the methods examined in the paper should be evaluated further. By comparison, the current clinical standard blood test for prostate cancer has an AUC of only 0.67.
Contributors: De Jesus, Celine Spicer (Author) / Taylor, Thomas (Thesis director) / Borges, Chad (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / School of Molecular Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
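The ROC AUC, sensitivity, and specificity figures quoted in the description above can be made concrete with a small Python sketch. The thesis itself used Wolfram Mathematica 10.2, so this is not the author's code. The function names and the example scores in the usage note are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than by tracing the curve.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (Mann-Whitney formulation).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, threshold):
    """Classify a score as positive when it is >= threshold."""
    true_pos = sum(s >= threshold for s in case_scores)
    true_neg = sum(s < threshold for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)
```

For instance, with made-up scores `[0.9, 0.8, 0.4]` for cases and `[0.7, 0.3, 0.4]` for controls, `roc_auc` counts 7.5 winning pairs out of 9. An AUC of 0.5 corresponds to chance, which is why values such as 0.66 or 0.77 are treated as modest rather than diagnostic.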
Description
Mammary gland development in humans during puberty involves the enlargement of breast tissue, but this is not true in non-human primates. To identify potential causes of this difference, I examined variation in substitution rates across genes related to mammary development. Genes undergoing purifying selection show slower-than-average substitution rates, while genes undergoing positive selection show faster rates. These rate differences may be related to the difference between humans and other primates. Three genes were found to be accelerated: FOXF1, IGFBP5, and ATP2B2. However, only the last of these was found in humans, and it seems unlikely to be related to the differences in mammary gland development at puberty between humans and non-human primates.
Contributors: Arroyo, Diana (Author) / Cartwright, Reed (Thesis director) / Wilson Sayres, Melissa (Committee member) / Schwartz, Rachel (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Disturbances in the protein interactome often play a large role in cancer progression. Investigation of protein-protein interactions (PPI) can increase our understanding of cancer pathways and will disclose unknown targets involved in cancer disease biology. Although numerous methods are available to study protein interactions, most platforms suffer from drawbacks including high false positive rates, low throughput, and lack of quantification. Moreover, most methods are not compatible with use in a clinical setting. To address these limitations, we have developed a multiplexed, in-solution protein microarray (MISPA) platform with broad applications in proteomics. MISPA can be used to quantitatively profile PPIs and as a robust technology for early detection of cancers. This method utilizes unique DNA barcoding of individual proteins coupled with next generation sequencing to quantitatively assess interactions via barcode enrichment. We have tested the feasibility of this technology in the detection of patient immune responses to oropharyngeal carcinomas and in the discovery of novel PPIs in the B-cell receptor (BCR) pathway. To achieve this goal, 96 human papillomavirus (HPV) antigen genes were cloned into pJFT7-cHalo (99% success) and pJFT7-n3xFlag-Halo (100% success) expression vectors. These libraries were expressed via a cell-free in vitro transcription-translation system with 93% and 96% success, respectively. A small-scale study of patient serum interactions with barcoded HPV16 antigens was performed, and an HPV proteome-wide study will follow using additional patient samples. In addition, 15 query proteins were cloned into pJFT7_nGST expression vectors, expressed, and purified with 93% success to probe a library of 100 BCR pathway proteins and detect novel PPIs.
Contributors: Rinaldi, Capria Lakshmi (Author) / LaBaer, Joshua (Thesis director) / Mangone, Marco (Committee member) / Borges, Chad (Committee member) / School of Molecular Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-12
Description
Microfluidic platforms have been exploited extensively as a tool for the separation of particles by electric field manipulation. Microfluidic devices can facilitate the manipulation of particles by dielectrophoresis. Separation of particles by size and type has been demonstrated by insulator-based dielectrophoresis in a microfluidic device. Thus, manipulating particles by size has been widely studied throughout the years. Size heterogeneity in organelles, arising from abnormal organelle size, has been linked to multiple diseases. Here, a mixture of two sizes of polystyrene beads (0.28 and 0.87 μm) was separated by a ratchet migration mechanism under a continuous flow (20 nL/min). Furthermore, different ratchet devices were designed to achieve high-throughput, high-volume separation. Recently, enormous efforts have been made to manipulate small-size DNA and proteins. Here, a microfluidic device comprising multiple valves that act as insulating constrictions when a potential is applied is presented. The tunability of the electric field gradient is evaluated by a COMSOL model, indicating that high electric field gradients can be reached by deflecting the valve at a certain distance. Experimentally, the tunability of the dynamic constriction was demonstrated by conducting a pressure study to estimate the gap distance between the valve and the substrate at different applied pressures. Finally, as a proof of principle, 0.87 μm polystyrene beads were manipulated by dielectrophoresis. These microfluidic platforms will aid in the understanding of size heterogeneity of organelles for biomolecular assessment and achieve separation of nanometer-size DNA and proteins by dielectrophoresis.
Contributors: Ortiz, Ricardo (Author) / Ros, Alexandra (Thesis advisor) / Hayes, Mark (Committee member) / Borges, Chad (Committee member) / Arizona State University (Publisher)
Created: 2021
Description

In cold chain tracking systems, accuracy and flexibility across different temperature ranges play an integral role in monitoring biospecimen integrity. However, while two common types of cold chain tracking systems are currently available (electronic and physical/chemical), there is no affordable cold chain tracking mechanism that can be applied to a variety of temperatures while maintaining accuracy for individual vials. Hence, our lab applied our understanding of biochemical reaction kinetics to develop a new cold chain tracking mechanism using the permanganate/oxalic acid reaction. The permanganate/oxalic acid reaction is characterized by the reduction of permanganate (Mn(VII)) to Mn(II) with Mn(II)-autocatalyzed oxidation of oxalate to CO2, resulting in a pink-to-colorless visual indicator change when the reaction system is not in the solid state (i.e., frozen or vitrified). Throughout our research, we demonstrate: (i) improved reaction consistency and accuracy, along with extended run times, with the implementation of a nitric acid-based labware washing protocol; (ii) simulated reaction kinetics for the maximum-length reaction and the 60-minute reaction based on previously developed MATLAB scripts; (iii) experimental reaction kinetics to verify the simulated MATLAB maximum and 60-minute reaction times; (iv) long-term stability of the permanganate/oxalic acid reaction with water or eutectic solutions of sodium perchlorate and magnesium perchlorate at -80°C; (v) reaction kinetics with the eutectic solvents sodium perchlorate and magnesium perchlorate at 25°C, 4°C, and -8°C; (vi) accelerated reaction kinetics after the addition of varying concentrations of manganese perchlorate; (vii) reaction kinetics of higher-concentration reaction systems (5x and 10x; for darker colors) at 25°C; and (viii) long-term stability of the 10x higher-concentration reaction at -80°C.

Contributors: Ljungberg, Emil (Author) / Borges, Chad (Thesis director) / Levitus, Marcia (Committee member) / Williams, Peter (Committee member) / Barrett, The Honors College (Contributor) / School of Molecular Sciences (Contributor) / Department of Psychology (Contributor)
Created: 2022-12
Description
High-throughput transcriptome analyses, such as those of single-cell ribonucleic acid sequencing (scRNA-seq) and circular ribonucleic acid (circRNA) data, have enabled significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is core to identifying time point(s) where drastic changes in gene transcription are associated with the transition from a homeostatic to a non-homeostatic cellular state (tipping points). In Chapter 2 of this dissertation, I present a novel cell-type-specific and co-expression-based tipping point detection method to identify target gene (TG) versus transcription factor (TF) pairs whose differential co-expression across time points drives biological changes in different cell types, and the time point when these changes are observed. This method was applied to scRNA-seq data sets from a SARS-CoV-2 study (18 time points), a human cerebellum development study (9 time points), and a lung injury study (18 time points). Similarly, leveraging transcriptome data across treatment time points, I developed methodologies to identify treatment-induced and cell-type-specific differentially co-expressed pairs (DCEPs). In Part 1 of Chapter 3, I present a pipeline that uses a series of statistical tests to detect DCEPs. This method was applied to scRNA-seq data of patients with non-small cell lung cancer (NSCLC) sequenced across cancer treatment times. However, this pipeline does not account for correlations among multiple single cells from the same sample or correlations among multiple samples from the same patient. In Part 2 of Chapter 3, I present a solution to this problem using a mixed-effect model. In Chapter 4, I present a summary of my work that focused on the cross-species analysis of circRNA transcriptome time series data.
I compared circRNA profiles in neonatal pig and mouse hearts, identified orthologous circRNAs, and discussed regulation mechanisms of cardiomyocyte proliferation and myocardial regeneration conserved between mouse and pig at different time points.
Contributors: Nyarige, Verah Mocheche (Author) / Liu, Li (Thesis advisor) / Wang, Junwen (Thesis advisor) / Dinu, Valentin (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Plasma and serum are the most commonly used liquid biospecimens in biomarker research. These samples may be subjected to several pre-analytical variables (PAVs) during collection, processing, and storage. Exposure to thawed conditions (temperatures above -30 °C) is a PAV that is hard to control and track and that, when unaccounted for, could yield misleading information that fails to accurately reveal the in vivo biological reality. Hence, assays that can empirically check the integrity of plasma and serum samples are crucial. As a solution to this issue, an assay titled ΔS-Cys-Albumin was developed and validated. The reference range of ΔS-Cys-Albumin in cardiovascular patients was determined, and the change in ΔS-Cys-Albumin values in different samples over time-course incubations at room temperature, 4 °C, and -20 °C was evaluated. In blind challenges, this assay proved successful in identifying improperly stored samples individually and as groups. Then, the correlation between ΔS-Cys-Albumin and the instability of several clinically important proteins in plasma from healthy individuals and cancer patients at room temperature, 4 °C, and -20 °C was assessed. Results showed a linear inverse relationship between the percentage of proteins destabilized and ΔS-Cys-Albumin regardless of the specific time or temperature of exposure, establishing ΔS-Cys-Albumin as an effective surrogate marker to track the stability of clinically relevant analytes in plasma. The stability of oxidized LDL in serum samples was assessed at different temperatures, and it remained stable at all temperatures evaluated. The ΔS-Cys-Albumin assay requires the use of an LC-ESI-MS instrument, which limits its availability to most clinical research laboratories. To overcome this hurdle, an absorbance-based assay that can be measured using a plate reader was developed as an alternative to the ΔS-Cys-Albumin assay. Assay development and analytical validation procedures are reported herein.
After that, the range of absorbance in plasma and serum from control and cancer patients was determined, and the change in absorbance over a time-course incubation at room temperature, 4 °C, and -20 °C was assessed. The results showed that the absorbance assay would act as a good alternative to the ΔS-Cys-Albumin assay.
Contributors: Jehanathan, Nilojan (Author) / Borges, Chad (Thesis advisor) / Guo, Jia (Committee member) / Van Horn, Wade (Committee member) / Arizona State University (Publisher)
Created: 2022