Structural variant detection: a novel approach

Description

Genomic structural variation (SV) is defined as gross alterations in the genome, broadly classified as insertions/duplications, deletions, inversions, and translocations. DNA sequencing moved structural variant discovery beyond laboratory detection techniques to high-resolution informatics approaches. However, bioinformatics tools for computational discovery of SVs still miss variants in the complex cancer genome. This study aimed to define the genomic context leading to tool failure and to design a novel algorithm addressing this context.

Methods: The study tested the widely held but unproven hypothesis that tools fail to detect variants that lie in repeat regions. The publicly available 1000 Genomes dataset, with experimentally validated variants, was tested with the SVDetect tool for the presence of true positive (TP) SVs versus false negative (FN) SVs, expecting that FNs would be overrepresented in repeat regions. Further, the novel algorithm, designed to informatically capture the biological etiology of translocations (context: non-allelic homologous recombination and 3-D placement of chromosomes in cells), was tested using a simulated dataset. Translocations were created in known translocation hotspots, and the novel-algorithm tool was compared with SVDetect and BreakDancer.

Results: 53% of FN deletions were within repeat structure, compared to 81% of TP deletions. Similarly, 33% of FN insertions versus 42% of TP, 26% of FN duplications versus 57% of TP, and 54% of FN novel sequences versus 62% of TP were within repeats. Repeat structure was therefore not driving the tool's inability to detect variants and could not be used as context. The novel algorithm with a redefined context, when tested against SVDetect and BreakDancer, detected 10/10 simulated translocations in the 30X coverage dataset with 100% allele frequency, while SVDetect captured 4/10 and BreakDancer detected 6/10. For the 15X coverage dataset with 100% allele frequency, the novel algorithm detected all ten translocations, albeit with fewer supporting reads. BreakDancer detected 4/10 and SVDetect detected 2/10.

Conclusion: This study showed that the presence of repetitive elements in general within a structural variant did not influence the tool's ability to capture it. The context-based algorithm performed better than current tools even at half the genome coverage of the accepted protocol, and it provides an important first step for novel translocation discovery in the cancer genome.
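The repeat-overlap comparison described in the abstract (what fraction of TP versus FN variants fall within repeats) can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the half-open coordinate convention, and the toy intervals are all assumptions made for illustration, and a real analysis would read variant calls and a repeat annotation from files.

```python
# Minimal sketch (hypothetical names and toy data, not the thesis's method):
# compare the fraction of true-positive and false-negative variants that
# overlap annotated repeat intervals.

def overlaps_repeat(variant, repeats):
    """Return True if the variant overlaps any repeat on the same chromosome.

    Intervals are (chrom, start, end) with half-open coordinates [start, end).
    """
    chrom, start, end = variant
    return any(
        r_chrom == chrom and r_start < end and start < r_end
        for r_chrom, r_start, r_end in repeats
    )


def fraction_in_repeats(variants, repeats):
    """Fraction of variants overlapping at least one repeat (0.0 if empty)."""
    if not variants:
        return 0.0
    hits = sum(overlaps_repeat(v, repeats) for v in variants)
    return hits / len(variants)


# Toy repeat annotation and toy variant sets (assumed values for illustration).
repeats = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
true_positives = [
    ("chr1", 150, 250),
    ("chr1", 550, 560),
    ("chr2", 10, 40),
    ("chr1", 590, 700),
]
false_negatives = [("chr1", 210, 300), ("chr2", 60, 70)]

tp_fraction = fraction_in_repeats(true_positives, repeats)
fn_fraction = fraction_in_repeats(false_negatives, repeats)
print(f"TP in repeats: {tp_fraction:.0%}, FN in repeats: {fn_fraction:.0%}")
```

If missed variants were concentrated in repeats, the FN fraction would exceed the TP fraction; the abstract reports the opposite pattern for every variant class.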

Contributors: Shetty, Sheetal (Author) / Dinu, Valentin (Thesis advisor) / Bussey, Kimberly (Committee member) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Arizona State University (Publisher)

Created: 2014

Informatics approaches for integrative analysis of disparate high-throughput genomic datasets in cancer

Description

The processes of a human somatic cell are very complex, with various genetic mechanisms governing its fate. Such cells undergo various genetic mutations, which translate to the genetic aberrations that we see in cancer. There are more than 100 types of cancer, each having many more subtypes with aberrations unique to each. In the past two decades, the widespread application of high-throughput genomic technologies, such as micro-arrays and next-generation sequencing, has revealed many such aberrations. Known types and subtypes can be readily identified using gene-expression profiling and, more importantly, high-throughput genomic datasets have helped identify novel subtypes with distinct signatures. Recent studies showing the use of gene-expression profiling in clinical decision making for breast cancer patients underscore the utility of high-throughput datasets. Beyond prognosis, understanding the underlying cellular processes is essential for effective cancer treatment. Various high-throughput techniques are now available to look at a particular aspect of a genetic mechanism in cancer tissue. To look at these mechanisms individually is akin to looking at a broken watch: taking apart each of its parts, looking at them individually, and finally making a list of all the faulty ones. Integrative approaches are needed to transform one-dimensional cancer signatures into multi-dimensional interaction and regulatory networks, consequently bettering our understanding of cellular processes in cancer. Here, I attempt to (i) address ways to effectively identify high-quality variants when multiple assays on the same sample are available, through two novel tools, snpSniffer and NGSPE; and (ii) glean new biological insight into multiple myeloma through two novel integrative analysis approaches that make use of disparate high-throughput datasets. While these methods focus on multiple myeloma datasets, the informatics approaches are applicable to all cancer datasets and will thus help advance cancer genomics.

Contributors: Yellapantula, Venkata (Author) / Dinu, Valentin (Thesis advisor) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Keats, Jonathan (Committee member) / Arizona State University (Publisher)

Created: 2014

Integrative analysis of genomic aberrations in cancer and xenograft Models

Description

No two cancers are alike. Cancer is a dynamic and heterogeneous disease; such heterogeneity arises among patients with the same cancer type, among cancer cells within the same individual’s tumor, and even among cells within the same sub-clone over time. The recent application of next-generation sequencing and precision medicine techniques is the driving force in uncovering the complexity of cancer and establishing the best clinical practice. The core concept of precision medicine is to move away from crowd-based, best-for-most treatment and take individual variability into account when optimizing prevention and treatment strategies. Next-generation sequencing is the method used to sift through the entire 3 billion letters of each patient’s DNA genetic code in a massively parallel fashion.

The deluge of next-generation sequencing data has shifted the bottleneck of cancer research from multiple “-omics” data collection to integrative analysis and data interpretation. In this dissertation, I attempt to address two distinct, but dependent, challenges. The first is to design specific computational algorithms and tools that can process and extract useful information from the raw data in an efficient, robust, and reproducible manner. The second is to develop high-level computational methods and data frameworks for integrating and interpreting these data. Specifically, Chapter 2 presents a tool called Snipea (SNv Integration, Prioritization, Ensemble, and Annotation) to further identify, prioritize, and annotate somatic single nucleotide variants (SNVs) called from multiple variant callers. Chapter 3 describes a novel alignment-based algorithm to accurately and losslessly classify sequencing reads from xenograft models. Chapter 4 describes a direct and biologically motivated framework and associated methods for identifying putative aberrations causing survival differences in GBM patients by integrating whole-genome sequencing, exome sequencing, RNA sequencing, methylation array, and clinical data. Lastly, Chapter 5 explores longitudinal and intratumor heterogeneity studies to reveal the temporal and spatial context of tumor evolution. The long-term goal is to help patients with cancer, particularly those who are in front of us today. Genome-based analysis of the patient tumor can identify genomic alterations unique to each patient’s tumor that are candidate therapeutic targets to decrease therapy resistance and improve clinical outcome.

Contributors: Peng, Sen (Author) / Dinu, Valentin (Thesis advisor) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Arizona State University (Publisher)

Created: 2015

Validation and Characterization of Novel FCHSD2 Translocations Identified in Multiple Myeloma

Description

Multiple myeloma is a genetically heterogeneous disease, which can be divided into several genetic subtypes based upon gene expression profiles and chromosomal abnormalities. Unlike older techniques employed in myeloma research, such as cytogenetics, FISH, and microarray technologies, RNA sequencing offers a unique approach to examine the aforementioned genetic characteristics in that it allows for gene expression profiling and the detection of novel fusion transcripts arising from chromosomal rearrangements. This study utilized RNA sequencing to analyze the transcriptomes of 84 multiple myeloma patients and 69 human myeloma cell lines. FCHSD2 was found to be involved in five novel fusion events along with known oncogenes, MMSET and MYC, as well as three previously unreported genes in myeloma, including CHMP4B, NCF2, and CARNS1. An analysis of FCHSD2 expression within myeloma cell lines indicated that it is highly expressed in comparison to other tissues, suggesting that FCHSD2 translocations could lead to promoter replacement events in which the expression of partnering genes is dysregulated. The presence of the five FCHSD2 hybrid transcripts was confirmed by reverse transcription-PCR and Sanger sequencing. Overexpression of the FCHSD2 fusion transcripts in HEK293 cells resulted in the production of N-terminally truncated fusion partner proteins and a novel FCHSD2-CARNS1 fusion protein.

Contributors: Murray, Christopher William (Author) / Wilson-Rawls, Jeanne (Thesis director) / Carpten, John (Committee member) / Keats, Jonathan (Committee member) / Barrett, The Honors College (Contributor) / School of Life Sciences (Contributor)

Created: 2014-05

Identification of Tumor Associated Antigens using Nucleic Acid Programmable Protein Arrays

Description

Identifying disease biomarkers may aid in the early detection of breast cancer and improve patient outcomes. Recent evidence suggests that tumors are immunogenic and that patients may therefore launch an autoantibody response to tumor-associated antigens. Single-chain variable fragments of autoantibodies derived from regional lymph node B cells of breast cancer patients were used to discover these tumor-associated biomarkers on protein microarrays. Six candidate biomarkers were discovered from 22 heavy chain-only variable region antibody fragments screened. Validation tests are necessary to confirm the tumorigenicity of these antigens. However, the use of single-chain variable autoantibody fragments presents a novel platform for diagnostics and cancer therapeutics.

Contributors: Sharman, M. Camila (Author) / Magee, Dewey (Mitch) (Thesis director) / Wallstrom, Garrick (Committee member) / Petritis, Brianne (Committee member) / Barrett, The Honors College (Contributor) / College of Liberal Arts and Sciences (Contributor) / Virginia G. Piper Center for Personalized Diagnostics (Contributor) / Biodesign Institute (Contributor)

Created: 2012-12

Reduced Incidence of Prevotella and Other Fermenters in Intestinal Microflora of Autistic Children

Description

High proportions of autistic children suffer from gastrointestinal (GI) disorders, implying a link between autism and abnormalities in gut microbial functions. Increasing evidence from recent high-throughput sequencing analyses indicates that disturbances in composition and diversity of gut microbiome are associated with various disease conditions. However, microbiome-level studies on autism are limited and mostly focused on pathogenic bacteria. Therefore, here we aimed to define systemic changes in gut microbiome associated with autism and autism-related GI problems. We recruited 20 neurotypical and 20 autistic children accompanied by a survey of both autistic severity and GI symptoms. By pyrosequencing the V2/V3 regions in bacterial 16S rDNA from fecal DNA samples, we compared gut microbiomes of GI symptom-free neurotypical children with those of autistic children mostly presenting GI symptoms. Unexpectedly, the presence of autistic symptoms, rather than the severity of GI symptoms, was associated with less diverse gut microbiomes. Further, rigorous statistical tests with multiple testing corrections showed significantly lower abundances of the genera Prevotella, Coprococcus, and unclassified Veillonellaceae in autistic samples. These are intriguingly versatile carbohydrate-degrading and/or fermenting bacteria, suggesting a potential influence of unusual diet patterns observed in autistic children. However, multivariate analyses showed that autism-related changes in both overall diversity and individual genus abundances were correlated with the presence of autistic symptoms but not with their diet patterns. Taken together, autism and accompanying GI symptoms were characterized by distinct and less diverse gut microbial compositions with lower levels of Prevotella, Coprococcus, and unclassified Veillonellaceae.

Contributors: Kang, Dae Wook (Author) / Park, Jin (Author) / Ilhan, Zehra (Author) / Wallstrom, Garrick (Author) / LaBaer, Joshua (Author) / Adams, James (Author) / Krajmalnik-Brown, Rosa (Author) / Biodesign Institute (Contributor)

Created: 2013-06-03

Quantifying Antibody Binding on Protein Microarrays Using Microarray Nonlinear Calibration

Description

We present a microarray nonlinear calibration (MiNC) method for quantifying antibody binding to the surface of protein microarrays that significantly increases the linear dynamic range and reduces assay variation compared with traditional approaches. A serological analysis of guinea pig Mycobacterium tuberculosis models showed that a larger number of putative antigen targets were identified with MiNC, which is consistent with the improved assay performance of protein microarrays. MiNC has the potential to be employed in biomedical research using multiplex antibody assays that need quantitation, including the discovery of antibody biomarkers, clinical diagnostics with multi-antibody signatures, and construction of immune mathematical models.

Contributors: Yu, Xiaobo (Author) / Wallstrom, Garrick (Author) / Magee, Mitch (Author) / Qiu, Ji (Author) / Mendoza, D. Eliseo A. (Author) / Wang, Jie (Author) / Bian, Xiaofang (Author) / Graves, Morgan (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)

Created: 2013-08-12

Loss of the Tumor Suppressor SMARCA4 in Small Cell Carcinoma of the Ovary, Hypercalcemic Type (SCCOHT)

Description

Small cell carcinoma of the ovary, hypercalcemic type (SCCOHT), is a rare and understudied cancer with a dismal prognosis. SCCOHT's infrequency has hindered empirical study of its biology and clinical management. However, we and others have recently identified inactivating mutations in the SWI/SNF chromatin remodeling gene SMARCA4 with concomitant loss of SMARCA4 protein in the majority of SCCOHT tumors. Here we summarize these findings and report SMARCA4 status by targeted sequencing and/or immunohistochemistry (IHC) in an additional 12 SCCOHT tumors, 3 matched germlines, and the cell line SCCOHT-1. We also report the identification of a homozygous inactivating mutation in the gene SMARCB1 in one SCCOHT tumor with wild-type SMARCA4, suggesting that SMARCB1 inactivation may also play a role in the pathogenesis of SCCOHT. To date, SMARCA4 mutations and protein loss have been reported in the majority of 69 SCCOHT cases (including 2 cell lines). These data firmly establish SMARCA4 as a tumor suppressor whose loss promotes the development of SCCOHT, setting the stage for rapid advancement in the biological understanding, diagnosis, and treatment of this rare tumor type.

Contributors: Ramos, Pilar (Author) / Kamezis, Anthony N. (Author) / Hendricks, William P. D. (Author) / Wang, Yemin (Author) / Tembe, Waibhav (Author) / Zismann, Victoria L. (Author) / Legendre, Christophe (Author) / Liang, Winnie S. (Author) / Russell, Megan L. (Author) / Craig, David W. (Author) / Farley, John H. (Author) / Monk, Bradley J. (Author) / Anthony, Stephen P. (Author) / Sekulic, Aleksandar (Author) / Cunliffe, Heather E. (Author) / Huntsman, David G. (Author) / Trent, Jeffrey M. (Author) / College of Liberal Arts and Sciences (Contributor)

Created: 2014-11-03

Genome-Wide Characterization of Pancreatic Adenocarcinoma Patients Using Next Generation Sequencing

Description

Pancreatic adenocarcinoma (PAC) is among the most lethal malignancies. While research has implicated multiple genes in disease pathogenesis, identification of therapeutic leads has been difficult, and the majority of currently available therapies provide only marginal benefit. To address this issue, our goal was to genomically characterize individual PAC patients to understand the range of aberrations occurring in each tumor. Because our understanding of PAC tumorigenesis is limited, evaluation of separate cases may reveal aberrations that are less common but may provide relevant information on the disease, or that may represent viable therapeutic targets for the patient. We used next generation sequencing to assess global somatic events across 3 PAC patients to characterize each patient and to identify potential targets. This study is the first to report whole genome sequencing (WGS) findings in paired tumor/normal samples collected from 3 separate PAC patients. We generated on average 132 billion mappable bases across all patients using WGS, and identified 142 somatic coding events including point mutations, insertion/deletions, and chromosomal copy number variants. We did not identify any significant somatic translocation events. We also performed RNA sequencing on 2 of these patients' tumors for which tumor RNA was available to evaluate expression changes that may be associated with somatic events, and generated over 100 million mapped reads for each patient. We further performed pathway analysis of all sequencing data to identify processes that may be the most heavily impacted by somatic and expression alterations. As expected, the KRAS signaling pathway was the most heavily impacted pathway (P<0.05), along with tumor-stroma interactions and tumor suppressive pathways. While sequencing of more patients is needed, the high-resolution genomic and transcriptomic information we have acquired here provides valuable information on the molecular composition of PAC and helps to establish a foundation for improved therapeutic selection.

Contributors: Liang, Winnie S. (Author) / Craig, David W. (Author) / Carpten, John (Author) / Borad, Mitesh J. (Author) / Demeure, Michael J. (Author) / Weiss, Glen J. (Author) / Izatt, Tyler (Author) / Sinari, Shripad (Author) / Christoforides, Alexis (Author) / Aldrich, Jessica (Author) / Kurdoglu, Ahmet (Author) / Barrett, Michael (Author) / Phillips, Lori (Author) / Benson, Hollie (Author) / Tembe, Waibhav (Author) / Braggio, Esteban (Author) / Kiefer, Jeffrey A. (Author) / Legendre, Christophe (Author) / Posner, Richard (Author) / Hostetter, Galen H. (Author) / Baker, Angela (Author) / Egan, Jan B. (Author) / Han, Haiyong (Author) / Lake, Douglas (Author) / Stites, Edward C. (Author) / Ramanathan, Ramesh K. (Author) / Fonseca, Rafael (Author) / Stewart, A. Keith (Author) / Von Hoff, Daniel (Author) / College of Liberal Arts and Sciences (Contributor)

Created: 2012-10-10

Autoantibody Signature for the Serologic Detection of Ovarian Cancer

Description

Sera from patients with ovarian cancer contain autoantibodies (AAb) to tumor-derived proteins that are potential biomarkers for early detection. To detect AAb, we probed high-density programmable protein microarrays (NAPPA) expressing 5177 candidate tumor antigens with sera from patients with serous ovarian cancer (n = 34 cases/30 controls) and measured bound IgG. Of these, 741 antigens were selected and probed with an independent set of ovarian cancer sera (n = 60 cases/60 controls). Twelve potential autoantigens were identified with sensitivities ranging from 13 to 22% at >93% specificity. These were retested using a Luminex bead array using 60 cases and 60 controls, with sensitivities ranging from 0 to 31.7% at 95% specificity. Three AAb (p53, PTPRA, and PTGFR) had area under the curve (AUC) levels >60% (p < 0.01), with the partial AUC (SPAUC) over 5 times greater than for a nondiscriminating test (p < 0.01). Using a panel of the top three AAb (p53, PTPRA, and PTGFR), if at least two AAb were positive, then the sensitivity was 23.3% at 98.3% specificity. AAb to at least one of these top three antigens were also detected in 7/20 sera (35%) of patients with low CA 125 levels and 0/15 controls. AAb to p53, PTPRA, and PTGFR are potential biomarkers for the early detection of ovarian cancer.
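The panel rule in the abstract (call a serum positive when at least two of the three autoantibodies are positive) can be sketched as follows. The function names and the toy calls are hypothetical and are not the study's data or code; each sample is represented as a tuple of three per-marker positive/negative calls.

```python
# Minimal sketch (hypothetical names and toy data, not the study's data):
# sensitivity and specificity of an "at least k of n markers positive" rule.

def panel_positive(marker_calls, min_positive=2):
    """Return True if at least `min_positive` markers are positive."""
    return sum(bool(call) for call in marker_calls) >= min_positive


def sensitivity_specificity(case_calls, control_calls, min_positive=2):
    """Sensitivity = panel-positive cases / all cases;
    specificity = panel-negative controls / all controls."""
    true_pos = sum(panel_positive(c, min_positive) for c in case_calls)
    true_neg = sum(not panel_positive(c, min_positive) for c in control_calls)
    return true_pos / len(case_calls), true_neg / len(control_calls)


# Toy per-sample calls for three markers (assumed values for illustration).
cases = [(1, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 1)]
controls = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (0, 0, 0)]

sens, spec = sensitivity_specificity(cases, controls)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

As a plausibility check on the reported figures, and assuming the panel was evaluated on the 60-case/60-control set, 23.3% sensitivity and 98.3% specificity would correspond to 14 of 60 cases called positive and 59 of 60 controls called negative.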

Contributors: Anderson, Karen (Author) / Cramer, Daniel W. (Author) / Sibani, Sahar (Author) / Wallstrom, Garrick (Author) / Wong, Jessica (Author) / Park, Jin (Author) / Qiu, Ji (Author) / Vitonis, Allison (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)

Created: 2015-01-01