The deluge of next-generation sequencing data has shifted the bottleneck in cancer research from collecting multiple “-omics” data sets to integrative analysis and data interpretation. In this dissertation, I attempt to address two distinct but dependent challenges. The first is to design computational algorithms and tools that process raw data and extract useful information from it in an efficient, robust, and reproducible manner. The second is to develop high-level computational methods and data frameworks for integrating and interpreting these data. Specifically, Chapter 2 presents Snipea (SNv Integration, Prioritization, Ensemble, and Annotation), a tool that identifies, prioritizes, and annotates somatic single nucleotide variants (SNVs) called by multiple variant callers. Chapter 3 describes a novel alignment-based algorithm that accurately and losslessly classifies sequencing reads from xenograft models. Chapter 4 describes a direct, biologically motivated framework and associated methods for identifying putative aberrations that cause survival differences in glioblastoma (GBM) patients by integrating whole-genome sequencing, exome sequencing, RNA sequencing, methylation array, and clinical data. Lastly, Chapter 5 explores longitudinal and intratumor heterogeneity studies to reveal the temporal and spatial context of tumor evolution. The long-term goal is to help patients with cancer, particularly those who are in front of us today. Genome-based analysis of a patient’s tumor can identify genomic alterations unique to that tumor that are candidate therapeutic targets for decreasing therapy resistance and improving clinical outcome.
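The idea of combining somatic SNV calls from several variant callers can be illustrated with a simple voting scheme. The sketch below is not Snipea's actual method, which the abstract does not specify. It assumes each caller reports SNVs as `(chromosome, position, ref, alt)` tuples, and the caller names, the function `ensemble_snvs`, and the `min_callers` threshold are all hypothetical.

```python
from collections import Counter


def ensemble_snvs(calls_by_caller, min_callers=2):
    """Count how many callers support each SNV and keep well-supported ones.

    calls_by_caller maps a caller name to a list of
    (chromosome, position, ref, alt) tuples. Assumed format, for illustration.
    """
    support = Counter()
    for calls in calls_by_caller.values():
        # De-duplicate within a caller so each caller votes at most once.
        for snv in set(calls):
            support[snv] += 1

    kept = [(snv, n) for snv, n in support.items() if n >= min_callers]
    # Highest support first; ties are broken by genomic coordinate.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


# Hypothetical calls from three hypothetical callers.
calls = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T"), ("chr3", 300, "C", "A")],
    "caller_c": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
}
prioritized = ensemble_snvs(calls)
for snv, n_callers in prioritized:
    print(snv, n_callers)
```

A real pipeline would probably weight callers differently and add annotation. The vote count here only shows how agreement between callers can drive prioritization.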
High proportions of autistic children suffer from gastrointestinal (GI) disorders, suggesting a link between autism and abnormalities in gut microbial functions. Increasing evidence from recent high-throughput sequencing analyses indicates that disturbances in the composition and diversity of the gut microbiome are associated with various disease conditions. However, microbiome-level studies on autism are limited and mostly focused on pathogenic bacteria. Therefore, we aimed to define systemic changes in the gut microbiome associated with autism and autism-related GI problems. We recruited 20 neurotypical and 20 autistic children and surveyed both autism severity and GI symptoms. By pyrosequencing the V2/V3 regions of bacterial 16S rDNA from fecal DNA samples, we compared the gut microbiomes of GI symptom-free neurotypical children with those of autistic children, most of whom presented GI symptoms. Unexpectedly, the presence of autistic symptoms, rather than the severity of GI symptoms, was associated with less diverse gut microbiomes. Further, rigorous statistical tests with multiple testing corrections showed significantly lower abundances of the genera Prevotella, Coprococcus, and unclassified Veillonellaceae in autistic samples. These are intriguingly versatile carbohydrate-degrading and/or fermenting bacteria, suggesting a potential influence of the unusual diet patterns observed in autistic children. However, multivariate analyses showed that autism-related changes in both overall diversity and individual genus abundances were correlated with the presence of autistic symptoms but not with diet patterns. Taken together, autism and accompanying GI symptoms were characterized by distinct and less diverse gut microbial compositions with lower levels of Prevotella, Coprococcus, and unclassified Veillonellaceae.
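Two quantities behind the statements above, community diversity and multiple-testing-corrected p-values, can be sketched in a few lines. The abstract does not say which diversity index or correction was used, so the Shannon index and the Benjamini-Hochberg procedure are assumptions chosen as common defaults. The counts and p-values are made up for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genus-level read counts for two samples.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
print(shannon_index(even_sample), shannon_index(skewed_sample))

# Hypothetical per-genus p-values from a two-group comparison.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The evenly distributed sample scores higher on the index than the skewed one. This is the sense in which a microbiome is called more or less diverse.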
We present a microarray nonlinear calibration (MiNC) method for quantifying antibody binding to the surface of protein microarrays that significantly increases the linear dynamic range and reduces assay variation compared with traditional approaches. A serological analysis of guinea pig Mycobacterium tuberculosis models showed that a larger number of putative antigen targets were identified with MiNC, which is consistent with the improved assay performance of protein microarrays. MiNC has the potential to be employed in biomedical research using multiplex antibody assays that need quantitation, including the discovery of antibody biomarkers, clinical diagnostics with multi-antibody signatures, and construction of immune mathematical models.
Sera from patients with ovarian cancer contain autoantibodies (AAb) to tumor-derived proteins that are potential biomarkers for early detection. To detect AAb, we probed high-density programmable protein microarrays (NAPPA) expressing 5177 candidate tumor antigens with sera from patients with serous ovarian cancer (n = 34 cases/30 controls) and measured bound IgG. Of these, 741 antigens were selected and probed with an independent set of ovarian cancer sera (n = 60 cases/60 controls). Twelve potential autoantigens were identified, with sensitivities ranging from 13 to 22% at >93% specificity. These were retested on a Luminex bead array with 60 cases and 60 controls, yielding sensitivities ranging from 0 to 31.7% at 95% specificity. Three AAb (p53, PTPRA, and PTGFR) had area under the curve (AUC) values >60% (p < 0.01), with a partial AUC (SPAUC) over 5 times greater than that of a nondiscriminating test (p < 0.01). With a panel of these three AAb, requiring at least two to be positive gave a sensitivity of 23.3% at 98.3% specificity. AAb to at least one of the three antigens were also detected in 7/20 sera (35%) from patients with low CA 125 levels and in 0/15 controls. AAb to p53, PTPRA, and PTGFR are potential biomarkers for the early detection of ovarian cancer.
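The "at least two of three AAb positive" panel rule can be written as a short calculation of sensitivity and specificity. The sketch below assumes each serum has already been reduced to one positive/negative call per antigen. The function name `panel_performance` and the toy calls are hypothetical and do not reproduce the study data.

```python
def panel_performance(case_calls, control_calls, min_positive=2):
    """Sensitivity and specificity of a 'k of n markers positive' rule.

    Each element of case_calls / control_calls is a tuple of booleans,
    one per marker, for a single serum sample.
    """
    def panel_positive(marker_calls):
        return sum(marker_calls) >= min_positive

    true_positives = sum(panel_positive(c) for c in case_calls)
    true_negatives = sum(not panel_positive(c) for c in control_calls)
    sensitivity = true_positives / len(case_calls)
    specificity = true_negatives / len(control_calls)
    return sensitivity, specificity


# Toy calls in the order (p53, PTPRA, PTGFR); illustrative values only.
cases = [
    (True, True, False),
    (True, False, False),
    (False, True, True),
    (False, False, False),
    (True, True, True),
]
controls = [
    (False, False, False),
    (True, False, False),
    (False, False, False),
    (True, True, False),
]
sens, spec = panel_performance(cases, controls, min_positive=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising `min_positive` generally trades sensitivity for specificity. This matches the pattern in the abstract, where the two-of-three rule reaches a higher specificity than the single markers do.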
Butyrate is a common fatty acid produced in important fermentative systems, such as the human/animal gut and other H2 production systems. Despite its importance, there is little information on the partnerships between butyrate producers and other bacteria. The objective of this work was to uncover butyrate-producing microbial communities and possible metabolic routes in a controlled fermentation system aimed at butyrate production. The butyrogenic reactor was operated at 37°C and pH 5.5 with a hydraulic retention time of 31 h and a low hydrogen partial pressure (PH2). High-throughput sequencing and metagenome functional prediction from 16S rRNA data showed that butyrate production pathways and microbial communities were different during batch (closed) and continuous-mode operation. Lactobacillaceae, Lachnospiraceae, and Enterococcaceae were the most abundant phylotypes in the closed system without PH2 control, whereas Prevotellaceae, Ruminococcaceae, and Actinomycetaceae were the most abundant phylotypes under continuous operation at low PH2. Putative butyrate producers identified in our system were from Prevotellaceae, Clostridiaceae, Ruminococcaceae, and Lactobacillaceae. Metagenome prediction analysis suggests that nonbutyrogenic microorganisms influenced butyrate production by generating butyrate precursors such as acetate, lactate, and succinate. 16S rRNA gene analysis suggested that, in the reactor, a partnership between identified butyrogenic microorganisms and succinate (i.e., Actinomycetaceae), acetate (i.e., Ruminococcaceae and Actinomycetaceae), and lactate producers (i.e., Ruminococcaceae and Lactobacillaceae) took place under continuous-flow operation at low PH2.
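For continuous operation, the hydraulic retention time (HRT) fixes the dilution rate and, for a given working volume, the feed flow rate, through the standard relation HRT = V / Q. The short sketch below applies it to the 31 h HRT reported above. The 1 L working volume is an assumption, because the abstract does not give the reactor size.

```python
def dilution_rate(hrt_hours):
    """Dilution rate D = 1 / HRT, in 1/h."""
    if hrt_hours <= 0:
        raise ValueError("HRT must be positive")
    return 1.0 / hrt_hours


def feed_flow_rate(working_volume_l, hrt_hours):
    """Feed flow rate Q = V / HRT, in L/h."""
    return working_volume_l * dilution_rate(hrt_hours)


HRT_HOURS = 31.0        # reported in the abstract
WORKING_VOLUME_L = 1.0  # hypothetical; not stated in the abstract

d = dilution_rate(HRT_HOURS)
q = feed_flow_rate(WORKING_VOLUME_L, HRT_HOURS)
print(f"D = {d:.4f} 1/h, Q = {q * 1000:.1f} mL/h")
```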