Structural variant detection: a novel approach

Description

Genomic structural variation (SV) is defined as gross alterations in the genome, broadly classified as insertions/duplications, deletions, inversions and translocations. DNA sequencing moved structural variant discovery beyond laboratory detection techniques to high-resolution informatics approaches. Bioinformatics tools for computational discovery of SVs, however, still miss variants in the complex cancer genome. This study aimed to define the genomic context that leads to tool failure and to design a novel algorithm that addresses this context.

Methods: The study tested the widely held but unproven hypothesis that tools fail to detect variants that lie in repeat regions. The publicly available 1000 Genomes dataset, with experimentally validated variants, was run through the SVDetect tool to compare true positive (TP) SVs with false negative (FN) SVs, with the expectation that FNs would be overrepresented in repeat regions. Further, the novel algorithm, designed to informatically capture the biological etiology of translocations (the context: non-allelic homologous recombination and the 3-D placement of chromosomes in cells), was tested on a simulated dataset. Translocations were created in known translocation hotspots, and the novel-algorithm tool was compared with SVDetect and BreakDancer.

Results: 53% of FN deletions were within repeat structure, compared to 81% of TP deletions. Similarly, 33% of FN insertions versus 42% of TP insertions, 26% of FN duplications versus 57% of TP duplications, and 54% of FN novel sequences versus 62% of TP novel sequences were within repeats. Repeat structure was therefore not driving the tool's inability to detect variants and could not be used as context. The novel algorithm, with a redefined context, detected 10/10 simulated translocations in the 30X coverage dataset at 100% allele frequency, while BreakDancer detected 6/10 and SVDetect detected 4/10. For the 15X coverage dataset at 100% allele frequency, the novel algorithm again detected all ten translocations, albeit with fewer supporting reads; BreakDancer detected 4/10 and SVDetect detected 2/10.

Conclusion: This study showed that the presence of repetitive elements within a structural variant did not, in general, influence the tool's ability to capture it. The context-based algorithm outperformed current tools even at half the genome coverage of the accepted protocol, and it provides an important first step toward novel translocation discovery in the cancer genome.
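The repeat-region comparison described in the Methods can be illustrated with a minimal sketch. It is not the study's code: the function names, the single-chromosome half-open `(start, end)` coordinates, and the choice to count any overlap with a repeat as "within repeats" are all assumptions made for illustration, and the coordinates below are made up.

```python
from bisect import bisect_left

def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def fraction_in_repeats(variants, repeats):
    """Return the fraction of variants that overlap at least one repeat interval.

    Assumption: "in a repeat" means any overlap; all intervals lie on one chromosome.
    """
    if not variants:
        return 0.0
    merged = merge_intervals(repeats)
    starts = [s for s, _ in merged]
    hits = 0
    for v_start, v_end in variants:
        # Last merged repeat that starts before the variant ends.
        idx = bisect_left(starts, v_end) - 1
        if idx >= 0 and merged[idx][1] > v_start:
            hits += 1
    return hits / len(variants)

# Hypothetical coordinates, for illustration only.
repeats = [(100, 200), (150, 300), (1000, 1100)]
true_positives = [(120, 130), (500, 600), (290, 310), (1050, 1060)]
false_negatives = [(300, 400), (900, 1000)]

tp_fraction = fraction_in_repeats(true_positives, repeats)
fn_fraction = fraction_in_repeats(false_negatives, repeats)
print(f"TP in repeats: {tp_fraction:.0%}, FN in repeats: {fn_fraction:.0%}")
```

Under the hypothesis being tested, the FN fraction would be expected to exceed the TP fraction; the abstract reports the opposite pattern for every variant class.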

Contributors: Shetty, Sheetal (Author) / Dinu, Valentin (Thesis advisor) / Bussey, Kimberly (Committee member) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Arizona State University (Publisher)

Created: 2014

Simple symphony. II. Playful pizzicato

Contributors: Britten, Benjamin, 1913-1976 (Composer)

Gene Families in Cancer: Using phylogenetic data to examine an atavistic model of cancer

Description

Despite the 40-year war on cancer, very limited progress has been made in developing a cure for the disease. This failure has prompted the reevaluation of the causes and development of cancer. One resulting model, coined the atavistic model of cancer, posits that cancer is a default phenotype of the cells of multicellular organisms which arises when the cell is subjected to an unusual amount of stress. Since this default phenotype is similar across cell types and even organisms, it seems it must be an evolutionarily ancestral phenotype. We take a phylostratigraphical approach, but systematically add species divergence time data to estimate gene ages numerically and use these ages to investigate the ages of genes involved in cancer. We find that ancient disease-recessive cancer genes are significantly enriched for DNA repair and SOS activity, which seems to imply that a core component of cancer development is not the regulation of growth, but the regulation of mutation. Verification of this finding could drastically improve cancer treatment and prevention.
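The idea of attaching a numeric age to a gene from species divergence times can be sketched as follows. This is only an illustration of the general phylostratigraphic idea, not the thesis's method: the function name `gene_age`, the rule "age equals the divergence time of the most distant species carrying a homolog", and the placeholder species and times are all assumptions.

```python
def gene_age(species_with_homolog, divergence_mya):
    """Estimate a gene's age as the largest divergence time (in millions of years)
    among the species in which a homolog was found.

    Assumption: divergence_mya maps each species to its divergence time from the
    focal organism; species missing from the map are ignored.
    """
    ages = [divergence_mya[s] for s in species_with_homolog if s in divergence_mya]
    return max(ages, default=0.0)

# Placeholder species and divergence times, not real estimates.
divergence_mya = {"species_A": 100.0, "species_B": 400.0, "species_C": 1000.0}

young_gene = gene_age({"species_A"}, divergence_mya)
ancient_gene = gene_age({"species_A", "species_C"}, divergence_mya)
print(young_gene, ancient_gene)
```

A gene with homologs only in closely related species receives a small age, while one with a homolog in a distantly related species is treated as ancient.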

Contributors: Orr, Adam James (Author) / Davies, Paul (Thesis director) / Bussey, Kimberly (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Department of Chemistry and Biochemistry (Contributor) / School of Life Sciences (Contributor)

Created: 2015-05

A birthday hansel, op. 92

Introduction and rondo alla burlesca, op. 23, no. 1

Nocturnal, op. 70. I. Musingly

Five flower songs. To daffodils ; Marsh flowers ; Ballad of green broom

Nocturnal, op. 70

Temporal variations. VII. Waltz

Soirees musicales, op. 9 (after Rossini). March