Matching Items (12)
Description
Next-generation sequencing is a powerful tool for detecting genetic variation. However, it is also error-prone, with error rates that are much larger than mutation rates. This can make mutation detection difficult; and while increasing sequencing depth can often help, sequence-specific errors and other non-random biases cannot be detected by increased depth. The problem of accurate genotyping is exacerbated when there is not a reference genome or other auxiliary information available.
I explore several methods for sensitively detecting mutations in non-model organisms using an example Eucalyptus melliodora individual. I use the structure of the tree to find bounds on its somatic mutation rate and evaluate several algorithms for variant calling. I find that conventional methods are suitable if the genome of a close relative can be adapted to the study organism. However, with structured data, a likelihood framework that is aware of this structure is more accurate. I use the techniques developed here to evaluate a reference-free variant calling algorithm.
I also use this data to evaluate a k-mer based base quality score recalibrator (KBBQ), a tool I developed to recalibrate base quality scores attached to sequencing data. Base quality scores can help detect errors in sequencing reads, but are often inaccurate. The most popular method for correcting this issue requires a known set of variant sites, which is unavailable in most cases. I simulate data and show that errors in this set of variant sites can cause calibration errors. I then show that KBBQ accurately recalibrates base quality scores while requiring no reference or other information and performs as well as other methods.
Finally, I use the Eucalyptus data to investigate the impact of quality score calibration on the quality of output variant calls and show that improved base quality score calibration increases the sensitivity and reduces the false positive rate of a variant calling algorithm.
Contributors: Orr, Adam James (Author) / Cartwright, Reed (Thesis advisor) / Wilson, Melissa (Committee member) / Kusumi, Kenro (Committee member) / Taylor, Jesse (Committee member) / Pfeifer, Susanne (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Wound healing is a complex tissue response that requires a coordinated interplay of multiple cells in orchestrated biological processes to restore the skin's barrier function post-injury. Proteolytic enzymes, in particular matrix metalloproteinases (MMPs), contribute to all phases of the healing process by regulating immune cell influx, clearing out the extracellular matrix (ECM), and remodeling scar tissue. As a result of these various functions in the healing of skin wounds, uncontrolled activities of MMPs are associated with impaired wound healing. The MMP gene family consists of a highly conserved set of genes. Deleterious mutations in MMP genes cause developmental phenotypes that affect the heart, skeleton, and immune system response. The availability of contiguous draft genomes of non-model organisms enables the study of gene families through analysis of synteny and sequence identity. My project is aimed at conducting a comparative genomic analysis of the MMP gene family from the genomes of 29 tetrapod species—with an emphasis on reptiles. Results regarding the similarities and differences among MMP protein sequences can be further investigated to shed light on the causes which give rise to various adaptive mutations for specific species groups.
Contributors: Yu, Alexander (Author) / Kusumi, Kenro (Thesis director) / Dolby, Greer (Committee member) / Barrett, The Honors College (Contributor) / School of Life Sciences (Contributor)
Created: 2022-12