In order to address this, multiple comparative genomics and bioinformatics analyses were conducted to elucidate patterns of evolution in the green anole and across multiple anole species. Comparative genomics analyses were used to infer additional X-linked loci in the green anole, and RNAseq data from male and female samples were analyzed to quantify patterns of sex-biased gene expression across the genome. The extent of dosage compensation on the anole X chromosome was also characterized, providing evidence that the sex chromosomes in the green anole are dosage compensated.
In addition, pairwise rates of evolution were analyzed for genes across the anole genome. In comparisons with other Anolis species, X-linked genes have a lower ratio of nonsynonymous to synonymous substitution rates than autosomal genes. To conduct this analysis, a new pipeline was created for filtering alignments and performing batch calculations for whole genome coding sequences. This pipeline has been made publicly available.
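As an illustration of the alignment-filtering step, the following minimal sketch shows one way pairwise codon alignments might be screened before substitution rates are estimated. It is not the pipeline described above: the function names, the gap and length thresholds, and the stop-codon check are all assumptions made for the example.

```python
# Hypothetical filter for pairwise codon alignments.
# All names and thresholds are illustrative assumptions, not the published pipeline.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split an aligned coding sequence into consecutive 3-character codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def passes_filter(seq_a, seq_b, max_gap_fraction=0.2, min_codons=50):
    """Return True if a pairwise codon alignment looks usable for rate estimation."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Aligned sequences must have equal length and be in frame.
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        return False
    pairs = list(zip(codons(seq_a), codons(seq_b)))
    if not pairs:
        return False
    # Keep only codon columns with no gap character in either sequence.
    ungapped = [(a, b) for a, b in pairs if "-" not in a and "-" not in b]
    if len(ungapped) < min_codons:
        return False
    if 1 - len(ungapped) / len(pairs) > max_gap_fraction:
        return False
    # Reject premature stop codons; the last retained codon may be a true stop.
    for a, b in ungapped[:-1]:
        if a in STOP_CODONS or b in STOP_CODONS:
            return False
    return True


def filter_alignments(alignments, **kwargs):
    """Batch-filter {gene_id: (seq_a, seq_b)} and return the gene IDs that pass."""
    return [gene for gene, (a, b) in alignments.items()
            if passes_filter(a, b, **kwargs)]
```

Alignments that pass a filter of this kind would then be handed to a dedicated tool for estimating nonsynonymous and synonymous substitution rates; that step is outside the scope of this sketch.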
I hypothesize that duplication events grant miRNA families enhanced regulatory capabilities, specifically through distinct targeting preferences by family members. This has relevance for our understanding of vertebrate evolution, as well as disease detection and personalized medicine. To test this hypothesis, I apply a combination of bioinformatic and experimental approaches, and design a novel high-throughput screening platform to identify human miRNA targets. Combined with conventional approaches, this tool allows systematic testing for functional targets of human miRNAs, and the identification of novel target genes on an unprecedented scale.
In this dissertation, I explore evolutionary signatures of 62 deeply conserved metazoan miRNA families, as well as the targeting preferences for several human miRNAs. I find that constraints on miRNA processing impact sequence evolution, creating evolutionary hotspots within families that guide distinct target preferences. I apply our novel screening platform to two cancer-relevant miRNAs, and identify hundreds of previously undescribed targets. I also analyze critical features of functional miRNA target sites, finding that each miRNA recognizes surprisingly distinct features of targets. To further explore the functional distinction between family members, I analyze miRNA expression patterns in multiple contexts, including mouse embryogenesis, RNA-seq data from human tissues, and cancer cell lines. Together, my results inform a model that describes the evolution of metazoan miRNAs, and suggests that highly similar miRNA family members possess distinct functions. These findings broaden our understanding of miRNA function in vertebrate evolution and development, and how their misexpression contributes to human disease.
This can make mutation detection difficult; and while increasing sequencing depth can often help, sequence-specific errors and other non-random biases cannot be detected by increased depth. The problem of accurate genotyping is exacerbated when there is not a reference genome or other auxiliary information available.
I explore several methods for sensitively detecting mutations in non-model organisms using an example Eucalyptus melliodora individual. I use the structure of the tree to find bounds on its somatic mutation rate and evaluate several algorithms for variant calling. I find that conventional methods are suitable if the genome of a close relative can be adapted to the study organism. However, with structured data, a likelihood framework that is aware of this structure is more accurate. I use the techniques developed here to evaluate a reference-free variant calling algorithm.
I also use this data to evaluate a k-mer based base quality score recalibrator (KBBQ), a tool I developed to recalibrate base quality scores attached to sequencing data. Base quality scores can help detect errors in sequencing reads, but are often inaccurate. The most popular method for correcting this issue requires a known set of variant sites, which is unavailable in most cases. I simulate data and show that errors in this set of variant sites can cause calibration errors. I then show that KBBQ accurately recalibrates base quality scores while requiring no reference or other information and performs as well as other methods.
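To make the notion of calibration concrete, the sketch below uses the standard Phred relationship Q = -10 log10(p) to compare reported base quality scores with the error rate actually observed at each score. It is not the KBBQ algorithm: it assumes each base has already been labeled as correct or erroneous by some other means, and the function names and the pseudocount smoothing are illustrative choices.

```python
import math
from collections import defaultdict


def phred_to_error_prob(q):
    """Convert a Phred quality score to the error probability it predicts."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def empirical_quality(observations):
    """Map each reported quality score to an empirically observed quality.

    observations: iterable of (reported_q, is_error) pairs. The is_error
    labels are an assumption of this sketch; obtaining them is the hard part.
    """
    counts = defaultdict(lambda: [0, 0])  # reported_q -> [errors, total]
    for q, is_error in observations:
        counts[q][0] += int(is_error)
        counts[q][1] += 1
    # One pseudo-error and two pseudo-observations keep error-free bins finite.
    return {q: error_prob_to_phred((errors + 1) / (total + 2))
            for q, (errors, total) in counts.items()}
```

A score is well calibrated when the empirical quality is close to the reported one. For example, if bases reported at Q30 (a predicted error rate of 1 in 1000) turn out to have an empirical quality near Q17, the reported scores overstate the accuracy of those bases.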
Finally, I use the Eucalyptus data to investigate the impact of quality score calibration on the quality of output variant calls and show that improved base quality score calibration increases the sensitivity and reduces the false positive rate of a variant calling algorithm.
Structural Equation Modeling (SEM) is a multivariate analysis methodology that could potentially be utilized to examine the barrier effect that river systems have on genetic differentiation. In this project, river systems are split into the variables of Daily Average Discharge, Average River Width, and Seasonality measurements and regressed onto the genetic differentiation, measured as Fst. This data was collected from the USGS database (U.S. Geological Survey, 2020), sequencing files from differing literature, or Google Earth measurements. Several SEM models are used to represent different system structures and are compared with more traditional methodologies such as Generalized Linear Modeling and Generalized Linear Mixed Modeling. Ultimately, results were limited by the small sample size; however, interesting patterns still emerged from the models. The SE models indicate that Discharge plays a primary role in the genetic differentiation of adjacent river populations. In addition, the results demonstrate how quantification of indirect effects, particularly those relating to discharge, gives more informative interpretations than traditional multivariate statistics alone. These findings prompt further investigations into this potential methodology.
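The idea of an indirect effect can be illustrated with a small regression-based path analysis, which is a simplification of full SEM. The sketch below assumes a hypothetical three-variable structure in which a predictor x (for example, discharge) influences an outcome y (for example, Fst) both directly and through a mediator m (for example, river width). This structure, the function names, and the toy numbers in the usage note are assumptions for illustration and are not taken from the project.

```python
def center(values):
    """Subtract the mean so that slopes can be computed without an intercept term."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    """Sum of elementwise products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def simple_slope(x, y):
    """Ordinary least squares slope of y on x (intercept included implicitly)."""
    xc, yc = center(x), center(y)
    return dot(xc, yc) / dot(xc, xc)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2, solved from the 2x2 normal equations."""
    a, b, c = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 ** 2  # zero if x1 and x2 are perfectly collinear
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def path_effects(x, m, y):
    """Decompose the effect of x on y into direct and indirect (via m) parts."""
    a = simple_slope(x, m)                     # path x -> m
    direct, b = two_predictor_slopes(x, m, y)  # paths x -> y and m -> y
    indirect = a * b                           # product of coefficients along x -> m -> y
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}
```

With made-up values x = [0, 1, 2, 3], m = [0, 1, 0, 1], and y = x + 2m, calling path_effects(x, m, y) separates the total effect of x into a direct part and a part transmitted through m. A single regression of y on x alone would report only the total, which is the kind of information loss the indirect-effect analysis is meant to avoid.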