Matching Items (28)
Description
Many factors are at play within the genome of an organism, contributing to much of the diversity and variation across the tree of life. While the genome is generally encoded by four nucleotides, A, C, T, and G, this code can be expanded. One particular mechanism that we examine in this thesis is modification of bases—more specifically, methylation of Adenine (m6A) within the GATC motif of Escherichia coli. These methylated adenines are especially important in a process called methyl-directed mismatch repair (MMR), a pathway responsible for repairing errors in the DNA sequence produced by replication. In this pathway, methylated adenines identify the parent strand and direct the repair proteins to correct the erroneous base in the daughter strand. While the primary role of methylated adenines at GATC sites is to direct the MMR pathway, this methylation has also been found to affect other processes, such as gene expression, the activity of transposable elements, and the timing of DNA replication. However, in the absence of MMR, the ability of these other processes to maintain adenine methylation and its targets is unknown.
To determine whether disruption of the MMR pathway reduces the conservation of methylated adenines and increases tolerance for mutations that cause the loss or gain of GATC sites, we used whole-genome sequencing to survey individual clones isolated from experimentally evolving wild-type and MMR-deficient (mutL-; conferring a 150x increase in mutation rate) populations of E. coli. Initial analysis revealed a lack of mutations affecting methylation sites (GATC tetranucleotides) in wild-type clones. However, the inherently low mutation rates of the wild-type background render this result inconclusive, due to a lack of statistical power, and reveal a need for a more direct measure of changes in methylation status. Thus, as a first step toward comparative methylomics, we benchmarked four different methylation-calling pipelines on three biological replicates of the wild-type progenitor strain for our evolved populations.
While it is understood that these methylated sites play a role in the MMR pathway, the full extent of their effect on the genome is not yet understood. Thus, the goal of this thesis was to better understand the forces that maintain the genome, specifically concerning m6A within the GATC motif.
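The loss or gain of GATC sites described above can be illustrated with a minimal Python sketch. It is not the thesis pipeline: the function names and toy sequences are hypothetical, and the comparison assumes the ancestor and evolved sequences share coordinates (substitutions only, no indels). Because GATC is its own reverse complement, scanning one strand is enough.

```python
def find_gatc_sites(sequence: str) -> list[int]:
    """Return 0-based start positions of every GATC tetranucleotide."""
    seq = sequence.upper()
    positions = []
    start = seq.find("GATC")
    while start != -1:
        positions.append(start)
        # Restart one base later; GATC cannot overlap itself,
        # so this simply moves on to the next candidate.
        start = seq.find("GATC", start + 1)
    return positions


def gatc_changes(ancestor: str, evolved: str) -> tuple[list[int], list[int]]:
    """Return (lost, gained) GATC positions between two aligned sequences.

    Assumption: both sequences use the same coordinate system.
    """
    before = set(find_gatc_sites(ancestor))
    after = set(find_gatc_sites(evolved))
    return sorted(before - after), sorted(after - before)


if __name__ == "__main__":
    lost, gained = gatc_changes("AAGATCTTGATC", "AAGATTTTGATC")
    print("lost:", lost, "gained:", gained)
```

In this toy case a single C-to-T substitution removes the site at position 2 while the site at position 8 is retained.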
Contributors: Boyer, Gwyneth (Author) / Lynch, Michael (Thesis director) / Behringer, Megan (Committee member) / Geiler-Samerotte, Kerry (Committee member) / School of Life Sciences (Contributor) / Department of Psychology (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-05
Description
Breast cancer is the leading cause of cancer-related deaths of women in the United States. Traditionally, breast cancer is treated predominantly by a combination of surgery, chemotherapy, and radiation therapy. However, due to the significant negative side effects associated with these traditional treatments, there have been substantial efforts to develop alternative therapies to treat cancer. One such alternative therapy is a peptide-based therapeutic cancer vaccine. Therapeutic cancer vaccines enhance an individual's immune response to a specific tumor. They do this through artificial activation of tumor-specific CTLs (cytotoxic T lymphocytes). However, in order to artificially activate tumor-specific CTLs, a patient must be treated with immunogenic epitopes derived from their specific cancer type. We have identified the tumor-associated antigen TPD52 as an ideal target for a therapeutic cancer vaccine. This designation is due to the overexpression of TPD52 in a variety of different cancer types. To start the development of a therapeutic cancer vaccine for TPD52-related cancers, we devised a two-step strategy. First, we planned to create a list of potential TPD52 epitopes by using epitope binding and processing prediction tools. Second, we planned to experimentally identify MHC class I TPD52 epitopes in vitro. We identified 942 potential 9- and 10-amino-acid epitopes for the HLAs A1, A2, A3, A11, A24, B07, B27, B35, and B44. These epitopes were predicted by using a combination of 3 binding prediction tools and 2 processing prediction tools. From these 942 potential epitopes, we selected the top 50 epitopes ranked by a combination of binding and processing scores. Due to the promiscuity of some predicted epitopes for multiple HLAs, we ordered 38 synthetic epitopes from the list of the top 50 epitopes.
We also performed a frequency analysis of the TPD52 protein sequence and identified 3 regions of high epitope production. After the epitope predictions were completed, we proceeded to attempt to experimentally detect presented TPD52 epitopes. First, we successfully transduced parental K562 cells with TPD52. After transduction, we started the optimization process for the immunoprecipitation protocol. The optimization of the immunoprecipitation protocol proved to be more difficult than originally believed and was the main reason that we were unable to progress past the transduction of the parental cells. However, we believe that we have identified the issues and will be able to complete the experiment in the coming months.
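The first step of epitope prediction, enumerating every 9- and 10-residue window of a protein before any binding or processing score is applied, can be sketched in Python as follows. This is only an illustration: the function name and the toy sequence are hypothetical, the toy sequence is not TPD52, and no prediction tool is invoked.

```python
from typing import Iterator


def candidate_peptides(protein: str, lengths=(9, 10)) -> Iterator[tuple[int, str]]:
    """Yield (0-based start, peptide) for every window of the given lengths."""
    protein = protein.upper()
    for k in lengths:
        # A sequence of length n contains n - k + 1 windows of length k.
        for start in range(len(protein) - k + 1):
            yield start, protein[start:start + k]


if __name__ == "__main__":
    toy_sequence = "ACDEFGHIKLMN"  # hypothetical 12-residue sequence
    for start, peptide in candidate_peptides(toy_sequence):
        print(start, peptide)
```

Each window produced this way would then be scored per HLA allele by whichever binding and processing predictors are in use; the ranking step is left out because its scoring scheme is not specified in the abstract.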
Contributors: Wilson, Eric Andrew (Author) / Anderson, Karen (Thesis director) / Borges, Chad (Committee member) / School of Molecular Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Background: Noninvasive MRI methods that can accurately detect subtle brain changes are highly desirable when studying disease-modifying interventions. Texture analysis is a novel imaging technique which utilizes the extraction of a large number of image features with high specificity and predictive power. In this investigation, we use texture analysis to assess and classify age-related changes in the right and left hippocampal regions, the areas known to show some of the earliest change in Alzheimer's disease (AD). Apolipoprotein E (APOE)'s e4 allele confers an increased risk for AD, so studying differences in APOE e4 carriers may help to ascertain subtle brain changes before there has been an obvious change in behavior. We examined texture analysis measures that predict age-related changes, which reflect atrophy in a group of cognitively normal individuals. We hypothesized that the APOE e4 carriers would exhibit significant age-related differences in texture features compared to non-carriers, so that the predictive texture features hold promise for early assessment of AD. Methods: 120 normal adults between the ages of 32 and 90 were recruited for this neuroimaging study from a larger parent study at Mayo Clinic Arizona studying longitudinal cognitive functioning (Caselli et al., 2009). As part of the parent study, the participants were genotyped for APOE genetic polymorphisms and received comprehensive cognitive testing every two years, on average. Neuroimaging was done at Barrow Neurological Institute and a 3D T1-weighted magnetic resonance image was obtained during scanning that allowed for subsequent texture analysis processing. Voxel-based features of the appearance, structure, and arrangement of these regions of interest were extracted utilizing the Mayo Clinic Python Texture Analysis Pipeline (pyTAP). 
Algorithms applied in feature extraction included Grey-Level Co-Occurrence Matrix (GLCM), Gabor Filter Banks (GFB), Local Binary Patterns (LBP), Discrete Orthogonal Stockwell Transform (DOST), and Laplacian-of-Gaussian Histograms (LoGH). Principal component (PC) analysis was used to reduce the dimensionality of the algorithmically selected features to 13 PCs. A stepwise forward regression model was used to determine the effect of APOE status (APOE e4 carriers vs. noncarriers) and the texture feature principal components on age (as a continuous variable). After identification of 5 significant predictors of age in the model, the individual feature coefficients of those principal components were examined to determine which features contributed most significantly to the prediction of an aging brain. Results: 70 texture features were extracted for the two regions of interest in each participant's scan. The texture features were coded as 70 initial components and were rotated to generate 13 principal components (PCs) that accounted for 75% of the variance in the dataset by scree plot analysis. The forward stepwise regression model used in this exploratory study significantly predicted age, accounting for approximately 40% of the variance in the data. The regression model revealed 5 significant regressors (2 right PCs, APOE status, and 2 left PC by APOE interactions). Finally, the specific texture features that contributed to each significant PC were identified. Conclusion: Analysis of image texture features resulted in a statistical model that was able to detect subtle changes in brain integrity associated with age in a group of participants who are cognitively normal, but have an increased risk of developing AD based on the presence of the APOE e4 phenotype.
This is an important finding, given that detecting subtle changes in regions vulnerable to the effects of AD in patients could allow certain texture features to serve as noninvasive, sensitive biomarkers predictive of AD. Even with only a small number of patients, our ability to determine sensitive imaging biomarkers could facilitate great improvement in the speed of detection and the effectiveness of AD interventions.
Contributors: Silva, Annelise Michelle (Author) / Baxter, Leslie (Thesis director) / McBeath, Michael (Committee member) / Presson, Clark (Committee member) / School of Life Sciences (Contributor) / Department of Psychology (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Mammary gland development in humans during puberty involves the enlargement of breast tissue, but this is not true in non-human primates. To identify potential causes of this difference, I examined variation in substitution rates across genes related to mammary development. Genes undergoing purifying selection show slower-than-average substitution rates, while genes undergoing positive selection show faster rates. These may be related to the difference between humans and other primates. The three genes found to be accelerated were FOXF1, IGFBP5, and ATP2B2, but only the last of these was found in humans, and it seems unlikely that it is related to the differences in mammary gland development at puberty between humans and non-human primates.
Contributors: Arroyo, Diana (Author) / Cartwright, Reed (Thesis director) / Wilson Sayres, Melissa (Committee member) / Schwartz, Rachel (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
microRNAs (miRNAs) are short ~22nt non-coding RNAs that regulate gene output at the post-transcriptional level. Via targeting of degenerate elements primarily in 3' untranslated regions (3'UTRs) of mRNAs, miRNAs can target thousands of varying genes and suppress their protein translation. The precise mechanistic function and biological role of miRNAs is not fully understood and yet it is a major contributor to a plethora of diseases, including neurological disorders, muscular disorders, and cancer. Certain model organisms are valuable in understanding the function of miRNA and therefore fully understanding the biological significance of miRNA targeting. Here I report a mechanistic analysis of miRNA targeting in C. elegans, and a bioinformatic approach to aid in further investigation of miRNA targeted sequences. A few of the biologically significant mechanisms discussed in this thesis include alternative polyadenylation, RNA binding proteins, components of the miRNA recognition machinery, miRNA secondary structures, and their polymorphisms. This thesis also discusses a novel bioinformatic approach to studying miRNA biology, including computational miRNA target prediction software, and sequence complementarity. This thesis allows a better understanding of miRNA biology and presents an ideal strategy for approaching future research in miRNA targeting.
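Sequence complementarity between a miRNA and a 3'UTR can be illustrated with a small Python sketch that searches for perfect matches to the miRNA seed. This is a simplification, not the approach of the thesis: it assumes the common convention that the seed spans miRNA nucleotides 2 to 8, requires perfect Watson-Crick pairing, and ignores the degenerate sites the abstract mentions. The function names and the example sequences are illustrative only.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the mRNA sequence (5'->3') that pairs with the miRNA seed.

    Assumption: seed = miRNA nucleotides start..end, 1-based and inclusive.
    """
    rna = mirna.upper().replace("T", "U")
    seed = rna[start - 1:end]
    # The target site is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(utr: str, mirna: str) -> list[int]:
    """Return 0-based positions of perfect seed matches in a 3'UTR."""
    utr = utr.upper().replace("T", "U")
    site = seed_match_site(mirna)
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative 22-nt miRNA sequence
    utr = "AAACUACCUCAAAGGGCUACCUCUU"  # hypothetical 3'UTR fragment
    print(seed_match_site(mirna), find_seed_sites(utr, mirna))
```

Real target prediction tools add further criteria such as site context, conservation, and tolerated mismatches, so a seed scan like this is best read as a first filter.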
Contributors: Weigele, Dustin Keith (Author) / Mangone, Marco (Thesis director) / Katchman, Benjamin (Committee member) / Barrett, The Honors College (Contributor) / Department of Chemistry and Biochemistry (Contributor) / School of Life Sciences (Contributor)
Created: 2014-12
Description
Schizophrenia is a disease that affects 15.2/100,000 US citizens, with about 0.6-1.9% of the total population being afflicted with some range of severity of the disease. A lot of research has been done on the progression of the disease and its differences between males and females; however, the true underlying cause of the disease remains unknown. The literature nevertheless gives many indications that the disorder is primarily genetic in origin. In order to establish a foundation in differential gene expression and isoform expression between males and females, we utilized the Genotype-Tissue Expression Project data set (which contains samples from healthy individuals at their time of death) for the amygdala, anterior cingulate cortex, and frontal cortex. We performed quality control on the data with Trimmomatic and visualized it with FastQC and MultiQC. We then aligned to a sex-specific reference genome with Hisat2. Finally, we performed a differential expression analysis through the limma/voom package with inputs from featureCounts. An isoform-level analysis was run on the anterior cingulate cortex with the IsoformSwitchAnalyzeR package. We were able to identify a few differentially expressed genes in the three tissue sites, which included XIST and other highly conserved, Y-linked genes. As for the isoform-level analysis, we were able to identify 13 genes with significant levels of differential isoform usage and expression, two of which have clinical relevance (DAB1 and PACRG). These findings will allow future studies to make comparisons with gene expression in brain tissue samples from patients who had been diagnosed with schizophrenia during their lives. By identifying any unique genes in these patients, gene therapies can be developed to target and correct any misexpression that may be occurring.
Contributors: Evanovich, Austin Phillip (Author) / Wilson, Melissa (Thesis director) / Buetow, Kenneth (Committee member) / Natri, Heini Maaret (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
Methane (CH4) is very important in the environment as it is a greenhouse gas and important for the degradation of organic matter. During the last 200 years the atmospheric concentration of CH4 has tripled. Methanogens are methane-producing microbes from the Archaea domain that complete the final step in breaking down organic matter to generate methane through a process called methanogenesis. They contribute about 74% of the CH4 present in Earth's atmosphere, producing 1 billion tons of methane annually. The purpose of this work is to generate preliminary metabolic reconstruction models of two methanogens: Methanoregula boonei 6A8 and Methanosphaerula palustris E1-9c. M. boonei and M. palustris are part of the Methanomicrobiales order and perform hydrogenotrophic methanogenesis, which means that they reduce CO2 to CH4 by using H2 as their major electron donor. Metabolic models are frameworks for understanding a cell as a system, and they provide the means to assess the changes in gene regulation in response to various environmental and physiological constraints. The Pathway-Tools software v16 was used to generate these draft models. The models were manually curated using literature searches, the KEGG database, and homology methods with the Methanosarcina acetivorans strain, the closest methanogen strain with a nearly complete metabolic reconstruction. These preliminary models attempt to complete the pathways required for amino acid biosynthesis, methanogenesis, and major cofactors related to methanogenesis. The M. boonei reconstruction currently includes 99 pathways and has 82% of its reactions completed, while the M. palustris reconstruction includes 102 pathways and has 89% of its reactions completed.
Contributors: Mahendra, Divya (Author) / Cadillo-Quiroz, Hinsby (Thesis director) / Wang, Xuan (Committee member) / Stout, Valerie (Committee member) / Barrett, The Honors College (Contributor) / Computing and Informatics Program (Contributor) / School of Life Sciences (Contributor) / Biomedical Informatics Program (Contributor)
Created: 2014-05
Description
Valley Fever, also known as coccidioidomycosis, is a respiratory disease that affects 10,000 people annually, primarily in Arizona and California. Due to a lack of gene annotation, diagnosis and treatment of Valley Fever are severely limited. In turn, gene annotation efforts are also hampered by incomplete genome sequencing. We intend to use proteogenomic analysis to reannotate the Coccidioides posadasii str. Silveira genome from protein-level data. Protein samples extracted from both phases of Silveira were fragmented into peptides, sequenced, and compared against databases of known and predicted protein sequences, as well as a de novo six-frame translation of the genome. 288 unique peptides were located that did not match a known Silveira annotation, and of those 169 were associated with another Coccidioides strain. Additionally, 17 peptides were found at the boundary of, or outside of, the current gene annotation, comprising four distinct clusters. For one of these clusters, we were able to calculate a lower bound and an estimate for the size of the gap between two Silveira contigs using the Coccidioides immitis RS transcript associated with that cluster's peptides; these predictions were consistent with the current annotation's scaffold structure. Three peptides were associated with an actively translated transposon, and a putative active site was located within an intact LTR retrotransposon. We note that gene annotation is necessarily hindered by the quality and level of detail in prior genome sequencing efforts, and recommend that future studies involving reannotation include additional sequencing as well as gene annotation via proteogenomics or other methods.
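A six-frame translation, as mentioned above, translates a DNA sequence in all three reading frames on both strands so that peptides can be matched regardless of annotation. The following Python sketch shows the idea under stated assumptions: it uses the standard genetic code, marks stop codons with `*` and unrecognized codons with `X`, and runs on a toy sequence. The function names are hypothetical and this is not the tool used in the study.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict[str, str]:
    """Return translations keyed '+1'..'+3' (forward) and '-1'..'-3' (reverse)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

In a proteogenomic search, each frame would typically be split at stop codons into candidate open reading frames before peptides are matched against it.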
Contributors: Sherrard, Andrew (Author) / Lake, Douglas (Thesis director) / Grys, Thomas (Committee member) / Mitchell, Natalie (Committee member) / Computing and Informatics Program (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-12
Description
With the rising data output and falling costs of Next Generation Sequencing technologies, research into data compression is crucial to maintaining storage efficiency and costs. High-throughput sequencers such as the HiSeqX Ten can produce up to 1.8 terabases of data per run, and such large storage demands are even more important to consider for institutions that rely on their own servers rather than large data centers (cloud storage) [1]. Compression algorithms aim to reduce the amount of space taken up by large genomic datasets by encoding the most frequently occurring symbols with the shortest bit codewords and by changing the order of the data to make it easier to encode. Depending on the probability distribution of the symbols in the dataset or the structure of the data, choosing the wrong algorithm could result in a compressed file larger than the original or a poorly compressed file that results in a waste of time and space [2]. To test efficiency among compression algorithms for each file type, 37 open-source compression algorithms were used to compress six types of genomic datasets (FASTA, VCF, BCF, GFF, GTF, and SAM) and evaluated on compression speed, decompression speed, compression ratio, and file size using the benchmark test lzbench. Compressors that outperformed the popular bioinformatics compressor Gzip (zlib -6) were evaluated against one another by ratio and speed for each file type and across the geometric means of all file types. Compressors that exhibited fast compression and decompression speeds were also evaluated by transmission time through variable-speed internet pipes in scenarios where the file was compressed only once or compressed multiple times.
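The kind of measurement described above (compression ratio, compression speed, decompression speed, and transfer time) can be sketched with Python's standard library. This is a stand-in, not lzbench and not the 37 compressors tested: it compares only zlib at level 6, bz2, and lzma on a synthetic FASTA-like byte string, and the transfer-time model (compress once, send, decompress once) is a simplifying assumption. All function names are hypothetical.

```python
import bz2
import lzma
import time
import zlib

# Each entry maps a label to a (compress, decompress) pair.
COMPRESSORS = {
    "zlib-6": (lambda data: zlib.compress(data, 6), zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def benchmark(data: bytes) -> dict[str, dict[str, float]]:
    """Measure ratio and wall-clock timings for each compressor."""
    results = {}
    for name, (compress, decompress) in COMPRESSORS.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        assert restored == data  # round trip must be lossless
        results[name] = {
            "ratio": len(data) / len(packed),
            "packed_bytes": float(len(packed)),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


def transfer_seconds(result: dict[str, float], bandwidth_bytes_per_s: float) -> float:
    """Simplified end-to-end time: compress once, send, decompress once."""
    return (
        result["compress_s"]
        + result["packed_bytes"] / bandwidth_bytes_per_s
        + result["decompress_s"]
    )


if __name__ == "__main__":
    sample = b">seq1\n" + b"ACGTTGCAAGCT" * 20000  # synthetic FASTA-like data
    for name, r in benchmark(sample).items():
        print(name, round(r["ratio"], 1), round(transfer_seconds(r, 1e6), 4))
```

Highly repetitive synthetic data exaggerates ratios compared with real genomic files, so the numbers printed here illustrate the method rather than any result from the thesis.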
Contributors: Howell, Abigail (Author) / Cartwright, Reed (Thesis director) / Wilson Sayres, Melissa (Committee member) / Taylor, Jay (Committee member) / Barrett, The Honors College (Contributor)
Created: 2017-05
Description
I, Christopher Negrich, am the sole author of this paper, but the tools described were designed in collaboration with Andrew Hoetker. ConstrictR (constrictor) and ConstrictPy are an R package and Python tool designed together. ConstrictPy implements the functions and methods defined in ConstrictR and applies data handling, data parsing, input/output (I/O), and a user interface to increase usability. ConstrictR implements a variety of common data analysis methods used for statistical and subnetwork analysis. The majority of these methods are inspired by Lionel Guidi's 2016 paper, Plankton networks driving carbon export in the oligotrophic ocean. Additional methods were added to expand functionality, usability, and applicability to different areas of data science. Both ConstrictR and ConstrictPy are currently publicly available and usable; however, they are both ongoing projects. ConstrictR is available at github.com/cnegrich and ConstrictPy is available at github.com/ahoetker. Currently, ConstrictR has implemented functions for descriptive statistics, correlation, covariance, rank, sparsity, and weighted correlation network analysis, with clustering, centrality, profiling, error handling, and data parsing methods to be released soon. ConstrictPy has fully implemented and integrated the features in ConstrictR and has added functions for I/O and conversion between pandas and R data frames, with a full-featured user interface to be released soon. Both ConstrictR and ConstrictPy are designed to work with minimal dependencies and maximum available information on the algorithms implemented. As a result, ConstrictR is only dependent on base R (v3.4.4) functions with no libraries imported. ConstrictPy is dependent upon only pandas, Rpy2, and ConstrictR. This was done to increase longevity and independence of these tools. Additionally, all mathematical information is documented alongside the code, increasing the available information on how these tools function.
Although neither tool is in its final version, this paper documents the code, mathematics, and instructions for use, in addition to plans for future work, for the current versions of ConstrictR (v0.0.1) and ConstrictPy (v0.0.1).
Contributors: Negrich, Christopher Alec (Author) / Can, Huansheng (Thesis director) / Hansford, Dianne (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-05