miRNA Targeting: In depth review of biologically significant mechanisms and a bioinformatic approach to identifying targeting sequences in C. elegans

Description

microRNAs (miRNAs) are short (~22 nt) non-coding RNAs that regulate gene output at the post-transcriptional level. By targeting degenerate elements, primarily in the 3' untranslated regions (3' UTRs) of mRNAs, miRNAs can target thousands of different genes and suppress their translation into protein. The precise mechanistic functions and biological roles of miRNAs are not fully understood, yet miRNAs are major contributors to a plethora of diseases, including neurological disorders, muscular disorders, and cancer. Certain model organisms are valuable for understanding the function of miRNAs and therefore the biological significance of miRNA targeting. Here I report a mechanistic analysis of miRNA targeting in C. elegans and a bioinformatic approach to aid further investigation of miRNA-targeted sequences. The biologically significant mechanisms discussed in this thesis include alternative polyadenylation, RNA-binding proteins, components of the miRNA recognition machinery, miRNA secondary structures, and their polymorphisms. This thesis also discusses a novel bioinformatic approach to studying miRNA biology, including computational miRNA target prediction software and sequence complementarity. This thesis allows a better understanding of miRNA biology and presents a strategy for approaching future research in miRNA targeting.
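The sequence-complementarity idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: the function name, the choice of miRNA positions 2-8 as the seed, and the example sequences are all assumptions made for illustration, and real target-prediction software scores many additional features.

```python
def seed_match_sites(mirna, utr, seed_start=2, seed_end=8):
    """Return 0-based positions in `utr` that perfectly pair with the miRNA seed.

    Hypothetical helper: the seed is assumed to be miRNA positions
    seed_start..seed_end (1-based, inclusive), counted from the 5' end.
    """
    # Normalise both inputs to upper-case RNA so DNA-style input also works.
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")

    seed = mirna[seed_start - 1:seed_end]

    # The target site is the reverse complement of the seed.
    complement = str.maketrans("ACGU", "UGCA")
    site = seed.translate(complement)[::-1]

    return [
        i
        for i in range(len(utr) - len(site) + 1)
        if utr[i:i + len(site)] == site
    ]


# Example input sequences (illustrative only).
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
example_utr = "AAACUACCUCAAA"
sites = seed_match_sites(example_mirna, example_utr)
```

Only perfect Watson-Crick pairing is checked here; G:U wobble pairs and site context are ignored.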

Contributors: Weigele, Dustin Keith (Author) / Mangone, Marco (Thesis director) / Katchman, Benjamin (Committee member) / Barrett, The Honors College (Contributor) / Department of Chemistry and Biochemistry (Contributor) / School of Life Sciences (Contributor)

Created: 2014-12

Preliminary Metabolic Reconstruction of Two Methane Producing Microbes: Methanoregula boonei 6A8 and Methanosphaerula palustris E1-9c

Description

Methane (CH4) is very important in the environment because it is a greenhouse gas and plays a role in the degradation of organic matter. During the last 200 years, the atmospheric concentration of CH4 has tripled. Methanogens are methane-producing microbes from the domain Archaea that complete the final step in breaking down organic matter, generating methane through a process called methanogenesis. They contribute about 74% of the CH4 present in Earth's atmosphere, producing 1 billion tons of methane annually. The purpose of this work is to generate preliminary metabolic reconstruction models of two methanogens: Methanoregula boonei 6A8 and Methanosphaerula palustris E1-9c. M. boonei and M. palustris belong to the order Methanomicrobiales and perform hydrogenotrophic methanogenesis, meaning that they reduce CO2 to CH4 using H2 as their major electron donor. Metabolic models are frameworks for understanding a cell as a system, and they provide the means to assess changes in gene regulation in response to various environmental and physiological constraints. The Pathway-Tools software v16 was used to generate these draft models. The models were manually curated using literature searches, the KEGG database, and homology methods with the Methanosarcina acetivorans strain, the closest methanogen strain with a nearly complete metabolic reconstruction. These preliminary models attempt to complete the pathways required for amino acid biosynthesis, methanogenesis, and major cofactors related to methanogenesis. The M. boonei reconstruction currently includes 99 pathways and has 82% of its reactions completed, while the M. palustris reconstruction includes 102 pathways and has 89% of its reactions completed.

Contributors: Mahendra, Divya (Author) / Cadillo-Quiroz, Hinsby (Thesis director) / Wang, Xuan (Committee member) / Stout, Valerie (Committee member) / Barrett, The Honors College (Contributor) / Computing and Informatics Program (Contributor) / School of Life Sciences (Contributor) / Biomedical Informatics Program (Contributor)

Created: 2014-05

A review of pathway-based visualization and quantification analysis tools using microarray data

Description

Pathway analysis helps researchers gain insight into the biology behind gene expression-based data. By applying these data to known biological pathways, we can learn about mutations or other changes in cellular function, such as those seen in cancer. There are many tools that can be used to analyze pathways; however, it can be difficult to determine which tool is optimal for a given experiment. This thesis aims to comprehensively review four tools, Cytoscape, PaxtoolsR, PathOlogist, and Reactome, and their roles in pathway analysis. This is done by applying a known microarray data set to each tool and testing its different functions. The functions of these programs are then analyzed to determine their roles in learning about biology and assisting new researchers with their experiments. It was found that each tool holds a unique and important role in pathway analysis. Visualization tools serve to explore individual pathways and interpret genomic results. Quantification tools use statistical tests to determine pathway significance. Used together, they allow one to find pathways of interest and then explore specific areas within them.

Contributors: Rehling, Thomas Evan (Author) / Buetow, Kenneth (Thesis director) / Wilson, Melissa (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2020-05

Differential Gene Expression in Type II Diabetes

Description

This research project investigated known and novel differential genetic variants and their associated molecular pathways involved in Type II diabetes mellitus for the purpose of improving diagnosis and treatment methods. The goals of this investigation were to 1) identify the genetic variants and SNPs in Type II diabetes to develop a gene regulatory pathway, and 2) utilize this pathway to determine suitable drug therapeutics for prevention and treatment. Using a Gene Set Enrichment Analysis (GSEA), a set of 1000 gene identifiers from a Mayo Clinic database was analyzed to determine the most significant genetic variants related to insulin signaling pathways involved in Type II Diabetes. The following genes were identified: NRAS, KRAS, PIK3CA, PDE3B, TSC1, AKT3, SOS1, NEU1, PRKAA2, AMPK, and ACC. Through an extensive literature review and cross-analysis with the KEGG and Reactome pathway databases, novel SNPs located on these gene variants were identified and used to determine suitable drug therapeutics for treatment. Overall, understanding how genetic mutations affect target gene function related to Type II Diabetes disease pathology is crucial to the development of effective diagnosis and treatment. This project provides new insight into the molecular basis of Type II Diabetes, serving to help untangle the regulatory complexity of the disease and aid in the advancement of diagnosis and treatment. Keywords: Type II Diabetes mellitus, Gene Set Enrichment Analysis, genetic variants, KEGG Insulin Pathway, gene-regulatory pathway

Contributors: Bucklin, Lindsay (Co-author) / Davis, Vanessa (Co-author) / Holechek, Susan (Thesis director) / Wang, Junwen (Committee member) / Nyarige, Verah (Committee member) / School of Human Evolution & Social Change (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Evaluating variant calling best practices

Description

Analyzing human DNA sequence data allows researchers to identify variants associated with disease, reconstruct the demographic histories of human populations, and further understand the structure and function of the genome. Identifying variants in whole genome sequences is a crucial bioinformatics step in sequence data processing and can be performed using multiple approaches. To investigate the consistency between different bioinformatics methods, we compared the accuracy and sensitivity of two genotyping strategies, joint variant calling and single-sample variant calling. Autosomal and sex chromosome variant call sets were produced by joint and single-sample variant calling for 10 female individuals. The accuracy of variant calls was assessed using SNP array genotype data collected from each individual. To compare the ability of joint and single-sample calling to capture low-frequency variants, folded site frequency spectra were constructed from variant call sets. To investigate the potential for these different variant calling methods to impact downstream analyses, we estimated nucleotide diversity for call sets produced using each approach. We found that while both methods were equally accurate when validated by SNP array sites, single-sample calling identified a greater number of singletons. However, estimates of nucleotide diversity were robust to these differences in the site frequency spectrum between call sets. Our results suggest that despite single-sample calling’s greater sensitivity for low-frequency variants, the differences between approaches have a minimal effect on downstream analyses. While joint calling may be a more efficient approach for genotyping many samples, in situations that preclude large sample sizes, our study suggests that single-sample calling is a suitable alternative.
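The two summary statistics named in this abstract, the folded site frequency spectrum and nucleotide diversity, can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline: the function names and the input layout (one list per site holding each diploid individual's alternate-allele count of 0, 1, or 2, with no missing genotypes) are assumptions.

```python
from collections import Counter


def folded_sfs(genotypes):
    """Count sites by minor-allele count.

    `genotypes` is a list of sites; each site lists the alternate-allele
    count (0, 1, or 2) of every diploid individual. Entry k-1 of the result
    is the number of sites whose minor allele appears k times.
    """
    n_chrom = 2 * len(genotypes[0])
    counts = Counter()
    for site in genotypes:
        alt = sum(site)
        minor = min(alt, n_chrom - alt)  # folding: ignore which allele is ancestral
        if minor > 0:                    # skip monomorphic sites
            counts[minor] += 1
    return [counts.get(k, 0) for k in range(1, n_chrom // 2 + 1)]


def nucleotide_diversity(genotypes, sequence_length):
    """Average pairwise differences per site (pi) over `sequence_length` bases."""
    n = 2 * len(genotypes[0])
    total = 0.0
    for site in genotypes:
        alt = sum(site)
        # alt * (n - alt) differing pairs out of n * (n - 1) / 2 possible pairs
        total += 2 * alt * (n - alt) / (n * (n - 1))
    return total / sequence_length


# Two individuals (four chromosomes), four variant sites in a 10 bp window.
example = [[1, 0], [2, 2], [1, 1], [0, 2]]
example_sfs = folded_sfs(example)
example_pi = nucleotide_diversity(example, sequence_length=10)
```

The first entry of the folded spectrum is the singleton count, the class in which the abstract reports the two calling strategies differ.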

Contributors: Howell, Emma (Co-author) / Wilson, Melissa (Thesis director) / Stone, Anne (Committee member) / Phung, Tanya (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2020-05

Conservation of m6A in evolving long-term E. coli populations

Description

Many factors are at play within the genome of an organism, contributing to much of the diversity and variation across the tree of life. While the genome is generally encoded by four nucleotides, A, C, T, and G, this code can be expanded. One particular mechanism that we examine in this thesis is modification of bases—more specifically, methylation of Adenine (m6A) within the GATC motif of Escherichia coli. These methylated adenines are especially important in a process called methyl-directed mismatch repair (MMR), a pathway responsible for repairing errors in the DNA sequence produced by replication. In this pathway, methylated adenines identify the parent strand and direct the repair proteins to correct the erroneous base in the daughter strand. While the primary role of methylated adenines at GATC sites is to direct the MMR pathway, this methylation has also been found to affect other processes, such as gene expression, the activity of transposable elements, and the timing of DNA replication. However, in the absence of MMR, the ability of these other processes to maintain adenine methylation and its targets is unknown.
To determine whether disruption of the MMR pathway results in reduced conservation of methylated adenines, as well as an increased tolerance for mutations that result in the loss or gain of GATC sites, we used whole-genome sequencing to survey individual clones isolated from experimentally evolving wild-type and MMR-deficient (mutL-; conferring a 150x increase in mutation rate) populations of E. coli. Initial analysis revealed a lack of mutations affecting methylation sites (GATC tetranucleotides) in wild-type clones. However, the inherently low mutation rates conferred by the wild-type background render this result inconclusive, due to a lack of statistical power, and reveal a need for a more direct measure of changes in methylation status. Thus, as a first step toward comparative methylomics, we benchmarked four different methylation-calling pipelines on three biological replicates of the wild-type progenitor strain for our evolved populations.
While it is understood that these methylated sites play a role in the MMR pathway, the full extent of their effect on the genome is not understood. Thus, the goal of this thesis was to better understand the forces that maintain the genome, specifically concerning m6A within the GATC motif.

Contributors: Boyer, Gwyneth (Author) / Lynch, Michael (Thesis director) / Behringer, Megan (Committee member) / Geiler-Samerotte, Kerry (Committee member) / School of Life Sciences (Contributor) / Department of Psychology (Contributor) / Barrett, The Honors College (Contributor)

Created: 2020-05

Beginning to investigate Lactase Persistence in Turkana

Description

Lactase persistence is the ability of adults to digest lactose in milk (Segurel & Bon, 2017). Mammals are generally distinguished by their mammary glands, which give females the ability to produce milk and feed their newborn children. The newborn therefore requires the ability to break down the lactose in the milk to ensure its proper digestion (Segurel & Bon, 2017). Generally, humans lose the expression of lactase after weaning, which prevents them from being able to break down lactose from dairy (Flatz, 1987).
My research is focused on the people of Turkana, a human pastoral population inhabiting Northwest Kenya. The people of Turkana are Nilotic people native to the Turkana district. There are currently no conclusive studies on evidence for genetic lactase persistence in the Turkana. Therefore, my research addresses the evolution of lactase persistence in the people of Turkana. The goal of this project is to investigate the evolutionary history of two genes with known involvement in lactase persistence, LCT and MCM6, in the Turkana. Variants in these genes have previously been identified as conferring the ability to digest lactose after weaning age. Furthermore, an additional study found that a population closely related to the Turkana, the Maasai, showed stronger signals of recent selection for lactase persistence in these genes than Europeans. My goal is to characterize known variants associated with lactase persistence by calculating their allele frequencies in the Turkana and to conduct selection scans to determine whether LCT/MCM6 show signatures of positive selection. To do this, we conducted a pilot study consisting of 10 female Turkana individuals and 10 females from each of four populations from the 1000 Genomes Project, namely: the Yoruba in Ibadan, Nigeria (YRI); the Luhya in Webuye, Kenya (LWK); Utah Residents with Northern and Western European Ancestry (CEU); and the Southern Han Chinese (CHS). The allele frequency calculation suggested that the CEU population had a higher frequency of lactase persistence-associated alleles than all the other populations analyzed here, including the Turkana population. Our Tajima’s D calculations and analysis suggested that both the Turkana population and the four haplotype map populations show signatures of positive selection in the same region.
The iHS selection scans we conducted to detect signatures of positive selection in all five populations showed that the CHS, LWK, and YRI populations had stronger signatures of positive selection than the Turkana population. The LWK and YRI populations showed the strongest signatures of positive selection in this region. This project serves as a first step in the investigation of lactase persistence in the Turkana population and its evolution over time.

Contributors: Jobe, Ndey Bassin (Author) / Wilson Sayres, Melissa (Thesis director) / Paaijmans, Krijn (Committee member) / Taravella, Angela (Committee member) / School of Earth and Space Exploration (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Patterns of Sex-Biased Gene Expression in the Human Brain

Description

Schizophrenia is a disease that affects 15.2/100,000 US citizens, with about 0.6-1.9% of the total population being afflicted with some range of severity of the disease. Much research has been done on the progression of the disease and its differences between males and females; however, the true underlying cause of the disease remains unknown. The literature, however, gives considerable indication that genetics is the primary origin of the disorder. In order to establish a foundation in differential gene expression and isoform expression between males and females, we utilized the Genotype-Tissue Expression Project data set (which contains samples from healthy individuals at their time of death) for the amygdala, anterior cingulate cortex, and frontal cortex. We performed quality control on the data with Trimmomatic and visualized it with FastQC and MultiQC. We then aligned to a sex-specific reference genome with Hisat2. Finally, we performed a differential expression analysis through the limma/voom package with inputs from featureCounts. An isoform-level analysis was run on the anterior cingulate cortex with the IsoformSwitchAnalyzeR package. We were able to identify a few differentially expressed genes in the three tissue sites, which included XIST and other highly conserved, Y-linked genes. In the isoform-level analysis, we were able to identify 13 genes with significant levels of differential isoform usage and expression, two of which have clinical relevance (DAB1 and PACRG). These findings will allow future studies to make comparisons with gene expression in brain tissue samples from patients who had been diagnosed with schizophrenia during their lives. By identifying any unique genes in these patients, gene therapies can be developed to target and correct any misexpression that may be occurring.

Contributors: Evanovich, Austin Phillip (Author) / Wilson, Melissa (Thesis director) / Buetow, Kenneth (Committee member) / Natri, Heini Maaret (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Identifying Novel Nanobodies for Traumatic Brain Injury Therapeutics

Description

Traumatic brain injury (TBI) is a serious health problem around the world with few available treatments. TBI pathology can be divided into two phases: the primary insult and the secondary injury. The primary insult results from the bump or blow to the head that causes the initial injury. Secondary injury lasts from hours to months after the initial injury and worsens the primary insult, creating a greater area of tissue damage and cell death. Many current treatments focus on lessening the severity of secondary injury. Secondary injury results from the cyclical nature of tissue damage. Inflammatory pathways cause damage to tissue, which in turn reinforces inflammation. Since many inflammatory pathways are interconnected, targeting individual products within these pathways is impractical. A target at the beginning of the pathway, such as a receptor, must be chosen to break the cycle. This project aims to identify novel nanobodies that could temporarily inactivate the CD36 receptor, which is a receptor found on many immune and endothelial cells. CD36 initiates and perpetuates the immune system's inflammatory responses. By inactivating this receptor temporarily, inflammation and immune cell entry could be lessened, and therefore secondary injury could be attenuated. This project utilized phage display as a method of nanobody selection. The specific phage library utilized in this experiment consists of human heavy chain (V_H) segments, also known as domain antibodies (dAbs), displayed on M13 filamentous bacteriophage. Phage display mimics the process of immune selection. The target is bound to a well as a means of displaying it to the phage. The phage library is then incubated with the target to allow antibodies to bind. After, the well is washed thoroughly to detach any phage that are not strongly bound. The remaining phage are then amplified in bacteria and run again through the same assay to select for mutations that resulted in higher affinity binding. 
This process, called biopanning, was performed three times for this project. After biopanning, the library was sequenced using next-generation sequencing (NGS). This platform enables the entire library to be sequenced, as opposed to traditional Sanger sequencing, which can only sequence single selected clones at a time, thereby limiting population sampling. This type of sequencing allows trends in the complementarity determining regions (CDRs) of the domain antibody library to be analyzed using bioinformatics programs such as RStudio, FastAptamer, and Swiss Model. Ultimately, two nanobody candidates were identified for the CD36 receptor.

Contributors: Lundgreen, Kendall (Author) / Stabenfeldt, Sarah (Thesis director) / Ugarova, Tatiana (Committee member) / School of Life Sciences (Contributor) / School of International Letters and Cultures (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Improving the Valley Fever Gene Annotation Through Proteogenomic Analysis

Description

Valley Fever, also known as coccidioidomycosis, is a respiratory disease that affects 10,000 people annually, primarily in Arizona and California. Due to a lack of gene annotation, diagnosis and treatment of Valley Fever are severely limited. In turn, gene annotation efforts are hampered by incomplete genome sequencing. We intend to use proteogenomic analysis to reannotate the Coccidioides posadasii str. Silveira genome from protein-level data. Protein samples extracted from both phases of Silveira were fragmented into peptides, sequenced, and compared against databases of known and predicted protein sequences, as well as a de novo six-frame translation of the genome. We located 288 unique peptides that did not match a known Silveira annotation, and of those, 169 were associated with another Coccidioides strain. Additionally, 17 peptides were found at the boundary of, or outside of, the current gene annotation, comprising four distinct clusters. For one of these clusters, we were able to calculate a lower bound and an estimate for the size of the gap between two Silveira contigs using the Coccidioides immitis RS transcript associated with that cluster's peptides; these predictions were consistent with the current annotation's scaffold structure. Three peptides were associated with an actively translated transposon, and a putative active site was located within an intact LTR retrotransposon. We note that gene annotation is necessarily hindered by the quality and level of detail in prior genome sequencing efforts, and recommend that future studies involving reannotation include additional sequencing as well as gene annotation via proteogenomics or other methods.
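The six-frame translation step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the study's software: the function names are hypothetical, the standard genetic code is assumed, and peptide matching is reduced to an exact substring search, whereas real proteogenomic searches rely on mass-spectrometry search engines.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumed here).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


def find_peptide(peptide, seq):
    """List the frames whose translation contains `peptide` exactly."""
    return [
        frame
        for frame, protein in six_frame_translation(seq).items()
        if peptide in protein
    ]


frames = six_frame_translation("ATGGCCTAA")
```

Searching all six frames is what lets a peptide be placed on the genome even where no gene model has been annotated.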

Contributors: Sherrard, Andrew (Author) / Lake, Douglas (Thesis director) / Grys, Thomas (Committee member) / Mitchell, Natalie (Committee member) / Computing and Informatics Program (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12