Matching Items (119)

Identifying Novel Nanobodies for Traumatic Brain Injury Therapeutics

Description

Traumatic brain injury (TBI) is a serious health problem around the world with few available treatments. TBI pathology can be divided into two phases: the primary insult and the secondary injury. The primary insult results from the bump or blow to the head that causes the initial injury. Secondary injury lasts from hours to months after the initial injury and worsens the primary insult, creating a greater area of tissue damage and cell death. Many current treatments focus on lessening the severity of secondary injury. Secondary injury results from the cyclical nature of tissue damage: inflammatory pathways damage tissue, which in turn reinforces inflammation. Since many inflammatory pathways are interconnected, targeting individual products within these pathways is impractical. A target at the beginning of the pathway, such as a receptor, must be chosen to break the cycle. This project aims to identify novel nanobodies that could temporarily inactivate CD36, a receptor found on many immune and endothelial cells. CD36 initiates and perpetuates the immune system's inflammatory responses. By inactivating this receptor temporarily, inflammation and immune cell entry could be lessened, and therefore secondary injury could be attenuated. This project used phage display as the method of nanobody selection. The phage library used in this experiment consists of human heavy chain (V_H) segments, also known as domain antibodies (dAbs), displayed on M13 filamentous bacteriophage. Phage display mimics the process of immune selection. The target is bound to a well to display it to the phage. The phage library is then incubated with the target to allow antibodies to bind. Afterward, the well is washed thoroughly to detach any phage that are not strongly bound. The remaining phage are then amplified in bacteria and run again through the same assay to select for mutations that resulted in higher-affinity binding. This process, called biopanning, was performed three times for this project. After biopanning, the library was sequenced using next-generation sequencing (NGS). This platform enables the entire library to be sequenced, as opposed to traditional Sanger sequencing, which can sequence only a few selected clones at a time and therefore limits population sampling. Sequencing the whole library allows trends in the complementarity-determining regions (CDRs) of the domain antibody library to be analyzed using bioinformatics programs such as RStudio, FASTAptamer, and SWISS-MODEL. Ultimately, two nanobody candidates were identified for the CD36 receptor.
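
The idea of following sequence trends across biopanning rounds can be illustrated with a minimal sketch. It assumes each read has already been reduced to a single CDR string; the function names and the toy sequences are hypothetical and are not taken from the thesis or from any of the tools it names.

```python
from collections import Counter


def clone_frequencies(reads):
    """Return each distinct sequence's share of all reads in one round."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def enrichment(before, after):
    """Fold change in frequency for sequences observed in both rounds."""
    f_before = clone_frequencies(before)
    f_after = clone_frequencies(after)
    return {seq: f_after[seq] / f_before[seq] for seq in f_after if seq in f_before}


# Toy CDR amino-acid strings, made up for illustration only.
round1 = ["ARDYW", "ARDYW", "GGSTY", "KLMNP"]
round3 = ["ARDYW", "ARDYW", "ARDYW", "GGSTY"]
fold = enrichment(round1, round3)
```

Sequences whose fold change stays well above 1 over successive rounds are the kind of candidates a selection experiment would look at more closely; sequences that drop out of the later round are simply absent from the result.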

Date Created
2018-05

ConstrictR and ConstrictPy: R Package and Python Tool for Microbiome Analysis

Description

I, Christopher Negrich, am the sole author of this paper, but the tools described were designed in collaboration with Andrew Hoetker. ConstrictR (constrictor) and ConstrictPy are an R package and a Python tool designed together. ConstrictPy implements the functions and methods defined in ConstrictR and adds data handling, data parsing, input/output (I/O), and a user interface to increase usability. ConstrictR implements a variety of common data analysis methods used for statistical and subnetwork analysis. The majority of these methods are inspired by Lionel Guidi's 2016 paper, Plankton networks driving carbon export in the oligotrophic ocean. Additional methods were added to expand functionality, usability, and applicability to different areas of data science. Both ConstrictR and ConstrictPy are publicly available and usable; however, both are ongoing projects. ConstrictR is available at github.com/cnegrich and ConstrictPy is available at github.com/ahoetker. Currently, ConstrictR implements functions for descriptive statistics, correlation, covariance, rank, sparsity, and weighted correlation network analysis, with clustering, centrality, profiling, error handling, and data parsing methods to be released soon. ConstrictPy has fully implemented and integrated the features in ConstrictR and provides functions for I/O and for conversion between pandas and R data frames, with a full-featured user interface to be released soon. Both ConstrictR and ConstrictPy are designed to work with minimal dependencies and to document the implemented algorithms as fully as possible. As a result, ConstrictR depends only on base R (v3.4.4) functions, with no libraries imported. ConstrictPy depends only on pandas, rpy2, and ConstrictR. This was done to increase the longevity and independence of these tools. Additionally, all mathematical information is documented alongside the code, increasing the available information on how these tools function.
Although neither tool is in its final version, this paper documents the code, mathematics, and instructions for use, in addition to plans for future work, for the current versions of ConstrictR (v0.0.1) and ConstrictPy (v0.0.1).
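
Two of the statistics listed above, correlation and sparsity, can be sketched in a few lines of dependency-free Python. This is only an illustration of the quantities involved: the function names are hypothetical and do not reflect the actual ConstrictR or ConstrictPy interfaces, and defining sparsity as the fraction of zero entries is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def sparsity(table):
    """Fraction of zero entries in a table given as a list of rows (assumed definition)."""
    cells = [value for row in table for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)


# Toy abundance table: rows are samples, columns are taxa (made-up numbers).
abundance = [[0, 4, 1], [0, 8, 2], [3, 12, 3]]
taxon_b = [row[1] for row in abundance]
taxon_c = [row[2] for row in abundance]
r = pearson(taxon_b, taxon_c)
zero_share = sparsity(abundance)
```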

Date Created
2018-05

Improving the Valley Fever Gene Annotation Through Proteogenomic Analysis

Description

Valley Fever, also known as coccidioidomycosis, is a respiratory disease that affects 10,000 people annually, primarily in Arizona and California. Due to a lack of gene annotation, diagnosis and treatment of Valley Fever are severely limited. In turn, gene annotation efforts are hampered by incomplete genome sequencing. We intend to use proteogenomic analysis to reannotate the Coccidioides posadasii str. Silveira genome from protein-level data. Protein samples extracted from both phases of Silveira were fragmented into peptides, sequenced, and compared against databases of known and predicted protein sequences, as well as a de novo six-frame translation of the genome. We located 288 unique peptides that did not match a known Silveira annotation; of those, 169 were associated with another Coccidioides strain. Additionally, 17 peptides were found at the boundary of, or outside of, the current gene annotation, comprising four distinct clusters. For one of these clusters, we were able to calculate a lower bound and an estimate for the size of the gap between two Silveira contigs using the Coccidioides immitis RS transcript associated with that cluster's peptides; these predictions were consistent with the current annotation's scaffold structure. Three peptides were associated with an actively translated transposon, and a putative active site was located within an intact LTR retrotransposon. We note that gene annotation is necessarily hindered by the quality and level of detail in prior genome sequencing efforts, and recommend that future studies involving reannotation include additional sequencing as well as gene annotation via proteogenomics or other methods.
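
A six-frame translation, as mentioned above, reads a DNA sequence in three offsets on each strand so that peptides can be matched regardless of where a gene actually lies. The sketch below is a generic illustration using the standard genetic code; the function names and the toy sequence are assumptions and do not describe the software used in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return translations for frames +1..+3 and -1..-3, keyed by frame label."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")  # toy sequence, not real data
```

A peptide can then be checked against every frame with a simple substring test, for example `any("MA" in protein for protein in frames.values())`.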

Date Created
2016-12

An Analysis of the Benchmark Test lzbench for Open-Source Compressors

Description

With the rising data output and falling costs of next-generation sequencing technologies, research into data compression is crucial to maintaining storage efficiency and controlling costs. High-throughput sequencers such as the HiSeq X Ten can produce up to 1.8 terabases of data per run, and such large storage demands are even more important to consider for institutions that rely on their own servers rather than large data centers (cloud storage) [1]. Compression algorithms aim to reduce the amount of space taken up by large genomic datasets by encoding the most frequently occurring symbols with the shortest bit codewords and by changing the order of the data to make it easier to encode. Depending on the probability distribution of the symbols in the dataset or the structure of the data, choosing the wrong algorithm could result in a compressed file larger than the original, or in a poorly compressed file that wastes time and space [2]. To test efficiency among compression algorithms for each file type, 37 open-source compression algorithms were used to compress six types of genomic datasets (FASTA, VCF, BCF, GFF, GTF, and SAM) and were evaluated on compression speed, decompression speed, compression ratio, and file size using the benchmark test lzbench. Compressors that outperformed the popular bioinformatics compressor Gzip (zlib -6) were evaluated against one another by ratio and speed for each file type and across the geometric means of all file types. Compressors that exhibited fast compression and decompression speeds were also evaluated by transmission time through variable-speed internet pipes in scenarios where the file was compressed only once or compressed multiple times.
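
The metrics named above can be sketched with Python's standard-library codecs. This is not lzbench and does not reproduce the thesis's measurements; the ratio is assumed here to mean original size divided by compressed size, and the sample data, codec selection, and function names are illustrative.

```python
import bz2
import lzma
import zlib
from math import prod


def compression_ratio(data, compress):
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(compress(data))


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios across file types."""
    values = list(values)
    return prod(values) ** (1 / len(values))


def transfer_seconds(n_bytes, megabits_per_second):
    """Time to send n_bytes over a link of the given nominal speed."""
    return n_bytes * 8 / (megabits_per_second * 1_000_000)


# Toy FASTA-like payload; real genomic files are far less repetitive.
sample = b">seq1\n" + b"ACGTACGTTTGACCA" * 2000
codecs = {
    "zlib-6": lambda d: zlib.compress(d, 6),  # the Gzip baseline setting
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}
ratios = {name: compression_ratio(sample, fn) for name, fn in codecs.items()}
overall = geometric_mean(ratios.values())
```

Dividing a compressed size by link speed with `transfer_seconds` gives the transmission-time side of the trade-off: a slower compressor can still win overall if the file is compressed once and sent many times.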

Date Created
2017-05

Neoantigen Prediction Pipeline

Description

Cells become cancerous due to changes in their genetic makeup. In cancers, an altered amino acid due to a tumor mutation can result in proteins that are identified as "foreign" by the immune system. An MHC molecule will bind to these "foreign" peptide fragments, also called neoantigens. There are two classes of MHC molecules. While the MHC I complex is found in all cells with a nucleus, MHC II complexes are mostly found in antigen-presenting cells (APCs), such as macrophages, B cells, and dendritic cells. The MHC molecule then presents the neoantigen on the cell's surface. If an immune cell, such as a T cell, is able to bind to the neoantigen, it can then destroy the tumor cell. However, there are molecules that act as checkpoints on certain immune cells and that have to be activated or inactivated to start an immune response. This ensures that healthy cells are not killed. Sometimes, however, cancer cells can find ways to use these checkpoints to avoid being attacked. An example of immunotherapy that has had clinical success is checkpoint blockade inhibition, which means blocking the activity of immune checkpoint proteins in order to release the "brakes" on the immune system and increase its ability to destroy cancer cells. Studies have found a correlation between mutational load and response to immunotherapy. The goal of this project is to create a pipeline that identifies tumor neoantigens. This involved researching various software tools and implementing them to work together. The resulting neoantigen prediction pipeline works with TGen's genomics pipeline to help understand a patient's immune response. The neoantigen prediction pipeline first creates two protein fastas from the high-quality non-synonymous mutations, frameshifts, codon insertions, and codon deletions from vcfmerger. One of the protein fastas includes the mutations, while the other does not and represents the wild-type protein.
The pipeline then predicts both classes of HLA genotypes of the MHC molecules using DNA or RNA expression in the form of fastqs. The protein fastas and each HLA are fed into IEDB to obtain peptide-MHC binding predictions. Wild-type peptides and neoantigens with low binding affinities are then removed. RNA expression information is then added into the final text file from dseq and sailfish files from TGen's genomics pipeline.
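
The step of producing matched mutant and wild-type peptides can be sketched for the simplest case, a single amino-acid substitution. This is an illustrative assumption about how such peptides might be enumerated, not the pipeline's code; the protein string, the substitution, the peptide length of 9, and the function names are all hypothetical.

```python
def mutate(protein, position, new_residue):
    """Return the protein with the 1-based position replaced by new_residue."""
    i = position - 1
    return protein[:i] + new_residue + protein[i + 1:]


def peptides_covering(protein, position, length):
    """All peptides of the given length that include the 1-based position."""
    i = position - 1
    first = max(0, i - length + 1)
    last = min(i, len(protein) - length)
    return [protein[start:start + length] for start in range(first, last + 1)]


wildtype = "MKTAYIAKQRQISFVK"       # made-up protein sequence
mutant = mutate(wildtype, 8, "E")   # hypothetical K8E substitution
wildtype_peptides = peptides_covering(wildtype, 8, 9)
mutant_peptides = peptides_covering(mutant, 8, 9)
```

Each mutant peptide pairs with the wild-type peptide at the same index, which is what allows binding predictions for the two to be compared before filtering.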

Date Created
2017-05

Beginning to investigate Lactase Persistence in Turkana

Description

Lactase persistence is the ability of adults to digest lactose in milk (Segurel & Bon, 2017). Mammals are generally distinguished by their mammary glands, which give females the ability to produce milk and feed their newborn offspring. The newborn therefore requires the ability to break down the lactose in the milk to ensure its proper digestion (Segurel & Bon, 2017). Generally, humans lose the expression of lactase after weaning, which prevents them from being able to break down lactose from dairy (Flatz, 1987).
My research is focused on the Turkana, a pastoral population inhabiting northwest Kenya. The Turkana are a Nilotic people native to the Turkana district. There are currently no conclusive studies on evidence for genetic lactase persistence in the Turkana. Therefore, my research addresses the evolution of lactase persistence in this population. The goal of this project is to investigate the evolutionary history of two genes with known involvement in lactase persistence, LCT and MCM6, in the Turkana. Variants in these genes have previously been shown to confer the ability to digest lactose after weaning age. Furthermore, an additional study found that a population closely related to the Turkana, the Maasai, showed stronger signals of recent selection for lactase persistence in these genes than Europeans. My goal is to characterize known variants associated with lactase persistence by calculating their allele frequencies in the Turkana and to conduct selection scans to determine whether LCT/MCM6 show signatures of positive selection. To do this, we conducted a pilot study consisting of 10 female Turkana individuals and 10 females from each of four populations from the 1000 Genomes Project: the Yoruba in Ibadan, Nigeria (YRI); the Luhya in Webuye, Kenya (LWK); Utah Residents with Northern and Western European Ancestry (CEU); and the Southern Han Chinese (CHS). The allele frequency calculation suggested that the CEU population had a higher frequency of the lactase-persistence-associated allele than all the other populations analyzed here, including the Turkana. Our Tajima's D calculations and analysis suggested that both the Turkana population and the four 1000 Genomes Project populations show signatures of positive selection in the same region.
The iHS selection scans we conducted to detect signatures of positive selection in all five populations showed that the CHS, LWK, and YRI populations had stronger signatures of positive selection than the Turkana population. The LWK and YRI populations showed the strongest signatures of positive selection in this region. This project serves as a first step in the investigation of lactase persistence in the Turkana population and its evolution over time.
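
The two simpler statistics mentioned above, allele frequency and Tajima's D, can be sketched as follows. The code is a generic textbook-style illustration under stated assumptions (biallelic sites, phased haplotypes given as equal-length strings, at least four haplotypes); the function names and toy data are hypothetical and this is not the analysis code used in the study.

```python
from itertools import combinations
from math import sqrt


def allele_frequency(genotypes, allele="1"):
    """Share of allele copies equal to `allele` in VCF-style genotype strings."""
    alleles = [a for g in genotypes for a in g.replace("|", "/").split("/")]
    return alleles.count(allele) / len(alleles)


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings, one character per site."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("this sketch assumes at least four haplotypes")
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        return 0.0  # D is undefined without variation; 0.0 is a convention chosen here
    pairs = list(combinations(haplotypes, 2))
    # Mean number of pairwise differences between haplotypes.
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


freq = allele_frequency(["0/1", "1|1", "0/0"])           # toy genotypes
d_common = tajimas_d(["00", "00", "11", "11"])           # intermediate-frequency variants
d_rare = tajimas_d(["10", "01", "00", "00"])             # variants seen once each
```

An excess of rare variants pulls D below zero, which is one pattern consistent with a recent selective sweep, although demographic history can produce the same signal.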

Date Created
2019-05

Utilizing MRI Texture Analysis and APOE Genotype to Predict the Aging Brain as a Potential Method for Early Assessment of Alzheimer's Disease

Description

Background: Noninvasive MRI methods that can accurately detect subtle brain changes are highly desirable when studying disease-modifying interventions. Texture analysis is a novel imaging technique that extracts a large number of image features with high specificity and predictive power. In this investigation, we use texture analysis to assess and classify age-related changes in the right and left hippocampal regions, the areas known to show some of the earliest changes in Alzheimer's disease (AD). The e4 allele of apolipoprotein E (APOE) confers an increased risk for AD, so studying differences in APOE e4 carriers may help to ascertain subtle brain changes before there has been an obvious change in behavior. We examined texture analysis measures that predict age-related changes, which reflect atrophy, in a group of cognitively normal individuals. We hypothesized that the APOE e4 carriers would exhibit significant age-related differences in texture features compared to non-carriers, so that the predictive texture features hold promise for early assessment of AD.
Methods: 120 normal adults between the ages of 32 and 90 were recruited for this neuroimaging study from a larger parent study at Mayo Clinic Arizona studying longitudinal cognitive functioning (Caselli et al., 2009). As part of the parent study, the participants were genotyped for APOE genetic polymorphisms and received comprehensive cognitive testing every two years, on average. Neuroimaging was done at Barrow Neurological Institute, where a 3D T1-weighted magnetic resonance image was obtained for subsequent texture analysis processing. Voxel-based features of the appearance, structure, and arrangement of these regions of interest were extracted using the Mayo Clinic Python Texture Analysis Pipeline (pyTAP). Algorithms applied in feature extraction included Grey-Level Co-Occurrence Matrix (GLCM), Gabor Filter Banks (GFB), Local Binary Patterns (LBP), Discrete Orthogonal Stockwell Transform (DOST), and Laplacian-of-Gaussian Histograms (LoGH). Principal component (PC) analysis was used to reduce the dimensionality of the algorithmically selected features to 13 PCs. A stepwise forward regression model was used to determine the effect of APOE status (APOE e4 carriers vs. noncarriers) and the texture feature principal components on age (as a continuous variable). After identification of 5 significant predictors of age in the model, the individual feature coefficients of those principal components were examined to determine which features contributed most significantly to the prediction of an aging brain.
Results: 70 texture features were extracted for the two regions of interest in each participant's scan. The texture features were coded as 70 initial components and were rotated to generate 13 principal components that accounted for 75% of the variance in the dataset by scree plot analysis. The forward stepwise regression model used in this exploratory study significantly predicted age, accounting for approximately 40% of the variance in the data. The regression model revealed 5 significant regressors (2 right PCs, APOE status, and 2 left PC by APOE interactions). Finally, the specific texture features that contributed to each significant PC were identified.
Conclusion: Analysis of image texture features resulted in a statistical model that was able to detect subtle changes in brain integrity associated with age in a group of participants who are cognitively normal but have an increased risk of developing AD based on the presence of the APOE e4 phenotype. This is an important finding, given that detecting subtle changes in regions vulnerable to the effects of AD could allow certain texture features to serve as noninvasive, sensitive biomarkers predictive of AD. Even with only a small number of patients, the ability to determine sensitive imaging biomarkers could greatly improve the speed of detection and the effectiveness of AD interventions.
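
Of the feature-extraction algorithms listed in the Methods, the grey-level co-occurrence matrix is the easiest to show in miniature. The sketch below works on a tiny 2D image with a single pixel offset and derives one texture feature (contrast). It is a generic illustration and does not describe pyTAP's implementation; the function names and the toy image are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix of grey levels at pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Toy 2 x 3 image with two grey levels (0 and 1).
image = [[0, 0, 1],
         [1, 1, 0]]
p = glcm(image, levels=2)
texture_contrast = contrast(p)
```

A perfectly uniform region gives a contrast of zero, while frequent transitions between distant grey levels raise it, which is the sense in which such features summarise local tissue appearance.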

Date Created
2016-05

Identifying Variation Within Substitution Rates in Mammary Gland Development Genes within Primate Genomes

Description

Mammary gland development in humans during puberty involves the enlargement of breast tissue, but this is not true in non-human primates. To identify potential causes of this difference, I examined variation in substitution rates across genes related to mammary development. Genes undergoing purifying selection show slower-than-average substitution rates, while genes undergoing positive selection show faster rates. Genes with accelerated rates may be related to the difference between humans and other primates. Three genes were found to be accelerated: FOXF1, IGFBP5, and ATP2B2. Only the last of these was found in humans, and it seems unlikely to be related to the differences in mammary gland development at puberty between humans and non-human primates.

Date Created
2016-05

miRNA Targeting: In depth review of biologically significant mechanisms and a bioinformatic approach to identifying targeting sequences in C. elegans

Description

microRNAs (miRNAs) are short (~22 nt) non-coding RNAs that regulate gene output at the post-transcriptional level. By targeting degenerate elements, primarily in the 3' untranslated regions (3' UTRs) of mRNAs, miRNAs can target thousands of different genes and suppress their protein translation. The precise mechanistic function and biological role of miRNAs are not fully understood, yet miRNAs are major contributors to a plethora of diseases, including neurological disorders, muscular disorders, and cancer. Certain model organisms are valuable in understanding the function of miRNAs and therefore the biological significance of miRNA targeting. Here I report a mechanistic analysis of miRNA targeting in C. elegans and a bioinformatic approach to aid in further investigation of miRNA-targeted sequences. The biologically significant mechanisms discussed in this thesis include alternative polyadenylation, RNA-binding proteins, components of the miRNA recognition machinery, miRNA secondary structures, and their polymorphisms. This thesis also discusses a novel bioinformatic approach to studying miRNA biology, including computational miRNA target prediction software and sequence complementarity. This thesis allows a better understanding of miRNA biology and presents a strategy for approaching future research in miRNA targeting.
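
Sequence complementarity between a miRNA and a 3' UTR is commonly assessed through the seed region. The sketch below finds perfect matches to the reverse complement of miRNA nucleotides 2 to 8; that seed definition is a widely used convention assumed here rather than one stated in the abstract, and the UTR fragment, function names, and use of a let-7-like sequence are purely illustrative.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an upper-case RNA string."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]


def seed_sites(mirna, utr, start=2, end=8):
    """0-based positions in `utr` that perfectly pair with the miRNA seed.

    The seed is taken as miRNA nucleotides start..end (1-based, inclusive).
    """
    seed = mirna[start - 1:end]
    site = reverse_complement_rna(seed)
    hits = []
    position = utr.find(site)
    while position != -1:
        hits.append(position)
        position = utr.find(site, position + 1)
    return hits


mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7-like sequence, used for illustration
utr = "AAACUACCUCAGGG"             # made-up 3' UTR fragment
hits = seed_sites(mirna, utr)
```

Real target prediction tools add further criteria on top of seed pairing, so a seed match found this way is a candidate site only.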

Date Created
2014-12

Preliminary Metabolic Reconstruction of Two Methane Producing Microbes: Methanoregula boonei 6A8 and Methanosphaerula palustris E1-9c

Description

Methane (CH4) is very important in the environment because it is a greenhouse gas and plays an important role in the degradation of organic matter. During the last 200 years, the atmospheric concentration of CH4 has tripled. Methanogens are methane-producing microbes from the domain Archaea that complete the final step in breaking down organic matter, generating methane through a process called methanogenesis. They contribute about 74% of the CH4 present in the Earth's atmosphere, producing 1 billion tons of methane annually. The purpose of this work is to generate a preliminary metabolic reconstruction model of two methanogens: Methanoregula boonei 6A8 and Methanosphaerula palustris E1-9c. M. boonei and M. palustris belong to the order Methanomicrobiales and perform hydrogenotrophic methanogenesis, which means that they reduce CO2 to CH4 using H2 as their major electron donor. Metabolic models are frameworks for understanding a cell as a system, and they provide the means to assess changes in gene regulation in response to various environmental and physiological constraints. The Pathway-Tools software v16 was used to generate these draft models. The models were manually curated using literature searches, the KEGG database, and homology methods with the Methanosarcina acetivorans strain, the closest methanogen strain with a nearly complete metabolic reconstruction. These preliminary models attempt to complete the pathways required for amino acid biosynthesis, methanogenesis, and major cofactors related to methanogenesis. The M. boonei reconstruction currently includes 99 pathways and has 82% of its reactions completed, while the M. palustris reconstruction includes 102 pathways and has 89% of its reactions completed.

Date Created
2014-05