Matching Items (4)
Description
With the rising data output and falling costs of Next Generation Sequencing technologies, research into data compression is crucial to maintaining storage efficiency and controlling costs. High throughput sequencers such as the HiSeqX Ten can produce up to 1.8 terabases of data per run, and such large storage demands are even more important to consider for institutions that rely on their own servers rather than large data centers (cloud storage) [1]. Compression algorithms aim to reduce the amount of space taken up by large genomic datasets by encoding the most frequently occurring symbols with the shortest bit codewords and by reordering the data to make it easier to encode. Depending on the probability distribution of the symbols in the dataset or the structure of the data, choosing the wrong algorithm could produce a compressed file larger than the original, or a poorly compressed file that wastes time and space [2]. To test efficiency among compression algorithms for each file type, 37 open-source compression algorithms were used to compress six types of genomic datasets (FASTA, VCF, BCF, GFF, GTF, and SAM) and were evaluated on compression speed, decompression speed, compression ratio, and file size using the benchmark tool lzbench. Compressors that outperformed the popular bioinformatics compressor Gzip (zlib -6) were evaluated against one another by ratio and speed for each file type and across the geometric means of all file types. Compressors that exhibited fast compression and decompression speeds were also evaluated by transmission time over internet connections of varying speed, in scenarios where the file was compressed only once or compressed multiple times.
Contributors: Howell, Abigail (Author) / Cartwright, Reed (Thesis director) / Wilson Sayres, Melissa (Committee member) / Taylor, Jay (Committee member) / Barrett, The Honors College (Contributor)
Created: 2017-05
Description
Mammary gland development in humans during puberty involves the enlargement of breast tissue, but this is not true in non-human primates. To identify potential causes of this difference, I examined variation in substitution rates across genes related to mammary development. Genes undergoing purifying selection show slower-than-average substitution rates, while genes undergoing positive selection show faster rates. These rate differences may be related to the difference between humans and other primates. Three genes were found to be accelerated: FOXF1, IGFBP5, and ATP2B2. However, only the last of these was found to be accelerated in humans, and it seems unlikely to be related to the differences in mammary gland development at puberty between humans and non-human primates.
Contributors: Arroyo, Diana (Author) / Cartwright, Reed (Thesis director) / Wilson Sayres, Melissa (Committee member) / Schwartz, Rachel (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Intervertebral Disc Degeneration (IVDD) is a complex phenomenon characterizing the desiccation and structural compromise of the primary joint in the human spine. The intervertebral disc (IVD) serves to connect vertebral bodies, cushion shock, and allow for flexion and extension of the vertebral column. Often presenting in the 4th or 5th decades of life as low back pain, this disease was originally believed to be the result of natural “wear and tear” coupled with repetitive mechanical insult, and as such most studies focus on patients between 40 and 50 years of age. Research over the past two decades, however, has demonstrated that environmental factors have only a modest effect on disc degeneration, with genetic influences playing a much more substantial role. Extensive research has focused on this process, though definitive risk factors and a clear pathophysiology have proven elusive. The aim of this study was to assemble a cohort of patients exhibiting definitive signs of degeneration who were well below the average age of presentation, with minimal or no exposure to suspected environmental risk factors, and to conduct a targeted genome analysis in an attempt to elucidate a common genetic component. Whole genome sequencing and analysis corroborated findings from a previous study and demonstrated a potential connection between mutations found in IVD structural or functional genes and the provocation of IVDD. Though the sample was limited in size and age range, these findings suggest that further IVDD research is merited into the association of variants in collagen, aggrecan, and the insulin-like growth factor receptor genes in young patients with an early presentation of disc degeneration and minimal exposure to suspected risk factors.
Contributors: Fulton, Travis (Author) / Liebig, Juergen (Thesis advisor) / Neisewander, Janet (Committee member) / Theodore, Nicholas (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
The Pathways of Distinction Analysis (PoDA) program calculates relationships between a given group of genes contained within a pathway and a disease state. It was used here to investigate liver cancer, and to explore how genetic variability may contribute to the different rates of development of the disease in males and females. The goal of the study was to identify germline variation that differs by sex in hepatocellular carcinoma. Using the program, multiple pathways and genes were found to have significant differences in their relationship to liver cancer in males and females. In animal studies, the genes identified using PoDA have been shown to impact liver cancer, often with different results for males and females. While these genes are often the focus in animal models, they are absent from current Genome Wide Association Studies (GWAS) catalogs for humans. By working to bridge the results of animal studies and human studies, the results help to identify the causes of liver cancer and, more specifically, the reason the disease affects males at much higher rates. The differences in the pathways found to be significant for the two sexes indicate that germline variation may play sex-specific roles in the development of hepatocellular carcinoma. Additionally, these results reinforce the capacity of PoDA to identify genes that may be missed by more traditional GWAS methods. This study lays the groundwork for further investigations into the identified genes and pathways, and how they behave differently within males and females.
Contributors: Olson, Erik Jon (Author) / Buetow, Kenneth (Thesis advisor) / Wilson, Melissa (Committee member) / Cartwright, Reed (Committee member) / Arizona State University (Publisher)
Created: 2021