Matching Items (11)
Description
Surgery as a profession requires significant training to improve both clinical decision making and psychomotor proficiency. In the medical knowledge domain, tools have been developed, validated, and accepted for evaluation of surgeons' competencies. However, assessment of psychomotor skills still relies on the Halstedian model of apprenticeship, wherein surgeons are observed during residency for judgment of their skills. Although the value of this method of skills assessment cannot be ignored, novel methodologies of objective skills assessment need to be designed, developed, and evaluated to augment the traditional approach. Several sensor-based systems have been developed to measure a user's skill quantitatively, but the use of sensors could interfere with skill execution and thus limit the potential for evaluating real-life surgery. Having a method to judge skills automatically in real-life conditions should be the ultimate goal, since only with such features would a system be widely adopted. This research proposes a novel video-based approach for observing surgeons' hand and surgical tool movements in minimally invasive surgical training exercises as well as during laparoscopic surgery. Because our system does not require surgeons to wear special sensors, it has the distinct advantage over alternatives of offering skills assessment in both learning and real-life environments. The system automatically detects major skill-measuring features from surgical task videos using a computing system composed of a series of computer vision algorithms and provides on-screen real-time performance feedback for more efficient skill learning. Finally, a machine-learning approach is used to develop an observer-independent composite scoring model through objective and quantitative measurement of surgical skills.
To increase the effectiveness and usability of the developed system, it is integrated with a cloud-based tool, which automatically assesses surgical videos uploaded to the cloud.
Contributors: Islam, Gazi (Author) / Li, Baoxin (Thesis advisor) / Liang, Jianming (Thesis advisor) / Dinu, Valentin (Committee member) / Greenes, Robert (Committee member) / Smith, Marshall (Committee member) / Kahol, Kanav (Committee member) / Patel, Vimla L. (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
In most diploid cells, autosomal genes are equally expressed from the paternal and maternal alleles, resulting in biallelic expression. However, as an exception, there exists a small number of genes that show a pattern of monoallelic or biased-allele expression based on the allele’s parent-of-origin. This phenomenon is termed genomic imprinting and is an evolutionary paradox. The best explanation for imprinting is David Haig's kinship theory, which hypothesizes that monoallelic gene expression is largely the result of evolutionary conflict between males and females over maternal involvement in their offspring. One previous RNAseq study has investigated the presence of parent-of-origin effects, or imprinting, in the parasitic jewel wasp Nasonia vitripennis (N. vitripennis) and its sister species Nasonia giraulti (N. giraulti) to test the predictions of kinship theory in a non-eusocial species for comparison to a eusocial one. In order to continue to tease apart the connection between social and eusocial Hymenoptera, this study proposed a similar RNAseq study that attempted to reproduce these results in unique samples of reciprocal F1 Nasonia hybrids. After a pseudo N. giraulti reference genome was built, differences were observed between aligning RNAseq reads to an N. vitripennis reference genome and aligning reads to the pseudo N. giraulti reference. In addition, no evidence for parent-of-origin or imprinting patterns in adult Nasonia was found. These results demonstrated a species-of-origin effect. Importantly, the study continued to build a repository of support with the aim of elucidating the mechanisms behind imprinting in an excellent epigenetic model species, which can also help with understanding the phenomenon of imprinting in complex human diseases.
Contributors: Underwood, Avery Elizabeth (Author) / Wilson, Melissa (Thesis advisor) / Buetow, Kenneth (Committee member) / Gile, Gillian (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Schizophrenia is a disease that affects 15.2/100,000 US citizens, with about 0.6-1.9% of the total population being afflicted with some range of severity of the disease. Much research has been done on the progression of the disease and its differences between males and females; however, the true underlying cause of the disease remains unknown. The literature nonetheless gives considerable indication that the primary origin of the disorder is genetic. In order to establish a foundation in differential gene expression and isoform expression between males and females, we utilized the Genotype-Tissue Expression Project data set (which contains samples from healthy individuals at their time of death) for the amygdala, anterior cingulate cortex, and frontal cortex. We performed quality control on the data with Trimmomatic and visualized it with FastQC and MultiQC. We then aligned to a sex-specific reference genome with Hisat2. Finally, we performed a differential expression analysis through the limma/voom package with inputs from featureCounts. An isoform-level analysis was run on the anterior cingulate cortex with the IsoformSwitchAnalyzeR package. We were able to identify a few differentially expressed genes in the three tissue sites, which included XIST and other highly conserved, Y-linked genes. As for the isoform-level analysis, we were able to identify 13 genes with significant levels of differential isoform usage and expression, two of which have clinical relevance (DAB1 and PACRG). These findings will allow future studies to make comparisons with gene expression in brain tissue samples from patients who had been diagnosed with schizophrenia during their lifetime. By identifying any unique genes in these patients, gene therapies can be developed to target and correct any misexpression that may be occurring.
Contributors: Evanovich, Austin Phillip (Author) / Wilson, Melissa (Thesis director) / Buetow, Kenneth (Committee member) / Natri, Heini Maaret (Committee member) / School of Life Sciences (Contributor, Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks such as pharmacovigilance via the use of Natural Language Processing (NLP) techniques. One of the critical steps in information extraction pipelines is Named Entity Recognition (NER), where the mentions of entities such as diseases are located in text and their entity types are identified. However, the language in social media is highly informal, and user-expressed health-related concepts are often non-technical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and advanced machine learning-based NLP techniques have been underutilized. This work explores the effectiveness of different machine learning techniques, and particularly deep learning, to address the challenges associated with extraction of health-related concepts from social media. Deep learning has recently attracted a lot of attention in machine learning research and has shown remarkable success in several applications, particularly imaging and speech recognition. However, thus far, deep learning techniques are relatively unexplored for biomedical text mining and, in particular, this is the first attempt at applying deep learning for health information extraction from social media.

This work presents ADRMine, which uses a Conditional Random Field (CRF) sequence tagger for extraction of complex health-related concepts. It utilizes a large volume of unlabeled user posts for automatic learning of embedding cluster features, a novel application of deep learning in modeling the similarity between the tokens. ADRMine significantly improved the medical NER performance compared to the baseline systems.

This work also presents DeepHealthMiner, a deep learning pipeline for health-related concept extraction. Most machine learning methods require sophisticated task-specific manual feature design, which is a challenging step in processing the informal and noisy content of social media. DeepHealthMiner automatically learns classification features using neural networks and a large volume of unlabeled user posts. Using a relatively small labeled training set, DeepHealthMiner could accurately identify most of the concepts, including consumer expressions that were not observed in the training data or in the standard medical lexicons, outperforming the state-of-the-art baseline techniques.
Contributors: Nikfarjam, Azadeh (Author) / Gonzalez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Scotch, Matthew (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Unstructured texts containing biomedical information from sources such as electronic health records, scientific literature, discussion forums, and social media offer an opportunity to extract information for a wide range of applications in biomedical informatics. Building scalable and efficient pipelines for natural language processing and extraction of biomedical information plays an important role in the implementation and adoption of applications in areas such as public health. Advancements in machine learning and deep learning techniques have enabled rapid development of such pipelines. This dissertation presents entity extraction pipelines for two public health applications: virus phylogeography and pharmacovigilance. For virus phylogeography, geographical locations are extracted from biomedical scientific texts for metadata enrichment in the GenBank database containing 2.9 million virus nucleotide sequences. For pharmacovigilance, tools are developed to extract adverse drug reactions from social media posts to open avenues for post-market drug surveillance from non-traditional sources. Across these pipelines, high variance is observed in extraction performance among the entities of interest while using state-of-the-art neural network architectures. To explain the variation, linguistic measures are proposed to serve as indicators for entity extraction performance and to provide deeper insight into the domain complexity and the challenges associated with entity extraction. For both the phylogeography and pharmacovigilance pipelines presented in this work, the annotated datasets and applications are open source and freely available to the public to foster further research in public health.
Contributors: Magge, Arjun (Author) / Scotch, Matthew (Thesis advisor) / Gonzalez-Hernandez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Next-generation sequencing is a powerful tool for detecting genetic variation. However, it is also error-prone, with error rates that are much larger than mutation rates. This can make mutation detection difficult; and while increasing sequencing depth can often help, sequence-specific errors and other non-random biases cannot be detected by increased depth. The problem of accurate genotyping is exacerbated when there is not a reference genome or other auxiliary information available.

I explore several methods for sensitively detecting mutations in non-model organisms using an example Eucalyptus melliodora individual. I use the structure of the tree to find bounds on its somatic mutation rate and evaluate several algorithms for variant calling. I find that conventional methods are suitable if the genome of a close relative can be adapted to the study organism. However, with structured data, a likelihood framework that is aware of this structure is more accurate. I use the techniques developed here to evaluate a reference-free variant calling algorithm.

I also use this data to evaluate a k-mer based base quality score recalibrator (KBBQ), a tool I developed to recalibrate base quality scores attached to sequencing data. Base quality scores can help detect errors in sequencing reads, but are often inaccurate. The most popular method for correcting this issue requires a known set of variant sites, which is unavailable in most cases. I simulate data and show that errors in this set of variant sites can cause calibration errors. I then show that KBBQ accurately recalibrates base quality scores while requiring no reference or other information and performs as well as other methods.

Finally, I use the Eucalyptus data to investigate the impact of quality score calibration on the quality of output variant calls and show that improved base quality score calibration increases the sensitivity and reduces the false positive rate of a variant calling algorithm.
Contributors: Orr, Adam James (Author) / Cartwright, Reed (Thesis advisor) / Wilson, Melissa (Committee member) / Kusumi, Kenro (Committee member) / Taylor, Jesse (Committee member) / Pfeifer, Susanne (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Hepatocellular carcinoma (HCC) is the third leading cause of cancer death worldwide and exhibits a male-bias in occurrence and mortality. Previous studies have provided insight into the role of inherited genetic regulation of transcription in modulating sex-differences in HCC etiology and mortality. This study uses pathway analysis to add insight into the biological processes that drive sex-differences in HCC etiology as well as to provide an additional framework for future studies on sex-biased cancers. Gene expression data from normal, tumor adjacent, and HCC liver tissue were used to calculate pathway scores using a tool called PathOlogist that takes into consideration not only the molecules in a biological pathway, but also the interaction type and directionality of the signaling pathways. Analysis of the pathway scores uncovered etiologically relevant pathways differentiating male and female HCC. In normal and tumor adjacent liver tissue, males showed higher activity of pathways related to translation factors and signaling. Females did not show higher activity of any pathways compared to males in normal and tumor adjacent liver tissue. This work suggests biological processes that underlie sex-biases in HCC occurrence and mortality. Males and females differed in the activation of pathways related to apoptosis, cell cycle, signaling, and metabolism in HCC. These results identify clinically relevant pathways for future research and therapeutic targeting.
Contributors: Rehling, Thomas E (Author) / Buetow, Kenneth (Thesis advisor) / Wilson, Melissa (Committee member) / Maley, Carlo (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
The Pathways of Distinction Analysis (PoDA) program calculates relationships between a given group of genes contained within a pathway and a disease state. It was used here to investigate liver cancer, and to explore how genetic variability may contribute to the different rates of development of the disease in males and females. The goal of the study was to identify germline variation that differs by sex in hepatocellular carcinoma. Using the program, multiple pathways and genes were identified as having significant differences in their relationship to liver cancer in males and females. In animal studies, the genes identified by PoDA have been shown to impact liver cancer, often with different results for males and females. While these genes are often the focus in animal models, they are absent from current Genome Wide Association Studies (GWAS) catalogs for humans. By working to bridge the results of animal studies and human studies, the results help to identify the causes of liver cancer, and more specifically, the reason the disease affects males at much higher rates. The differences in the pathways identified as significant for the two sexes indicate that germline variation may play sex-specific roles in the development of hepatocellular carcinoma. Additionally, these results reinforce the capacity of PoDA to identify genes that may be missed by more traditional GWAS methods. This study lays the groundwork for further investigations into the identified genes and pathways, and how they behave differently within males and females.
Contributors: Olson, Erik Jon (Author) / Buetow, Kenneth (Thesis advisor) / Wilson, Melissa (Committee member) / Cartwright, Reed (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Analyzing human DNA sequence data allows researchers to identify variants associated with disease, reconstruct the demographic histories of human populations, and further understand the structure and function of the genome. Identifying variants in whole genome sequences is a crucial bioinformatics step in sequence data processing and can be performed using multiple approaches. To investigate the consistency between different bioinformatics methods, we compared the accuracy and sensitivity of two genotyping strategies, joint variant calling and single-sample variant calling. Autosomal and sex chromosome variant call sets were produced by joint and single-sample variant calling for 10 female individuals. The accuracy of variant calls was assessed using SNP array genotype data collected from each individual. To compare the ability of joint and single-sample calling to capture low-frequency variants, folded site frequency spectra were constructed from variant call sets. To investigate the potential for these different variant calling methods to impact downstream analyses, we estimated nucleotide diversity for call sets produced using each approach. We found that while both methods were equally accurate when validated by SNP array sites, single-sample calling identified a greater number of singletons. However, estimates of nucleotide diversity were robust to these differences in the site frequency spectrum between call sets. Our results suggest that despite single-sample calling’s greater sensitivity for low-frequency variants, the differences between approaches have a minimal effect on downstream analyses. While joint calling may be a more efficient approach for genotyping many samples, in situations that preclude large sample sizes, our study suggests that single-sample calling is a suitable alternative.
Contributors: Howell, Emma (Co-author) / Wilson, Melissa (Thesis director) / Stone, Anne (Committee member) / Phung, Tanya (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-05
Description
Pathway analysis helps researchers gain insight into the biology behind gene expression-based data. By applying this data to known biological pathways, we can learn about mutations or other changes in cellular function, such as those seen in cancer. There are many tools that can be used to analyze pathways; however, it can be difficult to find and learn which tool is optimal for use in a certain experiment. This thesis aims to comprehensively review four tools, Cytoscape, PaxtoolsR, PathOlogist, and Reactome, and their role in pathway analysis. This is done by applying a known microarray data set to each tool and testing their different functions. The functions of these programs will then be analyzed to determine their roles in learning about biology and assisting new researchers with their experiments. It was found that each tool holds a unique and important role in pathway analysis. Visualization pathways have the role of exploring individual pathways and interpreting genomic results. Quantification pathways use statistical tests to determine pathway significance. Together, these allow one to find pathways of interest and then explore areas of interest within them.
Contributors: Rehling, Thomas Evan (Author) / Buetow, Kenneth (Thesis director) / Wilson, Melissa (Committee member) / School of Life Sciences (Contributor, Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-05