Matching Items (3)
Description
Mosquito population data is a valuable resource for researchers and public health officials working to limit the spread of deadly zoonotic viruses such as Zika virus and West Nile virus. Unfortunately, this data is currently difficult to obtain and aggregate across the United States. Obtaining historical data often requires filing requests with individual states or counties and hoping for a response. Current online systems for accessing aggregated data either lack essential features or are limited in scope. To make mosquito population data more accessible to United States researchers, epidemiologists, and public health officials, the MosquitoDB system has been developed. MosquitoDB consists of a JavaScript web application connected to a SQL database, which makes submitting and retrieving United States mosquito population data much simpler and more straightforward than with alternative systems. The MosquitoDB software project is open source and publicly available on GitHub, allowing community scrutiny and contributions to add or improve necessary features. For this Creative Project, the core MosquitoDB system was designed and developed with three main features: 1) a web interface for querying mosquito data; 2) a web interface for submitting mosquito data; 3) web services for querying, retrieving, and submitting mosquito data. The web interface is essential for common end users, such as researchers and public health officials, to access historical data or submit new data. The web services provide building blocks that other developers can use to incorporate the data into new applications. The current MosquitoDB system is live at https://zodo.asu.edu/mosquito and the public code repository is available at https://github.com/developerDemetri/mosquitodb.
Contributors: Jones-Shargani, Demetrius Paul (Author) / Scotch, Matthew (Thesis director) / Weissenbacher, Davy (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-12
Description
Text mining of biomedical literature and clinical notes is a very active field of research in biomedical science. Semantic analysis is one of the core modules of many Natural Language Processing (NLP) solutions. Methods for calculating the semantic relatedness of two concepts can be very useful in solving problems such as relationship extraction, ontology creation, and question answering [1–6]. Several techniques exist for calculating the semantic relatedness of two concepts, and these techniques draw on different knowledge sources and corpora. So far, researchers have attempted to find the best hybrid method for each domain by combining semantic relatedness techniques and data sources manually. This work attempts to eliminate the need to combine semantic relatedness methods manually for each new context or resource by proposing an automated method that searches for the combination of semantic relatedness techniques and resources yielding the best semantic relatedness score in every context. This may help the research community find the best hybrid method for each context given the available algorithms and resources.
Contributors: Emadzadeh, Ehsan (Author) / Gonzalez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Scotch, Matthew (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Unstructured texts containing biomedical information from sources such as electronic health records, scientific literature, discussion forums, and social media offer an opportunity to extract information for a wide range of applications in biomedical informatics. Building scalable and efficient pipelines for natural language processing and extraction of biomedical information plays an important role in the implementation and adoption of applications in areas such as public health. Advancements in machine learning and deep learning techniques have enabled rapid development of such pipelines. This dissertation presents entity extraction pipelines for two public health applications: virus phylogeography and pharmacovigilance. For virus phylogeography, geographical locations are extracted from biomedical scientific texts to enrich metadata in the GenBank database, which contains 2.9 million virus nucleotide sequences. For pharmacovigilance, tools are developed to extract adverse drug reactions from social media posts, opening avenues for post-market drug surveillance from non-traditional sources. Across these pipelines, extraction performance varies widely among the entities of interest even when state-of-the-art neural network architectures are used. To explain the variation, linguistic measures are proposed to serve as indicators of entity extraction performance and to provide deeper insight into the domain complexity and the challenges associated with entity extraction. For both the phylogeography and pharmacovigilance pipelines presented in this work, the annotated datasets and applications are open source and freely available to the public to foster further research in public health.
Contributors: Magge, Arjun (Author) / Scotch, Matthew (Thesis advisor) / Gonzalez-Hernandez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Arizona State University (Publisher)
Created: 2019