Using Natural Language Processing to Identify Questions and Answers Written by People Addicted to Opioids

Description

Background: Natural language processing (NLP) models have previously been trained to locate questions and answers in forum settings, but on topics such as cancer and diabetes. Studies have also used filtering methods to understand themes in forum discussions of opioid use. However, studies have not been conducted on training an NLP model to locate the questions that people addicted to opioids ask their peers, and the answers they receive, in forums. A variety of annotation tools are available to aid the data collection needed to train NLP models; for academic purposes, brat is the best of these tools. This study will inform clinical practice by revealing the inner thoughts of patients who are addicted to opioids, so that clinicians can have more meaningful conversations during appointments, including conversations that the patient may be too afraid to start.

Methods: This study followed the standard NLP process: a gold standard was reached through matched-pair annotations of the forum text in brat, and a neural network was trained on the content. Following annotation, adjudication was performed to increase the inter-annotator agreement. Local physicians developed categories to describe the questions, and three pilots were run to test the best way to categorize the questions.

Results: For the annotation activity, the inter-annotator agreement, calculated as an F-score at a threshold of 0.7, was 0.378 before adjudication and increased to 0.560 after adjudication. Pilots 1, 2, and 3 of the categorization activity had inter-annotator agreements of 0.375, 0.5, and 0.966, respectively.
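As an illustration of how an F-score can serve as an inter-annotator agreement measure, the following is a minimal sketch and not the study's actual procedure. It assumes that annotations are character spans and that the 0.7 threshold refers to a minimum overlap ratio between two annotators' spans; the abstract does not define the threshold, and the function names here are hypothetical.

```python
def overlap_ratio(a, b):
    """Jaccard overlap of two (start, end) character spans, end exclusive."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0


def agreement_f_score(spans_a, spans_b, threshold=0.7):
    """F-score between two annotators, treating annotator A as the reference.

    Assumption: two spans agree when their overlap ratio meets the threshold.
    """
    if not spans_a or not spans_b:
        return 0.0
    # A-spans that some B-span matches (recall side).
    matched_a = sum(
        any(overlap_ratio(a, b) >= threshold for b in spans_b) for a in spans_a
    )
    # B-spans that some A-span matches (precision side).
    matched_b = sum(
        any(overlap_ratio(a, b) >= threshold for a in spans_a) for b in spans_b
    )
    recall = matched_a / len(spans_a)
    precision = matched_b / len(spans_b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up spans: the first pair matches exactly, the second overlaps too little.
annotator_a = [(0, 10), (20, 30)]
annotator_b = [(0, 10), (25, 40)]
score = agreement_f_score(annotator_a, annotator_b)
```

Under these assumptions, one of the two spans from each annotator is matched, so precision and recall are both 0.5 and the agreement score is 0.5.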

Discussion: The inter-annotator agreement of the annotation activity may have been low initially because the annotators were students who may not have been as invested in the project as was necessary to annotate the text accurately. Also, everyone interprets text slightly differently, which may have contributed to the differences in the matched pairs’ annotations. The F-score variation for the categorization activity was due partly to the different ways the instructions were delivered and partly to the participants' areas of study. The first pilot did not mandate the use of the original context located in brat, and the instructions were provided as a downloadable document. The participants were computer science graduate students. The second pilot also delivered the instructions via a document, but participants were strongly encouraged to use the context to understand the questions’ meanings. The participants were again computer science graduate students, who said in a discussion of their results after the pilot that they did not have a good understanding of the medical jargon in the posts. The final pilot used a combination of students with and without a medical background, required participants to use the context, and combined verbal instructions with the written ones. Together, these factors increased the F-score significantly. For a full-scale experiment, students with a medical background should be used to categorize the questions.

Contributors: Pawlik, Katie (Author) / Devarakonda, Murthy (Thesis director) / Murcko, Anita (Committee member) / Green, Ellen (Committee member) / College of Health Solutions (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-12

Behavioral Health Providers’ Perspectives on Granular Information Sharing

Description

Granular information sharing allows patients to have more control over their medical records by giving them the choice of what information to share and with whom. Numerous studies have focused on patients’ perspectives, but this study focuses on providers' views of granular information sharing. Twenty-eight behavioral health providers (n=3 prescribers, n=25 non-prescribers) from two different integrated healthcare facilities participated in a 2-hour focus group and took a survey at the beginning and at the end of the focus group. The survey responses were analyzed using descriptive analysis to understand how the providers' preferences changed from the pre-study to the post-study survey. Most providers changed their view about granular information sharing: in the post-study survey, 30% of providers “were OK with patients having control over who sees what information in their electronic health record”, down from 83% previously. Overall, health care providers had concerns about granular information sharing because they feared that it would lead to increased costs, patient safety issues involving drug-drug interactions, and poor provider-patient relationships.

Contributors: Idouraine, Nassim Charif (Author) / Grando, Adela (Thesis director) / Murcko, Anita (Committee member) / College of Health Solutions (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-12

Image Reference Extraction: Linking References to Radiology Images with Clinically Significant Findings

Description

Background: Pulmonary embolism is a deadly condition that is often diagnosed using a technique known as computed tomography pulmonary angiography (CTPA). CTPA reports are free-text, narrative-style forms of documentation conveying radiologist findings, both primary (regarding pulmonary embolism) and incidental. This project seeks to combine simple natural language processing (NLP) techniques, such as regular expressions and rules, to build upon and further process output from a machine learning based named entity recognition (NER) tool for the purposes of (1) linking references to radiological images with the corresponding clinical findings and (2) extracting primary and incidental findings.

Methods: The project’s system utilized a regular expression to extract image references. All CTPA reports were first processed with NER software to obtain the text and spans of clinical findings. A heuristic was used to determine the appropriate clinical finding that should be linked with a particular image reference. Another regular expression was used to extract primary findings from NER output; the remaining findings were considered incidental. Performance was assessed against a gold standard, which was based upon a manually annotated version of the CTPA reports used in this project.
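The linking step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the regular expression, the report text, and the helper names are all hypothetical, and the finding spans are hand-built stand-ins for NER output. The heuristic follows the abstract's own description, in which each image reference is linked to the clinical finding immediately prior to it.

```python
import re

# Hypothetical pattern: the thesis does not publish its regular expression.
IMAGE_REF = re.compile(r"series\s+(\d+),?\s+image\s+(\d+)", re.IGNORECASE)


def span_of(text, phrase):
    """Build a (start, end, text) span, standing in for one NER finding."""
    start = text.index(phrase)
    return (start, start + len(phrase), phrase)


def link_image_references(report, findings):
    """Link each image reference to the finding that ends closest before it."""
    links = []
    for match in IMAGE_REF.finditer(report):
        prior = [f for f in findings if f[1] <= match.start()]
        nearest = max(prior, key=lambda f: f[1]) if prior else None
        links.append((match.group(0), nearest[2] if nearest else None))
    return links


# Made-up report sentence, for illustration only.
report = (
    "There is a filling defect in the right lower lobe "
    "(series 4, image 112). Small hiatal hernia."
)
findings = [span_of(report, "filling defect"), span_of(report, "hiatal hernia")]
links = link_image_references(report, findings)
```

Because the heuristic relies only on position, a missed or spurious finding just before an image reference changes the link, which is the sensitivity to NER errors that the discussion describes.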

Results: Extraction of image references achieved 100% accuracy. Linkages between these references and exact gold standard spans of the clinical findings achieved a precision of 0.24, a recall of 0.22, and an F1 score of 0.23. Linkages with partial spans of clinical findings as determined by the gold standard achieved a precision of 0.71, a recall of 0.67, and an F1 score of 0.69. Primary and incidental finding extraction achieved a precision of 0.67, a recall of 0.80, and an F1 score of 0.73.
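The reported F1 scores follow from the standard definition of F1 as the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes them from the precision and recall values quoted above; the function name and dictionary labels are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the results above.
reported = {
    "exact-span linkage": (0.24, 0.22),
    "partial-span linkage": (0.71, 0.67),
    "primary/incidental extraction": (0.67, 0.80),
}
recomputed = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
```

Rounded to two decimals, the recomputed values are 0.23, 0.69, and 0.73, which agree with the reported F1 scores.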

Discussion: Various factors reduced system performance, such as the difficulty of exactly matching the spans of clinical findings from NER output with those found in the gold standard. The heuristic linking clinical findings and image references was especially sensitive to NER false positives and false negatives because it assumed that the appropriate clinical finding was the one immediately prior to the image reference. Although the system did not perform as well as hoped, lessons were learned, such as the need for clear research methodology and proper gold standard creation; without a proper gold standard, problem scope and system performance cannot be properly assessed. Possible improvements to the system include creating a more robust heuristic, filtering out NER false positives, and training the NER tool on a dataset of CTPA reports.

Contributors: Borlongan, Matthew Bilog (Author) / Devarakonda, Murthy (Thesis director) / Murcko, Anita (Committee member) / College of Health Solutions (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Evaluating Patient Perceptions and Preferences on Granular Data Sharing in Behavioral Health Patients

Description

Individual control of sensitive health information is a matter of great concern to patients, practitioners, insurers and policymakers. Federal and state law generally supports consent approaches that allow patients to share all or none of their health data. However, research demonstrates that patients prefer more detailed control of their personal data sharing. In particular, little is known about data sharing preferences of patients with behavioral health conditions (BHCs). This study will explore the technical feasibility of supporting patient-driven, consent-based data access through a preliminary analysis of data collected from the My Data Choices e-consent tool. Through these findings, this research seeks to inform stakeholders about the clinical, ethical, policy, and regulatory implications of broader consent choices.

Contributors: Kaing, Tina C. (Author) / Grando, Maria Adela (Thesis director) / Murcko, Anita (Committee member) / College of Health Solutions (Contributor) / Barrett, The Honors College (Contributor)

Created: 2020-12

Barrett, The Honors College Thesis/Creative Project Collection
