Matching Items (3)

Description

On January 5, 2020, the World Health Organization (WHO) reported an outbreak of pneumonia of unknown cause in Wuhan, China. Two weeks later, a 35-year-old Washington resident checked into a local urgent care clinic with a four-day history of cough and fever. Laboratory testing would confirm this individual as the first case of the novel coronavirus in the U.S., and on January 20, 2020, the Centers for Disease Control and Prevention (CDC) reported this case to the public. In the days and weeks that followed, Twitter, a social media platform with 450 million monthly active users as of 2020, gave many American residents the opportunity to share their thoughts on the developing pandemic online. Social media sites like Twitter are a prominent source of discourse surrounding contemporary political issues, allowing for direct communication between users in real time. As more population centers around the world gain access to the internet, most democratic discussion, both nationally and internationally, will take place in online spaces. The activity of elected officials as private citizens in these online spaces is often overlooked. I find the ability of publics, which philosopher John Dewey defines as groups of people with shared needs, to communicate effectively and monitor the interests of political elites online to be lacking. To best align the interests of officials and citizens, and to achieve transparency between publics and elected officials, we need an efficient way to measure and record these interests. Through this thesis, I found that natural language processing methods like sentiment analysis can provide an effective means of gauging the attitudes of politicians towards contemporary issues.

Contributors: Howell, Nicholas (Author) / Voorhees, Matthew (Thesis director) / Schmidt, Peter (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / School of Life Sciences (Contributor)
Created: 2023-05
Description
In the age of information, collecting and processing large amounts of data is an integral part of running a business. From training artificial intelligence to driving decision making, the applications of data are far-reaching. However, many types of data are difficult to process, particularly unstructured data. Unstructured data is “information that either does not have a predefined data model or is not organized in a pre-defined manner” (Balducci & Marinova 2018). Such data often come in the form of text fields written by consumers and, because they lack numeric values, are difficult to put into spreadsheets and relational databases (Wolff, R. 2020). The goal of this project is to help develop a machine learning model to aid CommonSpirit Health and ServiceNow, which is why an approach based on unstructured data was selected. This paper provides a general overview of the process of unstructured data management and explores some existing implementations and their efficacy. It then discusses our approach to converting unstructured cases into usable data, which were used to develop an artificial intelligence model that is estimated to be worth $400,000 and to save CommonSpirit Health $1,200,000 in organizational impact.
Contributors: Bergsagel, Matteo (Author) / De Waard, Jan (Co-author) / Chavez-Echeagaray, Maria Elena (Thesis director) / Burns, Christopher (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2022-05
Description
Natural Language Processing (NLP) techniques have increasingly been used in finance, accounting, and economics research to analyze text-based information more efficiently and effectively than primarily human-centered methods. The literature is rich with computational textual analysis techniques applied to consistent annual or quarterly financial filings, with promising results in identifying similarities between documents and firms and in relating this information to other economic phenomena. Building upon the knowledge gained from previous research and extending the application of NLP methods to other categories of financial documents, this project explores financial credit contracts, seeking to better understand the information carried in their text by assessing patterns and relationships between documents and firms. The main methods used throughout this project are Term Frequency-Inverse Document Frequency (to represent each document as a numerical vector), Cosine Similarity (to measure the similarity between contracts), and K-Means Clustering (to organically derive clusters of documents based on the text included in the contract itself). Using these methods, the dimensions analyzed are various grouping methodologies (external industry classifications and text-derived classifications), various granularities (document-wise and firm-wise), various financial documents associated with a single firm (the relationship between credit contracts and 10-K product descriptions), and how various mean cosine similarity distributions change over time.
Contributors: Liu, Jeremy J (Author) / Wahal, Sunil (Thesis director) / Bharath, Sreedhar (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / School for the Future of Innovation in Society (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-05
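
The three methods named in the description above (TF-IDF vectors, cosine similarity, K-means clustering) can be illustrated with a minimal standard-library sketch. This is not the thesis's implementation, which the listing does not describe. Whitespace tokenization, the smoothed IDF formula, seeding the centroids with the first k documents, and the sample contract snippets are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document."""
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm else list(v)


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm on unit-normalized vectors.

    Assumption: centroids are seeded with the first k vectors so the
    result is deterministic; real use would pick seeds more carefully.
    """
    points = [normalize(v) for v in vectors]
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical contract snippets, invented for this example.
docs = [
    "borrower shall repay the loan with interest",
    "lender may waive the covenant default",
    "borrower shall repay the loan principal",
    "lender may waive the covenant breach",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
print(cosine_similarity(vectors[0], vectors[2]))
```

Normalizing the vectors before clustering makes squared Euclidean distance a monotone function of cosine similarity, so the clusters group documents by the same notion of similarity used to compare contracts pairwise.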