Barrett, The Honors College Thesis/Creative Project Collection
Barrett, The Honors College at Arizona State University proudly showcases the work of undergraduate honors students by sharing this collection exclusively with the ASU community.
Barrett accepts high-performing, academically engaged undergraduate students and works with them in collaboration with all of the other academic units at Arizona State University. All Barrett students complete a thesis or creative project, which is an opportunity to explore an intellectual interest and produce an original piece of scholarly research. The thesis or creative project is supervised and defended in front of a faculty committee. Students are able to engage with professors who are nationally recognized in their fields and committed to working with honors students. Completing a Barrett thesis or creative project is an opportunity for undergraduate honors students to contribute to the ASU academic community in a meaningful way.
Objective: The study had three main objectives. The first was to elucidate potential new relationships via linear regression. The second was to determine which factors were indicative of Type 2 diabetes mellitus (DM) in the population. The third was to compare the incidence of Type 2 DM in the dataset to trends seen elsewhere.
Methods: The dataset was obtained from a cited open-source site and loaded into Python. The dataset, created in 1990, comprised 768 female patients described by 9 attributes (Number of Pregnancies, Plasma Glucose Levels, Systolic Blood Pressure, Triceps Skin Thickness, Insulin Levels, BMI, Diabetes Pedigree Function, Age and Diabetes Presence (0 or 1)). The dataset was then cleaned using mean or median imputation. After cleaning, linear regression was used to assess the relationships between certain factors in the population, excluding the Diabetes Pedigree Function and Diabetes Presence, and each relationship was assessed for significance via the probability statistic. Reverse (backward) stepwise logistic regression was used to determine the most pertinent factors for Type 2 DM, based on the Akaike Information Criterion and on the statistical significance of the terms in the model. Finally, data from the Centers for Disease Control and Prevention (CDC) Diabetes Surveillance were assessed for relationships between Female DM Percentage in Pinal County and either Obesity or Physical Inactivity, using simple logistic regression to test for statistical significance.
Results: Most of the relationships examined were statistically significant. The most pertinent factors for Type 2 DM in the dataset were the number of pregnancies, the plasma glucose levels and the blood pressure. In the USDS data from the CDC, the relationships between Female DM Percentage and the obesity and inactivity percentages were statistically significant.
Conclusion: The trends found in the study matched the trends found in the literature. Based on the results, recommendations for better diabetes control include more medical education as well as better blood sugar monitoring. Further analysis could examine other factors, such as genetic factors, and extend the epidemiological analysis. In conclusion, the study accomplished its main objectives.
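The cleaning and model-selection steps named in the Methods can be illustrated with a minimal sketch. It is not the thesis code: the function names (`impute_median`, `aic`, `backward_eliminate`) are hypothetical, missing values are assumed to be marked as `None`, and the `score` callable stands in for fitting a logistic regression on a feature subset and returning its AIC.

```python
from statistics import median
from typing import Callable, List, Optional, Sequence, Tuple


def impute_median(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion, 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def backward_eliminate(
    features: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Backward stepwise selection.

    Start from all features and repeatedly drop the single feature whose
    removal lowers the score (assumed here to be an AIC) the most. Stop
    when no removal improves the score or one feature remains.
    """
    current = list(features)
    best = score(current)
    while len(current) > 1:
        # Score every model that omits exactly one remaining feature.
        candidates = [
            (score([f for f in current if f != drop]), drop)
            for drop in current
        ]
        candidate_score, drop = min(candidates)
        if candidate_score >= best:
            break  # no single removal improves the criterion
        current.remove(drop)
        best = candidate_score
    return current, best
```

In practice, `score` would fit the model with a statistics library and return the AIC it reports; the loop itself does not depend on which library is used.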
The e-commerce market utilizes information to target customers and drive business. More and more online services have become available, allowing consumers to make purchases and interact with an online system. For example, Amazon is one of the largest Internet-based retail companies. As people shop through this website, Amazon gathers huge amounts of data on its customers from personal information to shopping history to viewing history. After purchasing a product, the customer may leave reviews and give a rating based on their experience. Performing analytics on all of this data can provide insights into making more informed business and marketing decisions that can lead to business growth and also improve the customer experience.
For this thesis, I have trained binary classification models on a publicly available product review dataset from Amazon to predict whether a review has a positive or negative sentiment. The sentiment analysis process includes analyzing and encoding the human language, then extracting the sentiment from the resulting values. In the business world, sentiment analysis provides value by revealing insights into customer opinions and their behaviors. In this thesis, I will explain how to perform a sentiment analysis and analyze several different machine learning models. The algorithms whose results I compare are KNN, Logistic Regression, Decision Trees, Random Forest, Naïve Bayes, Linear Support Vector Machines, and Support Vector Machines with an RBF kernel.
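As a rough illustration of the encode-then-classify process described above, the following sketch implements one of the listed algorithms, Naïve Bayes, over a bag-of-words encoding. It is not the thesis implementation: the `tokenize` function and the `NaiveBayesSentiment` class are hypothetical names, labels are assumed to be 1 for positive and 0 for negative, and the training data is assumed to contain at least one review of each class.

```python
import math
import re
from collections import Counter
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts: Sequence[str], labels: Sequence[int]) -> "NaiveBayesSentiment":
        # Per-class word frequencies are the bag-of-words encoding.
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict(self, text: str) -> int:
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in (0, 1):
            total_words = sum(self.word_counts[label].values())
            # Log prior for the class, then add log likelihood per word.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)
```

A model is trained with `NaiveBayesSentiment().fit(texts, labels)` and applied to a new review with `predict(review)`. The other algorithms in the comparison would consume the same kind of encoded text but learn a different decision rule.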