Search Content

A/B Testing-based Recommendation Systems

Description

Recommendation systems provide recommendations based on user behavior andcontent data. User behavior and content data are fed to machine learning algorithms to train them and give recommendations to the users. These algorithms need a large amount of data for a reasonable conversion rate. But for small applications, the available amount of data is…

Recommendation systems provide recommendations based on user behavior andcontent data. User behavior and content data are fed to machine learning algorithms to train them and give recommendations to the users. These algorithms need a large amount of data for a reasonable conversion rate. But for small applications, the available amount of data is minimal, leading to high recommendation aberrations. Also, when an existing large scaled application with a high amount of available data uses a new recommendation system, it requires some time and testing to decide which recommendation algorithm is best suited to get higher conversion rates. This learning curve costs highly when the user base and data size are significantly high. In this thesis, A/B testing is used with manual intervention in the decision-making of recommendation systems. To understand the effectiveness of the recommendations, user interaction data is compared to compare experiences. Based on the comparisons, the experiments conclude the effectiveness of A/B testing for the recommendation system.

ContributorsVaidya, Yogesh Vinayak (Author) / Bansal, Ajay (Thesis advisor) / Findler, Michael (Committee member) / Chakravarthi, Bharatesh (Committee member) / Arizona State University (Publisher)

Created2023

Transformer-based Automatic Mapping of Clinical Notes to Specific Clinical Concepts

Description

A significant proportion of medical errors exist in crucial medical information, and most stem from misinterpreting non-standardized clinical notes. Clinical Skills exam offered by the United States Medical Licensing Examination (USMLE) was put in place to certify patient note-taking skills before medical students joined professional practices, offering the first line…

A significant proportion of medical errors exist in crucial medical information, and most stem from misinterpreting non-standardized clinical notes. Clinical Skills exam offered by the United States Medical Licensing Examination (USMLE) was put in place to certify patient note-taking skills before medical students joined professional practices, offering the first line of defense in protecting patients from medical errors. Nonetheless, the exams were discontinued in 2021 following high costs and resource usage in scoring the exams. This thesis compares four transformer-based models, namely BERT (Bidirectional Encoder Representations from Transformers) Base Uncased, Emilyalsentzer Bio_ClinicalBERT, RoBERTa (Robustly Optimized BERT Pre-Training Approach), and DeBERTa (Decoding-enhanced BERT with disentangled attention), with the goal to map free text in patient notes to clinical concepts present in the exam rubric. The impact of context-specific embeddings on BERT was also studied to determine the need for a clinical BERT in Clinical Skills exam. This thesis proposes the use of DeBERTa as a backbone model in patient note scoring for the USMLE Clinical Skills exam after comparing it with three other transformer models. Disentangled attention and enhanced mask decoder integrated into DeBERTa were credited for the high performance of DeBERTa as compared to the other models. Besides, the effect of meta pseudo labeling was also investigated in this thesis, which in turn, further enhanced DeBERTa’s performance.

ContributorsGanesh, Jay (Author) / Bansal, Ajay (Thesis advisor) / Mehlhase, Alexandra (Committee member) / Findler, Michael (Committee member) / Arizona State University (Publisher)

Created2022

Diversifying Relevant Search Results from Social Media Using Community Contributed Images

Description

Availability of affordable image and video capturing devices as well as rapid development of social networking and content sharing websites has led to the creation of new type of content, Social Media. Any system serving the end user’s query search request should not only take the relevant images into consideration…

Availability of affordable image and video capturing devices as well as rapid development of social networking and content sharing websites has led to the creation of new type of content, Social Media. Any system serving the end user’s query search request should not only take the relevant images into consideration but they also need to be divergent for a well-rounded description of a query. As a result, the automated optimization of image retrieval results that are also divergent becomes exceedingly important.

The main focus of this thesis is to use visual description of a landmark by choosing the most diverse pictures that best describe all the details of the queried location from community-contributed datasets. For this, an end-to-end framework has been built, to retrieve relevant results that are also diverse. Different retrieval re-ranking and diversification strategies are evaluated to find a balance between relevance and diversification. Clustering techniques are employed to improve divergence. A unique fusion approach has been adopted to overcome the dilemma of selecting an appropriate clustering technique and the corresponding parameters, given a set of data to be investigated. Extensive experiments have been conducted on the Flickr Div150Cred dataset that has 30 different landmark locations. The results obtained are promising when evaluated on metrics for relevance and diversification.

ContributorsKalakota, Vaibhav Reddy (Author) / Bansal, Ajay (Thesis advisor) / Bansal, Srividya (Committee member) / Findler, Michael (Committee member) / Arizona State University (Publisher)

Created2020

Lossless Data Compression by Representing Data as a Solution to the Diophantine Equations

Description

There has been a substantial development in the field of data transmission in the last two decades. One does not have to wait much for a high-definition video to load on the systems anymore. Data compression is one of the most important technologies that helped achieve this seamless data transmission…

There has been a substantial development in the field of data transmission in the last two decades. One does not have to wait much for a high-definition video to load on the systems anymore. Data compression is one of the most important technologies that helped achieve this seamless data transmission experience. It helps to store or send more data using less memory or network resources. However, it appears that there is a limit on the amount of compression that can be achieved with the existing lossless data compression techniques because they rely on the frequency of characters or set of characters in the data. The thesis proposes a lossless data compression technique in which the data is compressed by representing it as a set of parameters that can reproduce the original data without any loss when given to the corresponding mathematical equation. The mathematical equation used in the thesis is the sum of the first N terms in a geometric series. Various changes are made to this mathematical equation so that any given data can be compressed and decompressed. According to the proposed technique, the whole data is taken as a single decimal number and replaced with one of the terms of the used equation. All the other terms of the equation are computed and stored as a compressed file. The performance of the developed technique is evaluated in terms of compression ratio, compression time and decompression time. The evaluation metrics are then compared with the other existing techniques of the same domain.

ContributorsGrewal, Karandeep Singh (Author) / Gonzalez Sanchez, Javier (Thesis advisor) / Bansal, Ajay (Committee member) / Findler, Michael (Committee member) / Arizona State University (Publisher)

Created2021

ASU Electronic Theses and Dissertations

Filtering by

A/B Testing-based Recommendation Systems

Transformer-based Automatic Mapping of Clinical Notes to Specific Clinical Concepts

Diversifying Relevant Search Results from Social Media Using Community Contributed Images

Lossless Data Compression by Representing Data as a Solution to the Diophantine Equations