Filtering by
- All Subjects: Machine Learning
- Creators: Bansal, Ajay
- Member of: Theses and Dissertations
- Member of: Barrett, The Honors College Thesis/Creative Project Collection
The goal of this project is to develop a deeper understanding of how machine learning pertains to the business world and how business professionals can capitalize on its capabilities. It explores the end-to-end process of integrating a machine and the tradeoffs and obstacles to consider. This topic is extremely pertinent today as the advent of big data increases and the use of machine learning and artificial intelligence is expanding across industries and functional roles. The approach I took was to expand on a project I championed as a Microsoft intern where I facilitated the integration of a forecasting machine learning model firsthand into the business. I supplement my findings from the experience with research on machine learning as a disruptive technology. This paper will not delve into the technical aspects of coding a machine model, but rather provide a holistic overview of developing the model from a business perspective. My findings show that, while the advantages of machine learning are large and widespread, a lack of visibility and transparency into the algorithms behind machine learning, the necessity for large amounts of data, and the overall complexity of creating accurate models are all tradeoffs to consider when deciding whether or not machine learning is suitable for a certain objective. The results of this paper are important in order to increase the understanding of any business professional on the capabilities and obstacles of integrating machine learning into their business operations.
In this thesis, several different methods for detecting and removing satellite streaks from astronomic images were evaluated and compared with a new machine learning based approach. Simulated data was generated with a variety of conditions, and the performance of each method was evaluated both quantitatively, using Mean Absolute Error (MAE) against a ground truth detection mask and processing throughput of the method, as well as qualitatively, examining the situations in which each model performs well and poorly. Detection methods from existing systems Pyradon and ASTRiDE were implemented and tested. A machine learning (ML) image segmentation model was trained on simulated data and used to detect streaks in test data. The ML model performed favorably relative to the traditional methods tested, and demonstrated superior robustness in general. However, the model also exhibited some unpredictable behavior in certain scenarios which should be considered. This demonstrated that machine learning is a viable tool for the detection of satellite streaks in astronomic images, however special care must be taken to prevent and to minimize the effects of unpredictable behavior in such models.
The main focus of this thesis is to use visual description of a landmark by choosing the most diverse pictures that best describe all the details of the queried location from community-contributed datasets. For this, an end-to-end framework has been built, to retrieve relevant results that are also diverse. Different retrieval re-ranking and diversification strategies are evaluated to find a balance between relevance and diversification. Clustering techniques are employed to improve divergence. A unique fusion approach has been adopted to overcome the dilemma of selecting an appropriate clustering technique and the corresponding parameters, given a set of data to be investigated. Extensive experiments have been conducted on the Flickr Div150Cred dataset that has 30 different landmark locations. The results obtained are promising when evaluated on metrics for relevance and diversification.
This project aims to incorporate the aspect of sentiment analysis into traditional stock analysis to enhance stock rating predictions by applying a reliance on the opinion of various stocks from the Internet. Headlines from eight major news publications and conversations from Yahoo! Finance’s “Conversations” feature were parsed through the Valence Aware Dictionary for Sentiment Reasoning (VADER) natural language processing package to determine numerical polarities which represented positivity or negativity for a given stock ticker. These generated polarities were paired with stock metrics typically observed by stock analysts as the feature set for a Logistic Regression machine learning model. The model was trained on roughly 1500 major stocks to determine a binary classification between a “Buy” or “Not Buy” rating for each stock, and the results of the model were inserted into the back-end of the Agora Web UI which emulates search engine behavior specifically for stocks found in NYSE and NASDAQ. The model reported an accuracy of 82.5% and for most major stocks, the model’s prediction correlated with stock analysts’ ratings. Given the volatility of the stock market and the propensity for hive-mind behavior in online forums, the performance of the Logistic Regression model would benefit from incorporating historical stock data and more sources of opinion to balance any subjectivity in the model.