Drosophila stage annotation using sparse learning method

Description

Drosophila melanogaster, an important model organism, is used to explore the mechanism that governs cell differentiation and embryonic development. Understanding this mechanism will help to reveal the effects of genes in other species, including human beings. Currently, digital camera techniques make high-quality Drosophila gene expression imaging possible. In addition, owing to advances in biology, gene expression images that reveal spatiotemporal patterns are generated at a high-throughput pace. Thus, an automated and efficient system that can analyze gene expression will become a necessary tool for investigating gene functions, interactions and developmental processes. One investigation method is to compare the expression patterns of different developmental stages. Currently, however, the expression patterns are manually annotated with rough stage ranges. The work of annotation requires professional knowledge from experienced biologists. Hence, transferring this domain knowledge into a system that can automatically annotate the patterns poses a challenging problem for computer scientists. In this thesis, the problem of stage annotation for Drosophila embryos is modeled in the machine learning framework. Three sparse learning algorithms and one ensemble algorithm are used to attack the problem. The sparse algorithms are Lasso, group Lasso and sparse group Lasso. The ensemble algorithm is based on a voting method. The proposed algorithms can annotate the patterns to individual stages, instead of stage ranges, with high accuracy; in addition, the decimal stage annotation algorithm presents a novel way to annotate the patterns to decimal stages. The performance of the algorithms is also analyzed and corresponding explanations are given. Finally, with the proposed system, all the lateral view BDGP and FlyFish images are annotated and several interesting applications of the decimal stage value are revealed.

Contributors: Pan, Cheng (Author) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Farin, Gerald (Committee member) / Arizona State University (Publisher)

Created: 2012
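
The abstract above names three sparse penalties (Lasso, group Lasso, sparse group Lasso) and a voting ensemble. As a rough illustration only, the sketch below shows the proximal (shrinkage) step commonly associated with each penalty, plus a simple majority vote. The function names, the grouping of features, and the use of a proximal step at all are assumptions made for this example; the thesis's actual solver and voting rule are not described in the abstract.

```python
import math
from collections import Counter


def soft_threshold(w, lam):
    """Proximal step for lam * ||w||_1 (Lasso): shrink each weight toward zero."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in w]


def group_soft_threshold(w, groups, lam):
    """Proximal step for lam * sum_g ||w_g||_2 (group Lasso).

    `groups` is a list of index lists (hypothetical feature grouping).
    A whole group is zeroed when its Euclidean norm is at most lam.
    """
    out = list(w)
    for idx in groups:
        norm = math.sqrt(sum(w[i] ** 2 for i in idx))
        scale = max(1.0 - lam / norm, 0.0) if norm > 0 else 0.0
        for i in idx:
            out[i] = w[i] * scale
    return out


def sparse_group_prox(w, groups, lam1, lam2):
    """Sparse group Lasso: elementwise shrinkage followed by group shrinkage."""
    return group_soft_threshold(soft_threshold(w, lam1), groups, lam2)


def majority_vote(labels):
    """Return the most frequent label among the individual model predictions."""
    return Counter(labels).most_common(1)[0][0]
```

In a full solver these steps would be applied after each gradient update of the loss. The sparse group variant can zero out entire feature groups while also zeroing individual weights inside the groups it keeps.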

A semantic triplet based story classifier

Description

Text classification, in the artificial intelligence domain, is an activity in which text documents are automatically classified into predefined categories using machine learning techniques. An example of this is classifying uncategorized news articles into different predefined categories such as "Business", "Politics", "Education", "Technology", etc. In this thesis, a supervised machine learning approach is followed, in which a module is first trained with pre-classified training data and then the class of the test data is predicted. Good feature extraction is an important step in the machine learning approach, and hence the main component of this text classifier is semantic triplet based features, used in addition to traditional features such as standard keyword based features and statistical features based on shallow parsing (such as the density of POS tags and named entities). A triplet {Subject, Verb, Object} in a sentence is defined as a relation between subject and object, the relation being the predicate (verb). The triplet extraction process is a five-step process whose input corpus consists of web text documents, each containing one or more paragraphs, from sources ranging from RSS feeds to lists of extremist websites. The input corpus feeds into the "Pronoun Resolution" step, which uses a heuristic approach to identify the noun phrases referenced by the pronouns. The next step, "SRL Parser", is a shallow semantic parser that converts the incoming pronoun-resolved paragraphs into an annotated predicate-argument format. The output of the SRL parser is processed by the "Triplet Extractor" algorithm, which forms triplets of the form {Subject, Verb, Object}. Generalization and reduction of triplet features is the next step. The reduced feature representation reduces computing time, yields better discriminatory behavior and mitigates the curse of dimensionality. For training and testing, a ten-fold cross-validation approach is followed. In each round, the SVM classifier is trained with 90% of the labeled (training) data, and in the testing phase the classes of the remaining 10% of unlabeled (testing) data are predicted. In conclusion, this paper proposes a model with semantic triplet based features for story classification. The effectiveness of the model is demonstrated against other traditional features used in the literature for text classification tasks.

Contributors: Karad, Ravi Chandravadan (Author) / Davulcu, Hasan (Thesis advisor) / Corman, Steven (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created: 2013
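
The ten-fold cross-validation scheme mentioned in the abstract above (train on 90% of the data, test on the remaining 10%, repeat for each fold) can be sketched as follows. This is a minimal illustration, not the thesis's code: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled for simplicity, and the SVM training step itself is left out.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Each sample appears in exactly one test fold. Folds are contiguous and
    unshuffled here (an assumption made to keep the sketch short).
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Usage: with 100 documents, every round trains on 90 and tests on 10.
for train_idx, test_idx in k_fold_indices(100, k=10):
    pass  # fit a classifier on train_idx, then predict classes for test_idx
```

In practice the documents would usually be shuffled (or stratified by class) before splitting, so that each fold reflects the overall class balance.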

Computational methods for knowledge integration in the analysis of large-scale biological networks

Description

As we migrate into an era of personalized medicine, understanding how bio-molecules interact with one another to form cellular systems is one of the key focus areas of systems biology. Several challenges such as the dynamic nature of cellular systems, uncertainty due to environmental influences, and the heterogeneity between individual patients render this a difficult task. In the last decade, several algorithms have been proposed to elucidate cellular systems from data, resulting in numerous data-driven hypotheses. However, due to the large number of variables involved in the process, many of which are unknown or not measurable, such computational approaches often lead to a high proportion of false positives. This renders interpretation of the data-driven hypotheses extremely difficult. Consequently, a dismal proportion of these hypotheses are subject to further experimental validation, eventually limiting their potential to augment existing biological knowledge. This dissertation develops a framework of computational methods for the analysis of such data-driven hypotheses leveraging existing biological knowledge. Specifically, I show how biological knowledge can be mapped onto these hypotheses and subsequently augmented through novel hypotheses. Biological hypotheses are learnt in three levels of abstraction -- individual interactions, functional modules and relationships between pathways, corresponding to three complementary aspects of biological systems. The computational methods developed in this dissertation are applied to high throughput cancer data, resulting in novel hypotheses with potentially significant biological impact.

Contributors: Ramesh, Archana (Author) / Kim, Seungchan (Thesis advisor) / Langley, Patrick W (Committee member) / Baral, Chitta (Committee member) / Kiefer, Jeffrey (Committee member) / Arizona State University (Publisher)

Created: 2012

Temperature Dependence of PV Fault Detection Neural Networks

Description

This study measures the effect of temperature on a neural network's ability to detect and classify solar panel faults. It is well known that temperature negatively affects the power output of solar panels. This has consequences for their output data and for our ability to distinguish between conditions via machine learning.

Contributors: Verch, Skyler (Author) / Spanias, Andreas (Thesis director) / Tepedelenlioğlu, Cihan (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created: 2022-12

LEARNING FREE ENERGY PATHWAYS THROUGH DEEP LEARNING

Description

The focus of my honors thesis is to find ways to use deep learning in tandem with tools from statistical mechanics to derive new ways to solve problems in biophysics. More specifically, I’ve been interested in finding transition pathways between two known states of a biomolecule. This is because understanding the mechanisms by which proteins fold and ligands bind is crucial to creating new medicines and understanding biological processes. In this thesis, I work with individuals in the Singharoy lab to develop a formulation that utilizes reinforcement learning and sampling-based robotics planning to derive low free energy transition pathways between two known states. Our formulation uses Jarzynski’s equality and the stiff-spring approximation to obtain point estimates of energy and construct an informed path search with atomistic resolution. At the core of this framework is our first-ever attempt to use policy-driven adaptive steered molecular dynamics (SMD) to control our molecular dynamics simulations. We show that both the reinforcement learning (RL) and robotics planning realizations of the RL-guided framework can solve for pathways on toy analytical surfaces and alanine dipeptide.

Contributors: Ho, Nicholas (Author) / Maciejewski, Ross (Thesis director) / Singharoy, Abhishek (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-12
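
The abstract above relies on Jarzynski's equality, which relates the free energy difference between two states to the work W measured over repeated non-equilibrium pulls: ΔF = -kT ln ⟨exp(-W / kT)⟩. The sketch below is one way to evaluate that average from a list of work values. The function name, the input format, and the log-sum-exp rearrangement are assumptions made for this example, and it does not reproduce the thesis's stiff-spring or RL-guided machinery.

```python
import math


def jarzynski_free_energy(works, kT):
    """Estimate a free energy difference from non-equilibrium work values.

    Implements dF = -kT * ln( mean( exp(-W / kT) ) ).
    `works` and `kT` must share the same energy units (caller's assumption).
    The largest exponent is factored out before exponentiating so that
    large work values do not underflow to zero.
    """
    if not works:
        raise ValueError("at least one work value is required")
    exponents = [-w / kT for w in works]
    largest = max(exponents)
    mean_exp = sum(math.exp(e - largest) for e in exponents) / len(exponents)
    return -kT * (largest + math.log(mean_exp))


# Usage: the estimate never exceeds the average work, because the
# exponential average is dominated by the low-work trajectories.
estimate = jarzynski_free_energy([1.0, 3.0], kT=1.0)
```

Because rare low-work trajectories dominate the average, the estimate tends to be noisy when only a few pulls are available.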

Predict NFL Players Points for Fantasy Football

Description

For my Honors Thesis, I decided to create an Artificial Intelligence project to predict the fantasy football points of NFL players and team defenses. I created a TensorFlow Keras AI regression model, a Flask API that holds the AI model, and a Django Try-It page for the user to use the model. These services are hosted on ASU's AWS service. The Flask API actively gathers data from Pro-Football-Reference, then calculates the fantasy points. For example, if the current year is 2022, the model analyzes each player, trains on all available data from 2000 to 2020, tests on the 2021 data, and predicts for the 2022 season. The Django website asks the user to input the current year; when the user clicks the submit button, the AI model runs the process explained earlier. Next, the user enters the player's name for the point prediction, and the website displays the last five rows, with four being the previous fantasy points and the fifth row being the prediction.

Contributors: Panikulam, Caleb (Author) / De Luca, Gennaro (Thesis director) / Chen, Yinong (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-12
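
The year-based split described in the abstract above (for a current year of 2022: train on 2000 to 2020, test on 2021, predict 2022) can be sketched as below. The row format (dictionaries with a `year` key), the function name, and the sample values are hypothetical, and the data gathering and Keras model from the project are not reproduced.

```python
def split_by_season(rows, current_year, first_year=2000):
    """Split per-season rows into training and test sets by year.

    Training uses first_year .. current_year - 2, testing uses
    current_year - 1, and current_year itself is the season to predict.
    `rows` is a list of dicts with a 'year' key (hypothetical schema).
    """
    train = [r for r in rows if first_year <= r["year"] <= current_year - 2]
    test = [r for r in rows if r["year"] == current_year - 1]
    return train, test


# Usage with made-up rows: 2019 and 2020 go to training, 2021 to testing.
rows = [
    {"player": "A", "year": 2019, "points": 10.0},
    {"player": "A", "year": 2020, "points": 12.0},
    {"player": "A", "year": 2021, "points": 11.0},
]
train_rows, test_rows = split_by_season(rows, current_year=2022)
```

Splitting by season rather than at random keeps later seasons out of the training set, so the test year behaves like genuinely unseen data.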

The Unbounded Magic of Machine Learning

Description

With the increasing presence and importance of machine learning, artificial intelligence, and big data in our daily lives, there comes the necessity to re-evaluate how magical, enchanted lines of thinking may or may not survive alongside the turn of the century. There exists a set of connections between magic and the aforementioned field of technology, in that this specific field has the potential to become sufficiently advanced and complex as to cause unpredictable problems down the line. This discussion will explore several different topics ranging from the comparisons between magic and technology to the dangers of these systems being “black box” and rather ambiguous in how they turn data input into prediction output, all central to the idea that this increasingly tech-focused world should be thought about in a magical and re-enchanted way, especially as legislation is drafted up and decided upon that can determine how these impressive new technologies will be regulated going forward.

Contributors: Rodi, Michael (Author) / Ostling, Michael (Thesis director) / Blanco, Eduardo (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05

Elliptic Fourier Features for Robustness to Rotations and Translations in Neural Networks

Description

In image classification tasks, images are often corrupted by spatial transformations like translations and rotations. In this work, I utilize an existing method that uses the Fourier series expansion to generate a rotation and translation invariant representation of closed contours found in sketches, aiming to attenuate the effects of distribution shift caused by the aforementioned transformations. I use this technique to transform input images into one of two different invariant representations, a Fourier series representation and a corrected raster image representation, prior to passing them to a neural network for classification. The architectures used include convolutional neural networks (CNNs), multi-layer perceptrons (MLPs), and graph neural networks (GNNs). I compare the performance of this method to using data augmentation during training, the standard approach for addressing distribution shift, to see which strategy yields the best performance when evaluated against a test set with rotations and translations applied. I include experiments where the augmentations applied during training both do and do not accurately reflect the transformations encountered at test time. Additionally, I investigate the robustness of both approaches to high-frequency noise. In each experiment, I also compare training efficiency across models. I conduct experiments on three datasets: the MNIST handwritten digit dataset, a custom dataset (QD-3) consisting of three classes of geometric figures from the Quick, Draw! hand-drawn sketch dataset, and another custom dataset (QD-345) featuring sketches from all 345 classes found in Quick, Draw!. On the smaller problem space of MNIST and QD-3, the networks utilizing the Fourier-based technique to attenuate distribution shift perform competitively with the standard data augmentation strategy. On the more complex problem space of QD-345, the networks using the Fourier technique do not achieve the same test performance as correctly-applied data augmentation. However, they still outperform instances where train-time augmentations mis-predict test-time transformations, and outperform a naive baseline model where no strategy is used to attenuate distribution shift. Overall, this work provides evidence that strategies which attempt to directly mitigate distribution shift, rather than simply increasing the diversity of the training data, can be successful when certain conditions hold.

Contributors: Watson, Matthew (Author) / Yang, Yezhou YY (Thesis advisor) / Kerner, Hannah HK (Committee member) / Yang, Yingzhen YY (Committee member) / Arizona State University (Publisher)

Created: 2023
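
To illustrate the general idea in the abstract above of a Fourier representation of a closed contour that ignores translation and rotation, here is a small sketch. It deliberately uses the simpler complex Fourier descriptor rather than the elliptic Fourier method named in the thesis title, and it keeps only coefficient magnitudes, which discards phase information. The function name and the choice of four coefficients are assumptions made for this example.

```python
import cmath
import math


def fourier_descriptor(points, n_coeffs=4):
    """Magnitudes of low-order Fourier coefficients of a closed contour.

    `points` is a list of (x, y) samples taken in order around the contour.
    Each point is treated as the complex number x + iy. Skipping the k = 0
    coefficient removes the dependence on position (translation), and taking
    magnitudes removes the dependence on rotation.
    """
    n = len(points)
    if n_coeffs >= n:
        raise ValueError("n_coeffs must be smaller than the number of points")
    z = [complex(x, y) for x, y in points]
    features = []
    for k in range(1, n_coeffs + 1):
        c_k = sum(z[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n)) / n
        features.append(abs(c_k))
    return features


# Usage: a contour and a rotated, shifted copy give (numerically) equal features.
contour = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 2.0), (0.0, 1.0)]
features = fourier_descriptor(contour)
```

A translation changes only the k = 0 coefficient and a rotation multiplies every coefficient by the same unit-magnitude factor, which is why the retained magnitudes stay fixed under both.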

The Reliability of Predictive Models in Esports -- Using Methods of Linear Algebra and Machine Learning

Description

This project is centered around a decade-old video game called League of Legends, which is one of the most popular video games in esports. Because it is a complex team-based strategy game, intuitive human predictions of the game’s outcome are relatively unreliable. Many approaches have been adopted to assist intuitive human predictions in traditional team-based sports, such as the Least Squares Method and various supervised machine learning algorithms. These methods have significantly outperformed human predictions. The objective of this research is, hence, to test whether the predictive models generated using these methods can achieve a similar level of reliability in a more complex game like League of Legends.

Contributors: Wang, Jiahao (Author) / Zandieh, Michelle (Thesis director) / Lee, Inyoung (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / College of Integrative Sciences and Arts (Contributor)

Created: 2023-12

Measuring the use of dynamic circuits on performance metrics of Quantum Neural Networks

Description

The goal of this project is to measure the effects of the use of dynamic circuit technology within quantum neural networks. Quantum neural networks are a type of neural network that utilizes quantum encoding and manipulation techniques to learn to solve a problem using quantum or classical data. In their current form these neural networks are linear in nature, not allowing for alternative execution paths, but using dynamic circuits they can be made nonlinear and can execute different paths. We measured the effects of these dynamic circuits on the training time, accuracy, and effective dimension of the quantum neural network across multiple trials to see the impacts of the nonlinear behavior.

Contributors: Lynch, Brian (Author) / De Luca, Gennaro (Thesis director) / Chen, Yinong (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2023-12
