ASU Electronic Theses and Dissertations
This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.
In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.
Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.
Filtering by
- Genre: Masters Thesis
- Creators: Bansal, Ajay
This thesis evaluates the performance of Presto in processing big RDF data against Apache Hive. A comparative analysis was also conducted against 4store, a native RDF store. To evaluate the performance of Presto for big RDF data processing, a map-reduce program and a compiler, based on Flex and Bison, were implemented. The map-reduce program loads RDF data into HDFS, while the compiler translates SPARQL queries into a subset of SQL that Presto (and Hive) can understand. The evaluation was done on four- and eight-node Linux clusters installed on the Microsoft Windows Azure platform with RDF datasets of 10, 20, and 30 million triples. The results of the experiment show that Presto has much higher performance than Hive and can be used to process big RDF data. The thesis also proposes an architecture based on Presto, Presto-RDF, that can be used to process big RDF data.
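The idea of translating SPARQL into SQL can be illustrated with a minimal sketch. This is not the thesis's Flex/Bison compiler; it assumes, purely for illustration, that all triples live in one three-column table named `triples` (columns `subject`, `predicate`, `object`), and the helper name `bgp_to_sql` is hypothetical. Each triple pattern becomes one alias of the table, and a variable shared between patterns becomes a join condition. SQLite stands in for Presto or Hive only so that the example runs locally.

```python
import sqlite3


def bgp_to_sql(patterns, table="triples"):
    """Translate a list of (s, p, o) triple patterns into one SQL SELECT.

    Terms starting with '?' are variables; everything else is a constant.
    The single-table layout is an assumption, not the thesis's schema.
    """
    columns = ("subject", "predicate", "object")
    first_seen = {}   # variable name -> "tN.column" where it first appeared
    conditions = []   # WHERE clauses: constant filters and join conditions
    params = []       # values for the '?' placeholders, in clause order
    for i, pattern in enumerate(patterns):
        for col, term in zip(columns, pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: join on its first occurrence.
                    conditions.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
            else:
                conditions.append(f"{ref} = ?")
                params.append(term)
    select = ", ".join(f"{ref} AS {var[1:]}" for var, ref in first_seen.items()) or "*"
    tables = ", ".join(f"{table} AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {select} FROM {tables}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


# Tiny in-memory dataset to exercise the generated SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [("alice", "knows", "bob"), ("bob", "name", "Bob"), ("alice", "name", "Alice")],
)

# Roughly: SELECT ?x ?y ?n WHERE { ?x knows ?y . ?y name ?n }
sql, params = bgp_to_sql([("?x", "knows", "?y"), ("?y", "name", "?n")])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

A single triple table keeps the translation simple, but every additional pattern adds a self-join, which is one reason the choice of SQL engine matters for query performance on large datasets.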
To facilitate rapid, correct, efficient, and intuitive development of graph-based solutions, we propose a new programming language construct: the search statement. Given a supra-root node, a procedure which determines the children of a given parent node, and optional definitions of the fail-fast acceptance or rejection of a solution, the search statement can conduct a search over any graph or network. Structurally, this statement is modelled after the common switch statement and is put into a largely imperative/procedural context to allow for immediate and intuitive development by most programmers. The Go programming language has been used as a foundation and proof of concept for the search statement. A Go compiler is provided which implements this construct.
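The ingredients listed above (a root, a children procedure, and optional accept and reject tests) can be sketched as an ordinary function. This is only an approximation of the semantics in Python, not the proposed Go syntax, which the abstract does not show; the function name `search`, the depth-first order, and the subset-sum example are all assumptions made for illustration.

```python
def search(root, children, accept, reject=lambda node: False):
    """Depth-first search from `root`.

    children(node) returns the child nodes of a node, accept(node) says
    whether a node is a solution, and reject(node) prunes a node (and its
    whole subtree) early. Returns the first accepted node, or None.
    """
    stack = [root]
    while stack:
        node = stack.pop()
        if reject(node):
            continue  # fail fast: do not expand this node
        if accept(node):
            return node
        # Reverse so the first child is explored first.
        stack.extend(reversed(list(children(node))))
    return None


# Example: find a subset of positive numbers that sums to a target.
numbers = [3, 34, 4, 12, 5, 2]
target = 9


def subset_children(node):
    index, chosen, total = node
    if index == len(numbers):
        return []
    value = numbers[index]
    # Either include the next number or skip it.
    return [(index + 1, chosen + (value,), total + value), (index + 1, chosen, total)]


solution = search(
    root=(0, (), 0),
    children=subset_children,
    accept=lambda node: node[2] == target,
    reject=lambda node: node[2] > target,  # valid because all numbers are positive
)
print(solution[1] if solution else None)
```

The caller only describes the problem through the three procedures; the traversal bookkeeping lives in one place, which is the convenience the search statement aims to build into the language itself.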
The tool was developed following the incremental development process in order to quickly create a functional and testable tool. The incremental process also allowed for feedback from radio astronomers to help guide the project's development.
UVLabel provides both a functional product and a modifiable and scalable code base for radio astronomer developers. This gives astronomers studying various kinds of astronomical interferometric data labelling capabilities. The tool can then be used to improve their filtering methods, pursue machine learning solutions, and discover new trends. Finally, UVLabel will be open source to put customization, scalability, and adaptability in the hands of these researchers.
considered a difficult problem to be solved by computers. Image captioning involves not just detecting objects in images but also understanding the interactions between the objects so that they can be translated into relevant captions. Expertise in computer vision paired with natural language processing is therefore considered crucial for this purpose. The sequence-to-sequence modelling strategy of deep neural networks is the traditional approach to generating a sequential list of words which are combined to represent the image. But these models suffer from high variance: they do not generalize well beyond the training data.
The main focus of this thesis is to reduce the variance, which will help in generating better captions. To achieve this, ensemble learning techniques, which have a reputation for mitigating the high-variance problem in machine learning algorithms, have been explored. Three different ensemble techniques, namely k-fold ensemble, bootstrap aggregation ensemble, and boosting ensemble, have been evaluated in this thesis. For each of these techniques, three output combination approaches have been analyzed. Extensive experiments have been conducted on the Flickr8k dataset, which has a collection of 8000 images and 5 different captions for every image. The BLEU score, which is considered to be the standard metric for evaluating natural language processing (NLP) problems, is used to evaluate the predictions. Based on this metric, the analysis shows that ensemble learning performs significantly better and generates more meaningful captions than any of the individual models used.
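Two building blocks of a bootstrap aggregation ensemble can be sketched briefly. The abstract does not name its three output combination approaches, so the position-wise majority vote below is only one plausible choice, picked for illustration; the function names and the toy captions are likewise hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(items, rng):
    """Draw len(items) elements from items with replacement."""
    return [rng.choice(items) for _ in items]


def majority_vote(predictions):
    """Combine token sequences from several models by voting per position.

    zip() stops at the shortest sequence, so longer captions are truncated.
    Ties go to the token that appears first among the models.
    """
    combined = []
    for tokens in zip(*predictions):
        combined.append(Counter(tokens).most_common(1)[0][0])
    return combined


rng = random.Random(0)
training_ids = list(range(10))
# Each ensemble member would be trained on its own resampled training set.
resampled_sets = [bootstrap_sample(training_ids, rng) for _ in range(3)]

# Toy captions standing in for the outputs of three trained models.
captions = [
    ["a", "dog", "runs"],
    ["a", "cat", "runs"],
    ["a", "dog", "sits"],
]
print(majority_vote(captions))
```

Because each member sees a different resampled training set, their errors are partly independent, and combining their outputs tends to average those errors out; that is the variance reduction ensembles are used for.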
This thesis presents a modified traversal algorithm on the dependency parse output of text to extract all subject-predicate-object pairs from the text while ensuring that no information is missed. To support full-scale, all-purpose information extraction from large text corpora, a data preprocessing pipeline is recommended before the extraction is run. The output format is designed specifically to fit a node-edge-node model and form the building blocks of a network, which makes understanding the text and querying information from the corpus quick and intuitive. It attempts to reduce reading time and enhance understanding of the text using an interactive graph and timeline.
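A much-simplified sketch of the extraction step follows. It is not the modified traversal algorithm of the thesis, whose details the abstract does not give; it assumes a dependency parse already supplied as plain tuples, uses the common `nsubj` and `dobj`/`obj` relation labels, and the function name `extract_triples` is hypothetical.

```python
from collections import defaultdict


def extract_triples(tokens):
    """Extract (subject, predicate, object) triples from a dependency parse.

    tokens is a list of (index, text, head_index, dep_label) tuples; the
    root token uses head_index -1. A triple is emitted for every token
    that has both a subject child and an object child.
    """
    children = defaultdict(list)
    for index, text, head, dep in tokens:
        children[head].append((text, dep))

    triples = []
    for index, text, head, dep in tokens:
        subjects = [t for t, d in children[index] if d == "nsubj"]
        objects = [t for t, d in children[index] if d in ("dobj", "obj")]
        for subject in subjects:
            for obj in objects:
                # Node-edge-node form: subject and object are nodes,
                # the predicate labels the edge between them.
                triples.append((subject, text, obj))
    return triples


# Hand-written parse of the sentence "Presto processes data".
parse = [
    (0, "Presto", 1, "nsubj"),
    (1, "processes", -1, "ROOT"),
    (2, "data", 1, "dobj"),
]
print(extract_triples(parse))
```

A full extractor would also need to handle conjunctions, passive constructions, and multi-word phrases, which this sketch deliberately ignores.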
are constantly changing, and adapting to these changes in an academic curriculum can be challenging. Given a specific aspect of a domain, students can achieve various levels of proficiency. Considering the wide array of needs, diverse groups need customized course curricula. The need for an archetype to design a course focused on outcomes paved the way for Outcome-based Education (OBE). OBE focuses on the outcomes as opposed to the traditional way of following a process [23]. According to D. Clark, the major reason for the creation of Bloom’s taxonomy was to stimulate and inspire a higher quality of thinking in academia, incorporating not just basic fact-learning and application but also evaluation and analysis of the facts and their applications [7]. Instructional Module Development System (IMODS) is the culmination of both these models, Bloom’s Taxonomy and OBE. It is an open-source, web-based software tool developed on the principles of OBE and Bloom’s Taxonomy. It guides an instructor, step by step, through an outcomes-based process as they define the learning objectives and the content to be covered and develop an instruction and assessment plan. The tool also provides the user with a repository of techniques based on the choices they made regarding the level of learning while defining the objectives. This helps maintain alignment among all the components of the course design. The tool also generates documentation to support the course design and provides feedback when the course is lacking in certain aspects.
It is not enough to come up with a model that theoretically facilitates effective result-oriented course design. There should be facts, experiments and proof that the model achieves what it aims to achieve. Thus, this thesis has two research objectives: (i) design a feature for course design feedback and evaluate its effectiveness; (ii) evaluate the usefulness of a tool like IMODS on various aspects: (a) the effectiveness of the tool in educating instructors on OBE; (b) the effectiveness of the tool in providing appropriate and efficient pedagogy and assessment techniques; (c) the effectiveness of the tool in building the learning objectives; (d) the effectiveness of the tool in document generation; (e) the usability of the tool; (f) the effectiveness of OBE on course design and expected student outcomes. The thesis presents a detailed algorithm for course design feedback, its pseudocode, a description and proof of the correctness of the feature, the methods used to evaluate the tool, the evaluation experiments, and an analysis of the obtained results.