ASU Electronic Theses and Dissertations
This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.
In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.
Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.
Filtering by
- Creators: Bansal, Ajay
This thesis presents a modified traversal algorithm on dependency parse output of text to extract all subject predicate object pairs from text while ensuring that no information is missed out. To support full scale, all-purpose information extraction from large text corpuses, a data preprocessing pipeline is recommended to be used before the extraction is run. The output format is designed specifically to fit on a node-edge-node model and form the building blocks of a network which makes understanding of the text and querying of information from corpus quick and intuitive. It attempts to reduce reading time and enhancing understanding of the text using interactive graph and timeline.
Question Answering systems have been around for quite some time and are a sub-field of information retrieval and natural language processing. The task of any Question Answering system is to seek an answer to a free form factual question. The difficulty of pinpointing and verifying the precise answer makes question answering more challenging than simple information retrieval done by search engines. Text REtrieval Conference (TREC) is a yearly conference which provides large - scale infrastructure and resources to support research in information retrieval domain. TREC has a question answering track since 1999 where the questions dataset contains a list of factual questions (Vorhees & Tice, 1999). DBpedia (Bizer et al., 2009) is a community driven effort to extract and structure the data present in Wikipedia.
The research objective of this thesis is to develop a novel approach to Question Answering based on a composition of conventional approaches of Information Retrieval and Natural Language processing. The focus is also on exploring the use of a structured and annotated knowledge base as opposed to an unstructured knowledge base. The knowledge base used here is DBpedia and the final system is evaluated on the TREC 2004 questions dataset.
especially in the field of education and research. Although many MDC applications are
available, almost all of them are tailor-made for a very specific task in a very specific
field (i.e. health, traffic, weather forecasts, …etc.). Since the main users of these apps are
researchers, physicians or generally data collectors, it can be extremely challenging for
them to make adjustments or modifications to these applications given that they have
limited or no technical background in coding. Another common issue with MDC
applications is that its functionalities are limited only to data collection and storing. Other
functionalities such as data visualizations, data sharing, data synchronization and/or data updating are rarely found in MDC apps.
This thesis tries to solve the problems mentioned above by adding the following
two enhancements: (a) the ability for data collectors to customize their own applications
based on the project they’re working on, (b) and introducing new tools that would help
manage the collected data. This will be achieved by creating a Java standalone
application where data collectors can use to design their own mobile apps in a userfriendly Graphical User Interface (GUI). Once the app has been completely designed
using the Java tool, a new iOS mobile application would be automatically generated
based on the user’s input. By using this tool, researchers now are able to create mobile
applications that are completely tailored to their needs, in addition to enjoying new
features such as visualize and analyze data, synchronize data to the remote database,
share data with other data collectors and update existing data.
The main focus of this thesis is to use visual description of a landmark by choosing the most diverse pictures that best describe all the details of the queried location from community-contributed datasets. For this, an end-to-end framework has been built, to retrieve relevant results that are also diverse. Different retrieval re-ranking and diversification strategies are evaluated to find a balance between relevance and diversification. Clustering techniques are employed to improve divergence. A unique fusion approach has been adopted to overcome the dilemma of selecting an appropriate clustering technique and the corresponding parameters, given a set of data to be investigated. Extensive experiments have been conducted on the Flickr Div150Cred dataset that has 30 different landmark locations. The results obtained are promising when evaluated on metrics for relevance and diversification.