Matching Items (12)

Description
Bridging the semantic gap is one of the fundamental problems in multimedia computing and pattern recognition. The challenge of associating low-level signals with their high-level semantic interpretation arises mainly because semantics are often conveyed implicitly in a context, relying on interactions among multiple levels of concepts or low-level data entities. In addition, domain knowledge is often indispensable for uncovering the underlying semantics, but in most cases it is not readily available from the acquired media streams. Thus, making use of various types of contextual information and leveraging the corresponding domain knowledge are vital for accurately associating high-level semantics with low-level signals in multimedia computing problems. In this work, novel computational methods are explored and developed for incorporating contextual information and domain knowledge in different forms for multimedia computing and pattern recognition problems. Specifically, a novel Bayesian approach with statistical-sampling-based inference is proposed for incorporating a special type of domain knowledge, a spatial prior for the underlying shapes; cross-modality correlations are explored via Kernel Canonical Correlation Analysis, and the learnt space is then used for associating multimedia contents in different forms; and contextual information modeled as a graph is leveraged for regulating interactions among high-level semantic concepts (e.g., category labels) and low-level input signals (e.g., spatial/temporal structure). Four real-world applications, including visual-to-tactile face conversion, photo tag recommendation, wild web video classification and unconstrained consumer video summarization, are selected to demonstrate the effectiveness of the approaches. These applications range from classic research challenges to emerging tasks in multimedia computing. Results from experiments on large-scale real-world data, with comparisons to other state-of-the-art methods and subjective evaluations with end users, confirmed that the developed approaches exhibit salient advantages, suggesting that they are promising for leveraging contextual information and domain knowledge for a wide range of multimedia computing and pattern recognition problems.
Contributors: Wang, Zhesheng (Author) / Li, Baoxin (Thesis advisor) / Sundaram, Hari (Committee member) / Qian, Gang (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Text classification, in the artificial intelligence domain, is an activity in which text documents are automatically classified into predefined categories using machine learning techniques. An example is classifying uncategorized news articles into predefined categories such as "Business", "Politics", "Education", "Technology", etc. In this thesis, a supervised machine learning approach is followed, in which a module is first trained with pre-classified training data and the class of the test data is then predicted. Good feature extraction is an important step in the machine learning approach, and hence the main component of this text classifier is semantic triplet based features, in addition to traditional features such as standard keyword based features and statistical features based on shallow parsing (such as the density of POS tags and named entities). A triplet {Subject, Verb, Object} in a sentence is defined as a relation between subject and object, the relation being the predicate (verb). Triplet extraction is a five-step process whose input corpus consists of web text documents, each containing one or more paragraphs, drawn from sources ranging from RSS feeds to lists of extremist websites. The input corpus feeds into the "Pronoun Resolution" step, which uses a heuristic approach to identify the noun phrases referenced by the pronouns. The next step, "SRL Parser", is a shallow semantic parser that converts the incoming pronoun-resolved paragraphs into an annotated predicate-argument format. The output of the SRL parser is processed by the "Triplet Extractor" algorithm, which forms triplets of the form {Subject, Verb, Object}. Generalization and reduction of triplet features is the next step. The reduced feature representation reduces computing time, yields better discriminatory behavior and handles the curse of dimensionality. For training and testing, a ten-fold cross-validation approach is followed. In each round, an SVM classifier is trained with 90% of the labeled (training) data and, in the testing phase, the classes of the remaining 10% of unlabeled (testing) data are predicted. In conclusion, this paper proposes a model with semantic triplet based features for story classification. The effectiveness of the model is demonstrated against other traditional features used in the literature for text classification tasks.
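The ten-fold protocol described in this abstract can be illustrated with a minimal sketch. It is not the thesis's code: the function name `ten_fold_splits`, the fixed seed and the choice of 100 items are assumptions made for illustration, and the classifier itself is left out.

```python
import random

def ten_fold_splits(n_items, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    # Striding through the shuffled list gives folds of (near-)equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx

# Hypothetical corpus of 100 labeled documents: each round trains on 90, tests on 10.
splits = list(ten_fold_splits(100))
```

In each round a classifier would be fitted on `train_idx` and scored on `test_idx`, and the ten scores averaged.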
Contributors: Karad, Ravi Chandravadan (Author) / Davulcu, Hasan (Thesis advisor) / Corman, Steven (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
In the current millennium, extensive use of computers and the internet caused an exponential increase in information. Few research areas are as important as information extraction, which primarily involves extracting concepts and the relations between them from free text. Limitations in the size of training data, lack of lexicons and lack of relationship patterns are major factors for poor performance in information extraction. This is because the training data cannot possibly contain all concepts and their synonyms; and it contains only limited examples of relationship patterns between concepts. Creating training data, lexicons and relationship patterns is expensive, especially in the biomedical domain (including clinical notes) because of the depth of domain knowledge required of the curators. Dictionary-based approaches for concept extraction in this domain are not sufficient to effectively overcome the complexities that arise because of the descriptive nature of human languages. For example, there is a relatively higher amount of abbreviations (not all of them present in lexicons) compared to everyday English text. Sometimes abbreviations are modifiers of an adjective (e.g. CD4-negative) rather than nouns (and hence, not usually considered named entities). There are many chemical names with numbers, commas, hyphens and parentheses (e.g. t(3;3)(q21;q26)), which will be separated by most tokenizers. In addition, partial words are used in place of full words (e.g. up- and downregulate); and some of the words used are highly specialized for the domain. Clinical notes contain peculiar drug names, anatomical nomenclature, other specialized names and phrases that are not standard in everyday English or in published articles (e.g. "l shoulder inj"). State of the art concept extraction systems use machine learning algorithms to overcome some of these challenges. However, they need a large annotated corpus for every concept class that needs to be extracted. 
A novel natural language processing approach to minimize this limitation in concept extraction is proposed here using distributional semantics. Distributional semantics is an emerging field arising from the notion that the meaning or semantics of a piece of text (discourse) depends on the distribution of the elements of that discourse in relation to its surroundings. Distributional information from large unlabeled data is used to automatically create lexicons for the concepts to be tagged, clusters of contextually similar words, and thesauri of distributionally similar words. These automatically generated lexical resources are shown here to be more useful than manually created lexicons for extracting concepts from both literature and narratives. Further, machine learning features based on distributional semantics are shown to improve the accuracy of BANNER, and could be used in other machine learning systems such as cTakes to improve their performance. In addition, in order to simplify the sentence patterns and facilitate association extraction, a new algorithm using a "shotgun" approach is proposed. The goal of sentence simplification has traditionally been to reduce the grammatical complexity of sentences while retaining the relevant information content and meaning to enable better readability for humans and enhanced processing by parsers. Sentence simplification is shown here to improve the performance of association extraction systems for both biomedical literature and clinical notes. It helps improve the accuracy of protein-protein interaction extraction from the literature and also improves relationship extraction from clinical notes (such as between medical problems, tests and treatments). Overall, the two main contributions of this work include the application of sentence simplification to association extraction as described above, and the use of distributional semantics for concept extraction. 
The proposed work on concept extraction amalgamates for the first time two diverse research areas: distributional semantics and information extraction. This approach offers all the advantages of other semi-supervised machine learning systems and, unlike other proposed semi-supervised approaches, it can be used on top of different basic frameworks and algorithms.
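The core idea of distributional semantics, that words occurring in similar contexts tend to have similar meanings, can be sketched in a few lines. This is only an illustration under stated assumptions: the three-sentence corpus is invented, the window size of 2 is arbitrary, and the helper names `context_vectors` and `cosine` are hypothetical rather than taken from the thesis or from BANNER.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented toy corpus; a real system would use large unlabeled text.
corpus = [
    "the patient received aspirin for pain".split(),
    "the patient received ibuprofen for pain".split(),
    "the nurse measured blood pressure today".split(),
]
vecs = context_vectors(corpus)
drug_similarity = cosine(vecs["aspirin"], vecs["ibuprofen"])
unrelated_similarity = cosine(vecs["aspirin"], vecs["nurse"])
```

Here the two drug names share all of their contexts, so they score higher than the unrelated pair; similarity scores of this kind are what would feed automatically built lexicons or word clusters.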
Contributors: Jonnalagadda, Siddhartha Reddy (Author) / Gonzalez, Graciela H (Thesis advisor) / Cohen, Trevor A (Committee member) / Greenes, Robert A (Committee member) / Fridsma, Douglas B (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of complex biological processes and accurate life science models. The ability to rapidly and effectively survey the literature is necessary for the creation of large-scale models of the relationships among biomedical entities, as well as for hypothesis generation to guide biomedical research. To reduce the effort and time spent in performing these activities, an intelligent search system is required. Even though many systems aid in navigating through this wide collection of documents, the vastness and depth of this information overload can be overwhelming. An automated extraction system coupled with a cognitive search and navigation service over these document collections would not only save time and effort, but also facilitate discovery of unknown information implicitly conveyed in the texts. This thesis presents the different approaches used for large-scale biomedical named entity recognition, and the challenges faced in each. It also proposes BioEve, an integrative framework that fuses faceted search with information extraction to provide a search service that addresses the user's desire for "completeness" of the query results, not just the top-ranked ones. This information extraction system enables discovery of important semantic relationships between entities such as genes, diseases, drugs, and cell lines and events from biomedical text on MEDLINE, which is the largest publicly available database of the world's biomedical journal literature. It is an innovative search and discovery service that makes it easier to search, navigate and discover knowledge hidden in life sciences literature. To demonstrate the utility of this system, this thesis also details a prototype enterprise-quality search and discovery service that helps researchers with guided, step-by-step query refinement by suggesting concepts enriched in intermediate results, thereby facilitating the "discover more as you search" paradigm.
Contributors: Kanwar, Pradeep (Author) / Davulcu, Hasan (Thesis advisor) / Dinu, Valentin (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2010
Description
Text mining of biomedical literature and clinical notes is a very active field of research in biomedical science. Semantic analysis is one of the core modules of many Natural Language Processing (NLP) solutions. Methods for calculating the semantic relatedness of two concepts can be very useful in solving problems such as relationship extraction, ontology creation and question answering [1–6]. Several techniques exist for calculating the semantic relatedness of two concepts, and these techniques utilize different knowledge sources and corpora. So far, researchers have attempted to find the best hybrid method for each domain by combining semantic relatedness techniques and data sources manually. This work attempts to eliminate the need to manually combine semantic relatedness methods for each new context or resource by proposing an automated method that searches for the combination of semantic relatedness techniques and resources that achieves the best semantic relatedness score in every context. This may help the research community find the best hybrid method for each context, given the available algorithms and resources.
Contributors: Emadzadeh, Ehsan (Author) / Gonzalez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Scotch, Matthew (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
In this thesis, multiple approaches are explored to enhance sentiment analysis of tweets. A standard sentiment analysis model with customized features is first trained and tested to establish a baseline. This is compared to an existing topic based mixture model and a newly proposed topic based vector model, both of which use Latent Dirichlet Allocation (LDA) for topic modeling. The proposed topic based vector model achieves higher accuracy, in terms of averaged F scores, than the other two models.
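An averaged F score of the kind mentioned in this abstract can be computed as sketched below. The abstract does not say which averaging scheme was used; this sketch assumes macro-averaging (an unweighted mean of per-class F1), and the function names and toy labels are illustrative only.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)

# Invented toy labels for four tweets.
gold = ["pos", "pos", "neg", "neg"]
predicted = ["pos", "neg", "neg", "neg"]
score = macro_f1(gold, predicted)
```

Macro-averaging weights every sentiment class equally, which matters when classes are imbalanced; a micro- or support-weighted average would be a different, equally plausible reading of the abstract.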
Contributors: Baskaran, Swetha (Author) / Davulcu, Hasan (Thesis advisor) / Sen, Arunabha (Committee member) / Hsiao, Ihan (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
The rapid growth of social media in recent years provides a large amount of user-generated visual objects, e.g., images and videos. Advanced semantic understanding approaches on such visual objects are desired to better serve applications such as human-machine interaction, image retrieval, etc. Semantic visual attributes have been proposed and utilized in multiple visual computing tasks to bridge the so-called "semantic gap" between extractable low-level feature representations and high-level semantic understanding of the visual objects.

Despite years of research, some problems in semantic attribute learning remain unsolved. First, real-world applications usually involve hundreds of attributes, which requires great effort to acquire a sufficient amount of labeled data for model learning. Second, existing attribute learning work for visual objects focuses primarily on images, with semantic analysis of videos left largely unexplored.

In this dissertation I conduct innovative research and propose novel approaches to tackling the aforementioned problems. In particular, I propose robust and accurate learning frameworks on both attribute ranking and prediction by exploring the correlation among multiple attributes and utilizing various types of label information. Furthermore, I propose a video-based skill coaching framework by extending attribute learning to the video domain for robust motion skill analysis. Experiments on various types of applications and datasets and comparisons with multiple state-of-the-art baseline approaches confirm that my proposed approaches can achieve significant performance improvements for the general attribute learning problem.
Contributors: Chen, Lin (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Wang, Yalin (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Many neurological disorders, especially those that result in dementia, impact speech and language production. A number of studies have shown that there exist subtle changes in linguistic complexity in these individuals that precede disease onset. However, these studies are conducted on controlled speech samples from a specific task. This thesis explores the possibility of using natural language processing to detect declining linguistic complexity from more natural discourse. We use existing data from public figures suspected (or at risk) of suffering from cognitive-linguistic decline, downloaded from the Internet, to detect changes in linguistic complexity. In particular, we focus on two case studies. The first case study analyzes President Ronald Reagan’s transcribed spontaneous speech samples during his presidency. President Reagan was diagnosed with Alzheimer’s disease in 1994; however, my results showed declining linguistic complexity during the span of the 8 years he was in office. President George Herbert Walker Bush, who has no known diagnosis of Alzheimer’s disease, shows no decline in the same measures. In the second case study, we analyze transcribed spontaneous speech samples from the news conferences of 10 current NFL players and 18 non-player personnel since 2007. The non-player personnel have never played professional football. Longitudinal analysis of linguistic complexity showed contrasting patterns in the two groups. The majority (6 of 10) of current players showed a decline in at least one measure of linguistic complexity over time. In contrast, the majority (11 of 18) of non-player personnel showed an increase in at least one linguistic complexity measure.
Contributors: Wang, Shuai (Author) / Berisha, Visar (Thesis advisor) / LaCross, Amy (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
High-level inference tasks in video applications such as recognition, video retrieval, and zero-shot classification have become an active research area in recent years. One fundamental requirement for such applications is to extract high-quality features that maintain high-level information in the videos.

Many video feature extraction algorithms have been proposed, such as STIP, HOG3D, and Dense Trajectories. These algorithms are often referred to as “handcrafted” features, as they were deliberately designed based on some reasonable considerations. However, these algorithms may fail when dealing with high-level tasks or complex scene videos. Due to the success of deep convolutional neural networks (CNNs) in extracting global representations for static images, researchers have been using similar techniques to tackle video content. Typical techniques first extract spatial features by processing raw images using deep convolutional architectures designed for static image classification. Then simple averaging, concatenation or classifier-based fusion/pooling methods are applied to the extracted features. I argue that features extracted in such ways do not capture enough representative information, since videos, unlike images, should be characterized as a temporal sequence of semantically coherent visual content and thus need to be represented in a manner that considers both semantic and spatio-temporal information.

In this thesis, I propose a novel architecture to learn a semantic spatio-temporal embedding for videos to support high-level video analysis. The proposed method encodes video spatial and temporal information separately by employing a deep architecture consisting of two channels of convolutional neural networks (capturing appearance and local motion), followed by their corresponding Fully Connected Gated Recurrent Unit (FC-GRU) encoders for capturing the longer-term temporal structure of the CNN features. The resultant spatio-temporal representation (a vector) is used to learn a mapping via a Fully Connected Multilayer Perceptron (FC-MLP) to the word2vec semantic embedding space, leading to a semantic interpretation of the video vector that supports high-level analysis. I evaluate the usefulness and effectiveness of this new video representation by conducting experiments on action recognition, zero-shot video classification, and semantic (word-to-video) video retrieval, using the UCF101 action recognition dataset.
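Once a video has been mapped into a word-embedding space, zero-shot classification can be done by picking the class name whose embedding is closest to the video's. The sketch below shows only that final nearest-neighbor step, under stated assumptions: the 3-dimensional vectors are invented stand-ins for word2vec embeddings, the class names are examples, and `zero_shot_label` is a hypothetical helper, not the thesis's implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def zero_shot_label(video_embedding, label_embeddings):
    """Return the label whose embedding is most similar to the video embedding."""
    return max(label_embeddings,
               key=lambda name: cosine(video_embedding, label_embeddings[name]))

# Invented toy vectors standing in for word embeddings of unseen class names.
label_embeddings = {
    "archery": [0.9, 0.1, 0.0],
    "surfing": [0.0, 0.2, 0.9],
}
# Stand-in for the output of the learned video-to-embedding mapping.
video_embedding = [0.1, 0.3, 0.8]
predicted = zero_shot_label(video_embedding, label_embeddings)
```

Because the comparison happens in the shared semantic space, labels never seen during training can still be ranked, and the same similarity could rank videos against a query word for retrieval.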
Contributors: Hu, Sheng-Hung (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Liang, Jianming (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
With the advent of Massive Open Online Courses (MOOCs), educators have the opportunity to collect data from students and use it to derive insightful information about them. Specifically, for programming-based courses, the ability to identify the specific areas or topics that need more attention from the students can be of immense help. But the majority of traditional, non-virtual classes lack the ability to uncover such information, which could serve as feedback on the effectiveness of teaching. In the majority of schools, paper exams and assignments provide the only form of assessment to measure the success of the students in achieving the course objectives. The overall grade obtained in paper exams and assignments need not present a complete picture of a student’s strengths and weaknesses. In part, this can be addressed by incorporating research-based technology into the classrooms to obtain real-time updates on students' progress. But introducing technology to provide real-time, class-wide engagement involves a considerable investment both academically and financially. This prevents the adoption of such technology, and thereby the realization of ideal, technology-enabled classrooms. With increasing class sizes, it is becoming impossible for teachers to keep persistent track of their students' progress and to provide personalized feedback. What if we can provide technology support without adding more burden to the existing pedagogical approach? How can we enable semantic enrichment of exams that can translate to students' understanding of the topics taught in the class? Can we provide feedback to students that goes beyond numbers alone and reveals areas that need their focus? In this research I focus on bringing the capability of conducting insightful analysis to paper exams with a less intrusive learning analytics approach that taps into generic classrooms with minimal technology introduction. Specifically, the work focuses on automatic indexing of programming exam questions with ontological semantics. The thesis also focuses on designing and evaluating a novel semantic visual analytics suite for in-depth course monitoring. By visualizing the semantic information to illustrate the areas that need a student’s focus and enabling teachers to visualize class-level progress, the system provides richer feedback to both sides for improvement.
Contributors: Pandhalkudi Govindarajan, Sesha Kumar (Author) / Hsiao, I-Han (Thesis advisor) / Nelson, Brian (Committee member) / Walker, Erin (Committee member) / Arizona State University (Publisher)
Created: 2016