Structured sparse learning and its applications to biomedical and biological data

Description

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, and medical image processing. Through l1-norm-based regularization, structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real-world applications that can benefit from the group-structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting a unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with spatial information, sparse learning techniques can be used to construct effective representations of the expression images. In the third application, I present a new computational approach to annotate the developmental stage of Drosophila embryos in gene expression images. In addition, it provides a stage score that enables one to annotate each embryo more finely, so that embryos are divided into early and late periods of development within standard stage demarcations. Stage scores help us to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes.

Contributors: Yuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)

Created: 2013

Batch mode active learning for multimedia pattern recognition

Description

The rapid advance of technology and the widespread emergence of modern technological equipment have resulted in the generation of enormous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real-world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing the human labeling effort needed to induce classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real-world multimedia pattern recognition applications.
Four major contributions are proposed in this work: (i) a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question; (ii) a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions; (iii) batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality; and (iv) an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real-world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.

Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2013

Corporate mentors and undergraduate students: a qualitative study of the Advancing Women in Construction Mentorship Program

Description

In a conscious effort to combat the low enrollment of women in construction management, the Advancing Women in Construction mentorship program was created to retain women. A qualitative analysis, facilitated through a grounded theory approach, sought to understand whether the program was indeed successful and what value the students derived from the program and from participating in the mentoring process.

Contributors: Eicher, Matthew (Author) / Wilkinson, Christine Kajikawa (Thesis advisor) / Calleroz-White, Mistalene (Committee member) / Gibson, Jr., G. Edward (Committee member) / Arizona State University (Publisher)

Created: 2013

Teachers, texts, and transactions: towards a pedagogy for teaching literature

Description

A simple passion for reading compels many to enter the university literature classroom. What happens once they arrive may fuel that passion, or possibly destroy it. A romanticized relationship with literature proves to be an obstacle that hinders a deeper and richer engagement with texts. Primary research consisting of personal interviews, observations, and surveys forms the source of data for this dissertation project, which was designed to examine how literature teachers engage their students with texts, discussion, and assignments in the university setting. Traditionally text-centered and resolute, literature courses will need refashioning if they are to advance beyond erstwhile conventions. The goal of this study is to create space for a dialogue about the need for a pedagogy of literature.

Contributors: Sanchez, Shillana (Author) / Goggin, Maureen (Thesis advisor) / Tobin, Beth (Thesis advisor) / Rose, Shirley (Committee member) / Arizona State University (Publisher)

Created: 2013

Evaluation of a biofeedback intervention in college students diagnosed with autism spectrum disorders

Description

This study used exploratory data analysis (EDA) to examine the use of a biofeedback intervention in the treatment of anxiety for college students diagnosed with an Autism Spectrum Disorder (ASD) (n=10) and in a typical college population (n=37). The use of EDA allowed trends to emerge from the data and provided a foundation for future research in the areas of biofeedback and accommodations for college students with ASD. Comparing the first five weeks with the second five weeks of the 10-week study, both groups showed improvement in their control of heart rate variability, a physiological marker for anxiety used in biofeedback. The ASD group showed greater gains, more consistent gains, and less variability in raw scores than the typical group. EDA also revealed a pattern between participant attrition and a participant's biofeedback progress. Implications are discussed.

Contributors: Westlake, Garret (Author) / McCoy, Kathleen M. (Thesis advisor) / Brown, Jane T (Committee member) / DiGangi, Samuel A. (Committee member) / Caterino, Linda K (Committee member) / Arizona State University (Publisher)

Created: 2013

Exploring the impact of varying levels of augmented reality to teach probability and sampling with a mobile device

Description

Statistics is taught at every level of education, yet teachers often have to assume their students have no knowledge of statistics and start from scratch each time they set out to teach it. The motivation for this experimental study comes from interest in exploring educational applications of augmented reality (AR) delivered via mobile technology that could potentially provide rich, contextualized learning for understanding concepts related to statistics education. This study examined the effects of AR experiences for learning basic statistical concepts. Using a 3 x 2 research design, this study compared learning gains of 252 undergraduate and graduate students from a pre- and posttest given before and after interacting with one of three types of augmented reality experience: a high AR experience (interacting with three-dimensional images coupled with movement through a physical space), a low AR experience (interacting with three-dimensional images without movement), or no AR experience (two-dimensional images without movement). Two levels of collaboration (pairs and no pairs) were also included. Additionally, student perceptions toward collaboration opportunities and engagement were compared across the six treatment conditions. Other demographic information collected included the students' previous statistics experience, as well as their comfort level in using mobile devices. The moderating variables included prior knowledge (high, average, and low) as measured by the student's pretest score. Taking into account prior knowledge, students with low prior knowledge assigned to either the high or low AR experience had statistically significantly higher learning gains than those assigned to the no AR experience. On the other hand, the results showed no statistically significant difference between students assigned to work individually and those assigned to work in pairs.
Students assigned to both the high and low AR experiences perceived a statistically significantly higher level of engagement than their no AR counterparts. Students with low prior knowledge benefited the most from the high AR condition in learning gains. Overall, the AR application performed well in providing a hands-on experience working with statistical data. Further research on AR and its relationship to spatial cognition, situated learning, higher-order skill development, performance support, and other classroom applications for learning is still needed.

Contributors: Conley, Quincy (Author) / Atkinson, Robert K (Thesis advisor) / Nguyen, Frank (Committee member) / Nelson, Brian C (Committee member) / Arizona State University (Publisher)

Created: 2013

Advancing biomedical named entity recognition with multivariate feature selection and semantically motivated features

Description

Automating aspects of biocuration through biomedical information extraction could significantly impact biomedical research by enabling greater biocuration throughput and improving the feasibility of a wider scope. An important step in biomedical information extraction systems is named entity recognition (NER), where mentions of entities such as proteins and diseases are located within natural-language text and their semantic type is determined. This step is critical for later tasks in an information extraction pipeline, including normalization and relationship extraction. BANNER is a benchmark biomedical NER system using linear-chain conditional random fields and the rich feature set approach. A case study with BANNER locating genes and proteins in biomedical literature is described. The first corpus for disease NER adequate for use as training data is introduced, and employed in a case study of disease NER. The first corpus locating adverse drug reactions (ADRs) in user posts to a health-related social website is also described, and a system to locate and identify ADRs in social media text is created and evaluated. The rich feature set approach to creating NER feature sets is argued to be subject to diminishing returns, implying that additional improvements may require more sophisticated methods for creating the feature set. This motivates the first application of multivariate feature selection with filters and false discovery rate analysis to biomedical NER, resulting in a feature set at least 3 orders of magnitude smaller than the set created by the rich feature set approach. Finally, two novel approaches to NER by modeling the semantics of token sequences are introduced. The first method focuses on the sequence content by using language models to determine whether a sequence resembles entries in a lexicon of entity names or text from an unlabeled corpus more closely. 
The second method models the distributional semantics of token sequences, determining the similarity between a potential mention and the token sequences from the training data by analyzing the contexts where each sequence appears in a large unlabeled corpus. The second method is shown to improve the performance of BANNER on multiple data sets.

Contributors: Leaman, James Robert (Author) / Gonzalez, Graciela (Thesis advisor) / Baral, Chitta (Thesis advisor) / Cohen, Kevin B (Committee member) / Liu, Huan (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2013

Representing chemistry: how instructional use of symbolic, microscopic and macroscopic mode influences student conceptual understanding in chemistry

Description

Chemistry as a subject is difficult to learn and understand, due in part to the specific language used by practitioners in their professional and scientific communications. The language and ways of representing chemical interactions have been grouped into three modes of representation used by chemistry instructors, and ultimately by students in understanding the discipline. The first of these three modes of representation is the symbolic mode, which uses a standard set of rules for chemical nomenclature set out by the IUPAC. The second mode of representation is the microscopic mode, which depicts chemical compounds as discrete units made up of atoms and molecules, with a particular ratio of atoms to a molecule or formula unit. The third mode of representation is the macroscopic mode: what can be seen, experienced, or measured directly, like ice melting or a color change during a chemical reaction. Recent evidence suggests that chemistry instructors can assist their students in making the connections between the modes of representation by incorporating all three modes into their teaching and discussions, and overtly connecting the modes during instruction. In this research, chemistry teachers at the community college level were observed over the course of an entire semester to evaluate their instructional use of mode of representation. The students of these teachers were tested prior to and after a semester's worth of instruction, and changes in the basic chemistry conceptual knowledge of these students were compared. Additionally, a subset of the overall population that was pre- and post-tested was interviewed at length using demonstrations of chemical phenomena that students were asked to translate using all three modes of representation. Analysis of the instruction of three community college teachers shows there were significant differences among these teachers in their instructional use of mode of representation.
Additionally, the students of these three teachers had differential and statistically significant achievement over the course of the semester. This research supports the results of other similar studies and also provides some unexpected results from the students involved.

Contributors: Wood, Lorelei (Author) / Baker, Dale (Thesis advisor) / Ganesh, Tirupalavanam G. (Committee member) / Colleen, Megowan (Committee member) / Sujatha, Krishnaswamy (Committee member) / Arizona State University (Publisher)

Created: 2013

Biology faculty at large research institutions: the nature of their pedagogical content knowledge

Description

To address the need for scientists and engineers in the United States workforce and ensure that students in higher education become scientifically literate, research and policy have called for improvements in undergraduate education in the sciences. One particular pathway for improving undergraduate education in the science fields is to reform undergraduate teaching. Only a limited number of studies have explored the pedagogical content knowledge (PCK) of postsecondary level teachers. This study was conducted to characterize the PCK of biology faculty and explore the factors influencing their PCK. Data included semi-structured interviews, classroom observations, documents, and instructional artifacts. A qualitative inquiry was designed to conduct an in-depth investigation focusing on the PCK of six biology instructors, particularly the types of knowledge they used for teaching biology, their perceptions of teaching, and the social interactions and experiences that influenced their PCK. The findings of this study reveal that the PCK of the biology faculty included eight domains of knowledge: (1) content, (2) context, (3) learners and learning, (4) curriculum, (5) instructional strategies, (6) representations of biology, (7) assessment, and (8) building rapport with students. Three categories of faculty PCK emerged: (1) PCK as an expert explainer, (2) PCK as an instructional architect, and (3) a transitional PCK, which fell between the two prior categories. Based on the interpretations of the data, four social interactions and experiences were found to influence biology faculty PCK: (1) teaching experience, (2) models and mentors, (3) collaborations about teaching, and (4) science education research. The varying teaching perspectives of the faculty also influenced their PCK.
This study shows that the PCK of biology faculty for teaching large introductory courses at large research institutions is heavily influenced by factors beyond simply years of teaching experience and expert content knowledge. Social interactions and experiences created by the institution play a significant role in developing the PCK of biology faculty.

Contributors: Hill, Kathleen M. (Author) / Luft, Julie A. (Thesis advisor) / Baker, Dale (Committee member) / Orchinik, Miles (Committee member) / Arizona State University (Publisher)

Created: 2013

Building adaptive computational systems for physiological and biomedical data

Description

In recent years, machine learning and data mining technologies have received growing attention in several areas such as recommendation systems, natural language processing, speech and handwriting recognition, image processing and the biomedical domain. Many of these applications that deal with physiological and biomedical data require person-specific or person-adaptive systems. The greatest challenge in developing such systems is the subject-based variability in physiological and biomedical data, which leads to differences in data distributions and makes the task of modeling these data with traditional machine learning algorithms complex and challenging. As a result, despite the wide application of machine learning, efficient deployment of its principles to model real-world data is still a challenge. This dissertation addresses the problem of subject-based variability in physiological and biomedical data and proposes person-adaptive prediction models based on novel transfer and active learning algorithms, an emerging field in machine learning. One of the significant contributions of this dissertation is a person-adaptive method for early detection of muscle fatigue using Surface Electromyogram signals, based on a new multi-source transfer learning algorithm. This dissertation also proposes a subject-independent algorithm for grading the progression of muscle fatigue on a scale from 0 to 1 in a test subject, during isometric or dynamic contractions, in real time. Besides subject-based variability, biomedical image data also vary due to variations in imaging techniques, leading to distribution differences between image databases. Hence a classifier learned on one database may perform poorly on another database.
Another significant contribution of this dissertation is the design and development of an efficient biomedical image data annotation framework, based on a novel combination of transfer learning and a new batch-mode active learning method, capable of addressing the distribution differences across databases. The methodologies developed in this dissertation are relevant and applicable to a large set of computing problems where there is a high variation of data between subjects or sources, such as face detection, pose detection and speech recognition. From a broader perspective, these frameworks can be viewed as a first step towards the design of automated adaptive systems for real-world data.

Contributors: Chattopadhyay, Rita (Author) / Panchanathan, Sethuraman (Thesis advisor) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)

Created: 2013