Matching Items (543)
Description
Speciation is the fundamental process that has generated the vast diversity of life on Earth. The hallmark of speciation is the evolution of barriers to gene flow. These barriers may reduce gene flow either by keeping incipient species from hybridizing at all (pre-zygotic), or by reducing the fitness of hybrids (post-zygotic). To understand the genetic architecture of these barriers and how they evolve, I studied a genus of wasps that exhibits barriers to gene flow that act both pre- and post-zygotically. Nasonia is a genus of four species of parasitoid wasps that can be hybridized in the laboratory. When two of these species, N. vitripennis and N. giraulti, are mated, their offspring suffer, depending on the generation and cross examined, up to 80% mortality during larval development due to incompatible genic interactions between their nuclear and mitochondrial genomes. These species also exhibit pre-zygotic isolation, meaning they are more likely to mate with their own species when given the choice. I examined these two species and their hybrids to determine the genetic and physiological bases of both speciation mechanisms and to understand the evolutionary forces leading to them. I present results that indicate that the oxidative phosphorylation (OXPHOS) pathway, an essential pathway that is responsible for mitochondrial energy generation, is impaired in hybrids of these two species. These results indicate that this impairment is due to the unique evolutionary dynamics of the combined nuclear and mitochondrial origin of this pathway. I also present results showing that, as larvae, these hybrids experience retarded growth linked to the previously observed mortality, and I explore possible physiological mechanisms for this. Finally, I show that the pre-mating isolation is due to a change in a single pheromone component in N. vitripennis males, that this change is under simple genetic control, and that it evolved neutrally before being co-opted as a species recognition signal. These results are an important addition to our overall understanding of the mechanisms of speciation and showcase Nasonia as an emerging model for the study of the genetics of speciation.
Contributors: Gibson, Joshua D (Author) / Gadau, Jürgen (Thesis advisor) / Harrison, Jon (Committee member) / Pratt, Stephen (Committee member) / Verrelli, Brian (Committee member) / Willis, Wayne (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, and medical image processing. Through l1-norm-based regularization, structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real-world applications that can benefit from the group-structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting a unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representations of the expression images. In the third application, I present a new computational approach to annotate the developmental stage of Drosophila embryos in the gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that embryos are divided into early and late periods of development within standard stage demarcations. Stage scores help to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes.
Contributors: Yuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The rapid escalation of technology and the widespread emergence of modern technological equipment have resulted in the generation of enormous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications.
Four major contributions are proposed in this work: (i) a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, (ii) a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, (iii) batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality, and (iv) an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.
Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The academic literature on science communication widely acknowledges a problem: science communication between experts and lay audiences is important, but it is not done well. General audience popular science books, however, carry a reputation for clear science communication and are understudied in the academic literature. For this doctoral dissertation, I utilize Sam Harris's The Moral Landscape, a general audience science book on the particularly thorny topic of neuroscientific approaches to morality, as a case-study to explore the possibility of using general audience science books as models for science communication more broadly. I conduct a literary analysis of the text that delimits the scope of its project, its intended audience, and the domains of science to be communicated. I also identify seven literary aspects of the text: three positive aspects that facilitate clarity and four negative aspects that interfere with lay public engagement. I conclude that The Moral Landscape relies on an assumed knowledge base and intuitions of its audience that cannot reasonably be expected of lay audiences; therefore, it cannot properly be construed as popular science communication. It nevertheless contains normative lessons for the broader science project, both in literary aspects to be salvaged and literary aspects and concepts to consciously be avoided and combated. I note that The Moral Landscape's failings can also be taken as an indication that typical descriptions of science communication offer under-detailed taxonomies of both audiences for science communication and the varieties of science communication aimed at those audiences. Future directions of study include rethinking appropriate target audiences for science literacy projects and developing a more discriminating taxonomy of both science communication and lay publics.
Contributors: Johnson, Nathan W (Author) / Robert, Jason S (Thesis advisor) / Creath, Richard (Committee member) / Martinez, Jacqueline (Committee member) / Sylvester, Edward (Committee member) / Lynch, John (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Once perceived as an unimportant occurrence in living organisms, cell degeneration was reconfigured as an important biological phenomenon in development, aging, health, and disease in the twentieth century. This dissertation tells a twentieth-century history of scientific investigations of cell degeneration, including cell death and aging. By describing four central developments in cell degeneration research in the four major chapters, I trace the emergence of the degenerating cell as a scientific object, describe the generation of a variety of concepts, interpretations and usages associated with cell death and aging, and analyze the transforming influences of the rising cell degeneration research. In particular, the four chapters show how the changing scientific practices about cellular life in embryology, cell culture, aging research, and molecular biology of Caenorhabditis elegans shaped the interpretations of cell degeneration in the twentieth century as life-shaping, limit-setting, complex, yet regulated. These events created and consolidated important concepts in the life sciences such as programmed cell death, the Hayflick limit, apoptosis, and death genes. These cases also subsequently transformed the material and epistemic practices about the end of cellular life and led to the formation of new research communities. The four cases together show the ways cell degeneration became a shared subject between molecular cell biology, developmental biology, gerontology, oncology, and pathology of degenerative diseases. These practices and perspectives created a special kind of interconnectivity between different fields and led to a level of interdisciplinarity within cell degeneration research by the early 1990s.
Contributors: Jiang, Lijing (Author) / Maienschein, Jane (Thesis advisor) / Laubichler, Manfred (Thesis advisor) / Hurlbut, James (Committee member) / Creath, Richard (Committee member) / White, Michael (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Statistics is taught at every level of education, yet teachers often have to assume their students have no knowledge of statistics and start from scratch each time they set out to teach statistics. The motivation for this experimental study comes from interest in exploring educational applications of augmented reality (AR) delivered via mobile technology that could potentially provide rich, contextualized learning for understanding concepts related to statistics education. This study examined the effects of AR experiences for learning basic statistical concepts. Using a 3 x 2 research design, this study compared learning gains of 252 undergraduate and graduate students from a pre- and posttest given before and after interacting with one of three types of augmented reality experiences: a high AR experience (interacting with three-dimensional images coupled with movement through a physical space), a low AR experience (interacting with three-dimensional images without movement), or no AR experience (two-dimensional images without movement). Two levels of collaboration (pairs and no pairs) were also included. Additionally, student perceptions toward collaboration opportunities and engagement were compared across the six treatment conditions. Other demographic information collected included the students' previous statistics experience, as well as their comfort level in using mobile devices. The moderating variables included prior knowledge (high, average, and low) as measured by the student's pretest score. Taking into account prior knowledge, students with low prior knowledge assigned to either the high or low AR experience had statistically significantly higher learning gains than those assigned to the no AR experience. On the other hand, the results showed no statistically significant difference between students assigned to work individually and those assigned to work in pairs.
Students assigned to both the high and low AR experiences perceived a statistically significantly higher level of engagement than their no AR counterparts. Students with low prior knowledge benefited the most from the high AR condition in learning gains. Overall, the AR application was effective in providing a hands-on experience working with statistical data. Further research on AR and its relationship to spatial cognition, situated learning, higher-order skill development, performance support, and other classroom applications for learning is still needed.
Contributors: Conley, Quincy (Author) / Atkinson, Robert K (Thesis advisor) / Nguyen, Frank (Committee member) / Nelson, Brian C (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Automating aspects of biocuration through biomedical information extraction could significantly impact biomedical research by enabling greater biocuration throughput and improving the feasibility of a wider scope. An important step in biomedical information extraction systems is named entity recognition (NER), where mentions of entities such as proteins and diseases are located within natural-language text and their semantic type is determined. This step is critical for later tasks in an information extraction pipeline, including normalization and relationship extraction. BANNER is a benchmark biomedical NER system using linear-chain conditional random fields and the rich feature set approach. A case study with BANNER locating genes and proteins in biomedical literature is described. The first corpus for disease NER adequate for use as training data is introduced, and employed in a case study of disease NER. The first corpus locating adverse drug reactions (ADRs) in user posts to a health-related social website is also described, and a system to locate and identify ADRs in social media text is created and evaluated. The rich feature set approach to creating NER feature sets is argued to be subject to diminishing returns, implying that additional improvements may require more sophisticated methods for creating the feature set. This motivates the first application of multivariate feature selection with filters and false discovery rate analysis to biomedical NER, resulting in a feature set at least 3 orders of magnitude smaller than the set created by the rich feature set approach. Finally, two novel approaches to NER by modeling the semantics of token sequences are introduced. The first method focuses on the sequence content by using language models to determine whether a sequence resembles entries in a lexicon of entity names or text from an unlabeled corpus more closely. 
The second method models the distributional semantics of token sequences, determining the similarity between a potential mention and the token sequences from the training data by analyzing the contexts where each sequence appears in a large unlabeled corpus. The second method is shown to improve the performance of BANNER on multiple data sets.
Contributors: Leaman, James Robert (Author) / Gonzalez, Graciela (Thesis advisor) / Baral, Chitta (Thesis advisor) / Cohen, Kevin B (Committee member) / Liu, Huan (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
To address the need for scientists and engineers in the United States workforce and ensure that students in higher education become scientifically literate, research and policy have called for improvements in undergraduate education in the sciences. One particular pathway for improving undergraduate education in the science fields is to reform undergraduate teaching. Only a limited number of studies have explored the pedagogical content knowledge (PCK) of postsecondary level teachers. This study was conducted to characterize the PCK of biology faculty and explore the factors influencing their PCK. Data included semi-structured interviews, classroom observations, documents, and instructional artifacts. A qualitative inquiry was designed to conduct an in-depth investigation focusing on the PCK of six biology instructors, particularly the types of knowledge they used for teaching biology, their perceptions of teaching, and the social interactions and experiences that influenced their PCK. The findings of this study reveal that the PCK of the biology faculty included eight domains of knowledge: (1) content, (2) context, (3) learners and learning, (4) curriculum, (5) instructional strategies, (6) representations of biology, (7) assessment, and (8) building rapport with students. Three categories of faculty PCK emerged: (1) PCK as an expert explainer, (2) PCK as an instructional architect, and (3) a transitional PCK, which fell between the two prior categories. Based on the interpretations of the data, four social interactions and experiences were found to influence biology faculty PCK: (1) teaching experience, (2) models and mentors, (3) collaborations about teaching, and (4) science education research. The varying teaching perspectives of the faculty also influenced their PCK.
This study shows that the PCK of biology faculty for teaching large introductory courses at large research institutions is heavily influenced by factors beyond simply years of teaching experience and expert content knowledge. Social interactions and experiences created by the institution play a significant role in developing the PCK of biology faculty.
Contributors: Hill, Kathleen M. (Author) / Luft, Julie A. (Thesis advisor) / Baker, Dale (Committee member) / Orchinik, Miles (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
In recent years, machine learning and data mining technologies have received growing attention in several areas such as recommendation systems, natural language processing, speech and handwriting recognition, image processing and the biomedical domain. Many of these applications, which deal with physiological and biomedical data, require person-specific or person-adaptive systems. The greatest challenge in developing such systems is the subject-dependent variation in physiological and biomedical data, which leads to differences in data distributions and makes the task of modeling these data with traditional machine learning algorithms complex and challenging. As a result, despite the wide application of machine learning, efficient deployment of its principles to model real-world data is still a challenge. This dissertation addresses the problem of subject-based variability in physiological and biomedical data and proposes person-adaptive prediction models based on novel transfer and active learning algorithms, an emerging field in machine learning. One of the significant contributions of this dissertation is a person-adaptive method for early detection of muscle fatigue using surface electromyogram signals, based on a new multi-source transfer learning algorithm. This dissertation also proposes a subject-independent algorithm for grading the progression of muscle fatigue on a scale from 0 to 1 in a test subject, during isometric or dynamic contractions, in real time. Besides subject-based variability, biomedical image data also vary due to differences in imaging techniques, leading to distribution differences between image databases. Hence a classifier learned on one database may perform poorly on another database.
Another significant contribution of this dissertation has been the design and development of an efficient biomedical image data annotation framework, based on a novel combination of transfer learning and a new batch-mode active learning method, capable of addressing the distribution differences across databases. The methodologies developed in this dissertation are relevant and applicable to a large set of computing problems where there is a high variation of data between subjects or sources, such as face detection, pose detection and speech recognition. From a broader perspective, these frameworks can be viewed as a first step towards design of automated adaptive systems for real world data.
Contributors: Chattopadhyay, Rita (Author) / Panchanathan, Sethuraman (Thesis advisor) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Biological systems are complex in many dimensions as endless transportation and communication networks all function simultaneously. Our ability to intervene within both healthy and diseased systems is tied directly to our ability to understand and model core functionality. The progress in increasingly accurate and thorough high-throughput measurement technologies has provided a deluge of data from which we may attempt to infer a representation of the true genetic regulatory system. A gene regulatory network model, if accurate enough, may allow us to perform hypothesis testing in the form of computational experiments. Of great importance to modeling accuracy is the acknowledgment of biological contexts within the models -- i.e. recognizing the heterogeneous nature of the true biological system and the data it generates. This marriage of engineering, mathematics and computer science with systems biology creates a cycle of progress between computer simulation and lab experimentation, rapidly translating interventions and treatments for patients from the bench to the bedside. This dissertation will first discuss the landscape for modeling the biological system, explore the identification of targets for intervention in Boolean network models of biological interactions, and explore context specificity both in new graphical depictions of models embodying context-specific genomic regulation and in novel analysis approaches designed to reveal embedded contextual information. Overall, the dissertation will explore a spectrum of biological modeling with a goal towards therapeutic intervention, with both formal and informal notions of biological context, in such a way that will enable future work to have an even greater impact in terms of direct patient benefit on an individualized level.
Contributors: Verdicchio, Michael (Author) / Kim, Seungchan (Thesis advisor) / Baral, Chitta (Committee member) / Stolovitzky, Gustavo (Committee member) / Collofello, James (Committee member) / Arizona State University (Publisher)
Created: 2013