This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Displaying 1 - 10 of 31
Filtering by

Clear all filters

141461-Thumbnail Image.png
Description
In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they

In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/).
ContributorsLessios-Damerow, Julia (Contributor) / Peirson, Erick (Contributor) / Laubichler, Manfred (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created2017-09-28
128166-Thumbnail Image.png
Description

At the end of the dark ages, anatomy was taught as though everything that could be known was known. Scholars learned about what had been discovered rather than how to make discoveries. This was true even though the body (and the rest of biology) was very poorly understood. The renaissance

At the end of the dark ages, anatomy was taught as though everything that could be known was known. Scholars learned about what had been discovered rather than how to make discoveries. This was true even though the body (and the rest of biology) was very poorly understood. The renaissance eventually brought a revolution in how scholars (and graduate students) were trained and worked. This revolution never occurred in K-12 or university education such that we now teach young students in much the way that scholars were taught in the dark ages, we teach them what is already known rather than the process of knowing. Citizen science offers a way to change K-12 and university education and, in doing so, complete the renaissance. Here we offer an example of such an approach and call for change in the way students are taught science, change that is more possible than it has ever been and is, nonetheless, five hundred years delayed.

Created2016-03-01
127872-Thumbnail Image.png
Description

Background: Modern advances in sequencing technology have enabled the census of microbial members of many natural ecosystems. Recently, attention is increasingly being paid to the microbial residents of human-made, built ecosystems, both private (homes) and public (subways, office buildings, and hospitals). Here, we report results of the characterization of the microbial

Background: Modern advances in sequencing technology have enabled the census of microbial members of many natural ecosystems. Recently, attention is increasingly being paid to the microbial residents of human-made, built ecosystems, both private (homes) and public (subways, office buildings, and hospitals). Here, we report results of the characterization of the microbial ecology of a singular built environment, the International Space Station (ISS). This ISS sampling involved the collection and microbial analysis (via 16S rRNA gene PCR) of 15 surfaces sampled by swabs onboard the ISS. This sampling was a component of Project MERCCURI (Microbial Ecology Research Combining Citizen and University Researchers on ISS). Learning more about the microbial inhabitants of the “buildings” in which we travel through space will take on increasing importance, as plans for human exploration continue, with the possibility of colonization of other planets and moons.

Results: Sterile swabs were used to sample 15 surfaces onboard the ISS. The sites sampled were designed to be analogous to samples collected for (1) the Wildlife of Our Homes project and (2) a study of cell phones and shoes that were concurrently being collected for another component of Project MERCCURI. Sequencing of the 16S rRNA genes amplified from DNA extracted from each swab was used to produce a census of the microbes present on each surface sampled. We compared the microbes found on the ISS swabs to those from both homes on Earth and data from the Human Microbiome Project.

Conclusions: While significantly different from homes on Earth and the Human Microbiome Project samples analyzed here, the microbial community composition on the ISS was more similar to home surfaces than to the human microbiome samples. The ISS surfaces are OTU-rich with 1,036–4,294 operational taxonomic units (OTUs per sample). There was no discernible biogeography of microbes on the 15 ISS surfaces, although this may be a reflection of the small sample size we were able to obtain.

ContributorsLang, Jenna M. (Author) / Coil, David A. (Author) / Neches, Russell Y. (Author) / Brown, Wendy E. (Author) / Cavalier, Darlene (Author) / Severance, Mark (Author) / Hampton-Marcell, Jarrad T. (Author) / Gilbert, Jack A. (Author) / Eisen, Jonathan A. (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created2017-12-05
128778-Thumbnail Image.png
Description

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tends to stick on the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the population of users taking A and B strategies that approximates 2:1, is found to be optimal to the sustainable growth of communities.

ContributorsWu, Lingfei (Author) / Baggio, Jacopo (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created2016-03-02
129259-Thumbnail Image.png
Description

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession - but above all

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession - but above all they are a form of codified self-regulation. While codes can be beneficial, it argues that when we scratch below the surface, there are many problems at their root. In terms of efficacy, codes can serve as a form of ethical window dressing, rather than effective rules for behavior. But even more that, codes can degrade the meaning behind being a good person who acts ethically for the right reasons.

Created2013-11-30
129675-Thumbnail Image.png
Description

As part of a larger trend across industrialized nations, European research policy discourse has placed increasing emphasis on socio-technical integration: the explicit incorporation of activities devoted to broader social aspects into scientific activities. In order to compare these high-level integration discourses against patterns at the level of resource allocation, we

As part of a larger trend across industrialized nations, European research policy discourse has placed increasing emphasis on socio-technical integration: the explicit incorporation of activities devoted to broader social aspects into scientific activities. In order to compare these high-level integration discourses against patterns at the level of resource allocation, we analyze nearly 2500 research solicitations from the three European Framework Programmes for R&D during the period 1998-2010. We identify four distinct types of integration (socio-ethical, stakeholder, socio-economic and industrial) that occur either as core or parallel components of R&D solicitations. Quantitative analysis reveals an overall trend towards increasing integration, with requests integrating industrial and socio-economic aspects substantially outnumbering those integrating socio-ethical and stakeholder aspects-by a 2 to 1 margin. Meanwhile, calls for socio-technical integration have become slightly more extensive (ranging across a broader range of research areas addressed), significantly more pervasive (shifting from the periphery to the core of R&D practices), and arguably less diverse (involving a wider variety of integration types) over time. The relative lack of attention to socio-ethical aspects and stakeholder participation in European research is particularly notable given that we focus on potentially controversial areas (life sciences, energy, and nanotechnology), which likely overemphasizes the prevalence of integration throughout the Framework Programmes.

Created2013-08-15
129652-Thumbnail Image.png
Description

The similarity attraction hypothesis posits that humans are drawn toward others who behave and appear similar to themselves. Two experiments examined this hypothesis with middle-school students learning electrical circuit analysis in a computer-based environment with an Animated Pedagogical Agent (APA). Experiment 1 was designed to determine whether matching the gender

The similarity attraction hypothesis posits that humans are drawn toward others who behave and appear similar to themselves. Two experiments examined this hypothesis with middle-school students learning electrical circuit analysis in a computer-based environment with an Animated Pedagogical Agent (APA). Experiment 1 was designed to determine whether matching the gender of the APA to the student has a positive impact on learning outcomes or student perceptions. One hundred ninety-seven middle-school students learned with the computer-based environment using an APA that matched their gender or one which was opposite in gender. Female students reported higher program ratings when the APA matched their gender. Male students, on the other hand, reported higher program ratings than females when the APA did not match their gender. Experiment 2 systematically tested the impact of providing learners the choice among four APAs on learning outcomes and student perceptions. Three hundred thirty-four middle-school students received either a pre-assigned random APA or were free to choose from four APA options: young male agent, older male agent, young female agent, or older female agent. Learners had higher far transfer scores when provided a choice of animated agent, but student perceptions were not impacted by having the ability to make this choice. We suggest that offering students learner control positively impacts student motivation and learning by increasing student perceptions of autonomy, responsibility for the success of the instructional materials, and global satisfaction with the design of materials.

ContributorsOzogul, Gamze (Author) / Johnson, Amy (Author) / Atkinson, Robert (Author) / Reisslein, Martin (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created2013-09-12
128886-Thumbnail Image.png
Description

Species turnover or β diversity is a conceptually attractive surrogate for conservation planning. However, there has been only 1 attempt to determine how well sites selected to maximize β diversity represent species, and that test was done at a scale too coarse (2,500 km2 sites) to inform most conservation decisions.

Species turnover or β diversity is a conceptually attractive surrogate for conservation planning. However, there has been only 1 attempt to determine how well sites selected to maximize β diversity represent species, and that test was done at a scale too coarse (2,500 km2 sites) to inform most conservation decisions. We used 8 plant datasets, 3 bird datasets, and 1 mammal dataset to evaluate whether sites selected to span β diversity will efficiently represent species at finer scale (sites sizes < 1 ha to 625 km2). We used ordinations to characterize dissimilarity in species assemblages (β diversity) among plots (inventory data) or among grid cells (atlas data). We then selected sites to maximize β diversity and used the Species Accumulation Index, SAI, to evaluate how efficiently the surrogate (selecting sites for maximum β diversity) represented species in the same taxon. Across all 12 datasets, sites selected for maximum β diversity represented species with a median efficiency of 24% (i.e., the surrogate was 24% more effective than random selection of sites), and an interquartile range of 4% to 41% efficiency. β diversity was a better surrogate for bird datasets than for plant datasets, and for atlas datasets with 10-km to 14-km grid cells than for atlas datasets with 25-km grid cells. We conclude that β diversity is more than a mere descriptor of how species are distributed on the landscape; in particular β diversity might be useful to maximize the complementarity of a set of sites. Because we tested only within-taxon surrogacy, our results do not prove that β diversity is useful for conservation planning. But our results do justify further investigation to identify the circumstances in which β diversity performs well, and to evaluate it as a cross-taxon surrogate.

Created2016-03-04
128625-Thumbnail Image.png
Description

A major challenge for biogeographers and conservation planners is to identify where to best locate or distribute high-priority areas for conservation and to explore whether these areas are well represented by conservation actions such as protected areas (PAs). We aimed to identify high-priority areas for conservation, expressed as hotpots of

A major challenge for biogeographers and conservation planners is to identify where to best locate or distribute high-priority areas for conservation and to explore whether these areas are well represented by conservation actions such as protected areas (PAs). We aimed to identify high-priority areas for conservation, expressed as hotpots of rarity-weighted richness (HRR)–sites that efficiently represent species–for birds across EU countries, and to explore whether HRR are well represented by the Natura 2000 network. Natura 2000 is an evolving network of PAs that seeks to conserve biodiversity through the persistence of the most patrimonial species and habitats across Europe. This network includes Sites of Community Importance (SCI) and Special Areas of Conservation (SAC), where the latter regulated the designation of Special Protected Areas (SPA). Distribution maps for 416 bird species and complementarity-based approaches were used to map geographical patterns of rarity-weighted richness (RWR) and HRR for birds. We used species accumulation index to evaluate whether RWR was efficient surrogates to identify HRRs for birds. The results of our analysis support the proposition that prioritizing sites in order of RWR is a reliable way to identify sites that efficiently represent birds. HRRs were concentrated in the Mediterranean Basin and alpine and boreal biogeographical regions of northern Europe. The cells with high RWR values did not correspond to cells where Natura 2000 was present. We suggest that patterns of RWR could become a focus for conservation biogeography. Our analysis demonstrates that identifying HRR is a robust approach for prioritizing management actions, and reveals the need for more conservation actions, especially on HRR.

Created2017-04-05
128707-Thumbnail Image.png
Description

Recent advances in nonequilibrium statistical physics have provided unprecedented insight into the thermodynamics of dynamic processes. The author recently used these advances to extend Landauer’s semi-formal reasoning concerning the thermodynamics of bit erasure, to derive the minimal free energy required to implement an arbitrary computation. Here, I extend this analysis,

Recent advances in nonequilibrium statistical physics have provided unprecedented insight into the thermodynamics of dynamic processes. The author recently used these advances to extend Landauer’s semi-formal reasoning concerning the thermodynamics of bit erasure, to derive the minimal free energy required to implement an arbitrary computation. Here, I extend this analysis, deriving the minimal free energy required by an organism to run a given (stochastic) map π from its sensor inputs to its actuator outputs. I use this result to calculate the input-output map π of an organism that optimally trades off the free energy needed to run π with the phenotypic fitness that results from implementing π. I end with a general discussion of the limits imposed on the rate of the terrestrial biosphere’s information processing by the flux of sunlight on the Earth.

Created2016-04-13