This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Displaying 1 - 10 of 57
Description
In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/).
Contributors: Lessios-Damerow, Julia (Contributor) / Peirson, Erick (Contributor) / Laubichler, Manfred (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2017-09-28
Description

A globally integrated carbon observation and analysis system is needed to improve the fundamental understanding of the global carbon cycle, to improve our ability to project future changes, and to verify the effectiveness of policies aiming to reduce greenhouse gas emissions and increase carbon sequestration. Building an integrated carbon observation system requires transformational advances from the existing sparse, exploratory framework towards a dense, robust, and sustained system in all components: anthropogenic emissions, the atmosphere, the ocean, and the terrestrial biosphere. The paper is addressed to scientists, policymakers, and funding agencies who need to have a global picture of the current state of the (diverse) carbon observations.

We identify the current state of carbon observations, and the needs and notional requirements for a global integrated carbon observation system that can be built in the next decade. A key conclusion is the substantial expansion of the ground-based observation networks required to reach the high spatial resolution for CO2 and CH4 fluxes, and for carbon stocks for addressing policy-relevant objectives, and attributing flux changes to underlying processes in each region. In order to establish flux and stock diagnostics over areas such as the southern oceans, tropical forests, and the Arctic, in situ observations will have to be complemented with remote-sensing measurements. Remote sensing offers the advantage of dense spatial coverage and frequent revisit. A key challenge is to bring remote-sensing measurements to a level of long-term consistency and accuracy so that they can be efficiently combined in models to reduce uncertainties, in synergy with ground-based data.

Bringing tight observational constraints on fossil fuel and land use change emissions will be the biggest challenge for deployment of a policy-relevant integrated carbon observation system. This will require in situ and remotely sensed data at much higher resolution and density than currently achieved for natural fluxes, although over a small land area (cities, industrial sites, power plants), as well as the inclusion of fossil fuel CO2 proxy measurements such as radiocarbon in CO2 and carbon-fuel combustion tracers. Additionally, a policy-relevant carbon monitoring system should also provide mechanisms for reconciling regional top-down (atmosphere-based) and bottom-up (surface-based) flux estimates across the range of spatial and temporal scales relevant to mitigation policies. In addition, uncertainties for each observation data-stream should be assessed. The success of the system will rely on long-term commitments to monitoring, on improved international collaboration to fill gaps in the current observations, on sustained efforts to improve access to the different data streams and make databases interoperable, and on the calibration of each component of the system to agreed-upon international scales.

Contributors: Ciais, P. (Author) / Dolman, A. J. (Author) / Bombelli, A. (Author) / Duren, R. (Author) / Peregon, A. (Author) / Rayner, P. J. (Author) / Miller, C. (Author) / Gobron, N. (Author) / Kinderman, G. (Author) / Marland, G. (Author) / Gruber, N. (Author) / Chevallier, F. (Author) / Andres, R. J. (Author) / Balsamo, G. (Author) / Bopp, L. (Author) / Breon, F. -M. (Author) / Broquet, G. (Author) / Dargaville, R. (Author) / Battin, T. J. (Author) / Borges, A. (Author) / Bovensmann, H. (Author) / Buchwitz, M. (Author) / Butler, J. (Author) / Canadell, J. G. (Author) / Cook, R. B. (Author) / DeFries, R. (Author) / Engelen, R. (Author) / Gurney, Kevin (Author) / Heinze, C. (Author) / Heimann, M. (Author) / Held, A. (Author) / Henry, M. (Author) / Law, B. (Author) / Luyssaert, S. (Author) / Miller, J. (Author) / Moriyama, T. (Author) / Moulin, C. (Author) / Myneni, R. (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2013-11-30
Description

Errors in the specification or utilization of fossil fuel CO2 emissions within carbon budget or atmospheric CO2 inverse studies can alias the estimation of biospheric and oceanic carbon exchange. A key component in the simulation of CO2 concentrations arising from fossil fuel emissions is the spatial distribution of the emission near coastlines. Regridding of fossil fuel CO2 emissions (FFCO2) from fine to coarse grids to enable atmospheric transport simulations can give rise to mismatches between the emissions and simulated atmospheric dynamics which differ over land or water. For example, emissions originally emanating from the land are emitted from a grid cell for which the vertical mixing reflects the roughness and/or surface energy exchange of an ocean surface. We test this potential "dynamical inconsistency" by examining simulated global atmospheric CO2 concentration driven by two different approaches to regridding fossil fuel CO2 emissions. The two approaches are as follows: (1) a commonly used method that allocates emissions to grid cells with no attempt to ensure dynamical consistency with atmospheric transport and (2) an improved method that reallocates emissions to grid cells to ensure dynamically consistent results. Results show large spatial and temporal differences in the simulated CO2 concentration when comparing these two approaches. The emissions difference ranges from −30.3 TgC grid cell⁻¹ yr⁻¹ (−3.39 kgC m⁻² yr⁻¹) to +30.0 TgC grid cell⁻¹ yr⁻¹ (+2.6 kgC m⁻² yr⁻¹) along coastal margins. Maximum simulated annual mean CO2 concentration differences at the surface exceed ±6 ppm at various locations and times. Examination of the current CO2 monitoring locations during the local afternoon, consistent with inversion modeling system sampling and measurement protocols, finds maximum hourly differences at 38 stations exceed ±0.10 ppm with individual station differences exceeding −32 ppm.
The differences implied by not accounting for this dynamical consistency problem are largest at monitoring sites proximal to large coastal urban areas and point sources. These results suggest that studies comparing simulated to observed atmospheric CO2 concentration, such as atmospheric CO2 inversions, must take measures to correct for this potential problem and ensure flux and dynamical consistency.

Contributors: Zhang, X. (Author) / Gurney, Kevin (Author) / Rayner, P. (Author) / Liu, Y. (Author) / Asefi-Najafabady, Salvi (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2013-11-30
Description

Background: Meiotic recombination has traditionally been explained based on the structural requirement to stabilize homologous chromosome pairs to ensure their proper meiotic segregation. Competing hypotheses seek to explain the emerging findings of significant heterogeneity in recombination rates within and between genomes, but intraspecific comparisons of genome-wide recombination patterns are rare. The honey bee (Apis mellifera) exhibits the highest rate of genomic recombination among multicellular animals with about five cross-over events per chromatid.

Results: Here, we present a comparative analysis of recombination rates across eight genetic linkage maps of the honey bee genome to investigate which genomic sequence features are correlated with recombination rate and with its variation across the eight data sets, with average marker spacing ranging from 1 Mbp to 120 kbp. Overall, we found that GC content best explained the variation in local recombination rate along chromosomes at the analyzed 100 kbp scale. In contrast, variation among the different maps was correlated with the abundance of microsatellites and several specific tri- and tetra-nucleotides.

Conclusions: The combined evidence from eight medium-scale recombination maps of the honey bee genome suggests that recombination rate variation in this highly recombining genome might be due to the DNA configuration instead of distinct sequence motifs. However, more fine-scale analyses are needed. The empirical basis of eight differing genetic maps allowed for robust conclusions about the correlates of the local recombination rates and enabled the study of the relation between DNA features and variability in local recombination rates, which is particularly relevant in the honey bee genome with its exceptionally high recombination rate.
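
As an illustration of the GC-content measure referred to in the Results above, the following is a minimal sketch of computing GC fraction in non-overlapping windows along a sequence. The function name, the treatment of ambiguous bases, and the handling of the final partial window are assumptions made for this example; the abstract does not describe the authors' actual pipeline.

```python
def gc_content_by_window(sequence, window=100_000):
    """Return the GC fraction of each non-overlapping window.

    Hypothetical helper for illustration only. Bases other than
    A, C, G, T (for example N) are ignored; a window with no
    called bases yields NaN. The last window may be shorter.
    """
    fractions = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window].upper()
        called = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / called if called else float("nan"))
    return fractions
```

With the default window of 100,000 bases this matches the 100 kbp scale named in the abstract; the resulting per-window values could then be compared against per-window recombination rates with any standard correlation method.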

Contributors: Ross, Caitlin R. (Author) / DeFelice, Dominick S. (Author) / Hunt, Greg J. (Author) / Ihle, Kate (Author) / Amdam, Gro (Author) / Rueppell, Olav (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2015-02-21
Description

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession, but above all they are a form of codified self-regulation. While codes can be beneficial, this work argues that when we scratch below the surface, there are many problems at their root. In terms of efficacy, codes can serve as a form of ethical window dressing, rather than effective rules for behavior. But even more than that, codes can degrade the meaning behind being a good person who acts ethically for the right reasons.

Created: 2013-11-30
Description

High-resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high-resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long-term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India. These three countries, in particular, show different long-term trends, and exploration of the trends in nighttime lights and population reveals a decoupling of population and emissions at the subnational level. Analysis of shorter-term variations reveals the impact of the 2008–2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than in the previous version of FFDAS due mainly to an improved nightlight data set.
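
To make the center of mass (CM) metric described above more concrete, here is a minimal sketch of one way an emission-weighted centroid could be computed from gridded data. The abstract does not give the formula used in FFDAS, so the spherical averaging approach, the function name, and the input layout are all assumptions made for illustration.

```python
import math

def emission_center_of_mass(cells):
    """Emission-weighted centroid of grid cells on a sphere.

    Hypothetical helper for illustration only.
    cells: iterable of (lat_deg, lon_deg, emission) tuples.
    Each cell is mapped to a unit vector, the vectors are averaged
    with emission weights, and the mean vector is projected back
    to latitude and longitude in degrees.
    """
    x = y = z = total = 0.0
    for lat, lon, emission in cells:
        phi, lam = math.radians(lat), math.radians(lon)
        x += emission * math.cos(phi) * math.cos(lam)
        y += emission * math.cos(phi) * math.sin(lam)
        z += emission * math.sin(phi)
        total += emission
    if total == 0:
        raise ValueError("total emission is zero")
    x, y, z = x / total, y / total, z / total
    lon_cm = math.degrees(math.atan2(y, x))
    lat_cm = math.degrees(math.atan2(z, math.hypot(x, y)))
    return lat_cm, lon_cm
```

Evaluating such a function once per year and tracking the returned coordinates gives a compact trajectory; using per capita emissions as the weights would give the per capita variant mentioned in the abstract.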

Contributors: Asefi-Najafabady, Salvi (Author) / Rayner, P. J. (Author) / Gurney, Kevin (Author) / McRobert, A. (Author) / Song, Y. (Author) / Coltin, K. (Author) / Huang, J. (Author) / Elvidge, C. (Author) / Baugh, K. (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2014-09-16
Description

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A ratio of approximately 2:1 between users taking strategies A and B is found to be optimal for the sustainable growth of communities.

Contributors: Wu, Lingfei (Author) / Baggio, Jacopo (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2016-03-02
Description

Honeybee workers are essentially sterile female helpers that make up the majority of individuals in a colony. Workers display a marked change in physiology when they transition from in-nest tasks to foraging. Recent technological advances have made it possible to unravel the metabolic modifications associated with this transition. Previous studies have revealed extensive remodeling of brain, thorax, and hypopharyngeal gland biochemistry. However, data on changes in the abdomen are scarce. To narrow this gap, we investigated the proteomic composition of abdominal tissue in the days typically preceding the onset of foraging in honeybee workers.

In order to get a broader representation of possible protein dynamics, we used workers of two genotypes with differences in the age at which they initiate foraging. This approach was combined with RNA interference-mediated downregulation of an insulin/insulin-like signaling component that is central to foraging behavior, the insulin receptor substrate (irs), and with measurements of glucose and lipid levels.
Our data provide new insight into the molecular underpinnings of phenotypic plasticity in the honeybee, invoke parallels with vertebrate metabolism, and support an integrated and irs-dependent association of carbohydrate and lipid metabolism with the transition from in-nest tasks to foraging.

Contributors: Chan, Queenie W. T. (Author) / Mutti, Navdeep (Author) / Foster, Leonard J. (Author) / Kocher, Sarah D. (Author) / Amdam, Gro (Author) / Wolschin, Florian (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2011-09-28
Description

Background: Obesity is a metabolic disease caused by environmental and genetic factors. However, the epigenetic mechanisms of obesity are incompletely understood. The aim of our study was to investigate the role of skeletal muscle DNA methylation in combination with transcriptomic changes in obesity.

Results: Muscle biopsies were obtained basally from lean (n = 12; BMI = 23.4 ± 0.7 kg/m²) and obese (n = 10; BMI = 32.9 ± 0.7 kg/m²) participants in combination with euglycemic-hyperinsulinemic clamps to assess insulin sensitivity. We performed reduced representation bisulfite sequencing (RRBS) next-generation methylation and microarray analyses on DNA and RNA isolated from vastus lateralis muscle biopsies. There were 13,130 differentially methylated cytosines (DMC; uncorrected P < 0.05) that were altered in the promoter and untranslated (5' and 3'UTR) regions in the obese versus lean analysis. Microarray analysis revealed 99 probes that were significantly (corrected P < 0.05) altered. Of these, 12 genes (encompassing 22 methylation sites) demonstrated a negative relationship between gene expression and DNA methylation. Specifically, sorbin and SH3 domain containing 3 (SORBS3) which codes for the adapter protein vinexin was significantly decreased in gene expression (fold change −1.9) and had nine DMCs that were significantly increased in methylation in obesity (methylation differences ranged from 5.0 to 24.4 %). Moreover, differentially methylated region (DMR) analysis identified a region in the 5'UTR (Chr.8:22,423,530–22,423,569) of SORBS3 that was increased in methylation by 11.2 % in the obese group. The negative relationship observed between DNA methylation and gene expression for SORBS3 was validated by a site-specific sequencing approach, pyrosequencing, and qRT-PCR. Additionally, we performed transcription factor binding analysis and identified a number of transcription factors whose binding to the differentially methylated sites or region may contribute to obesity.

Conclusions: These results demonstrate that obesity alters the epigenome through DNA methylation and highlight novel transcriptomic changes in SORBS3 in skeletal muscle.

Contributors: Day, Samantha (Author) / Coletta, Rich (Author) / Kim, Joon Young (Author) / Campbell, Latoya (Author) / Benjamin, Tonya R. (Author) / Roust, Lori R. (Author) / De Filippis, Elena A. (Author) / Dinu, Valentin (Author) / Shaibi, Gabriel (Author) / Mandarino, Lawrence J. (Author) / Coletta, Dawn (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2016-07-18
Description

We have previously hypothesized a biological pathway of activity-dependent synaptic plasticity proteins that addresses the dual genetic and environmental contributions to schizophrenia. Accordingly, variations in the immediate early gene EGR3, and its target ARC, should influence schizophrenia susceptibility. We used a pooled Next-Generation Sequencing approach to identify variants across these genes in U.S. populations of European (EU) and African (AA) descent. Three EGR3 and one ARC SNP were selected and genotyped for validation, and three SNPs were tested for association in a replication cohort. In the EU group of 386 schizophrenia cases and 150 controls EGR3 SNP rs1877670 and ARC SNP rs35900184 showed significant associations (p = 0.0078 and p = 0.0275, respectively). In the AA group of 185 cases and 50 controls, only the ARC SNP revealed significant association (p = 0.0448). The ARC SNP did not show association in the Han Chinese (CH) population. However, combining the EU, AA, and CH groups revealed a highly significant association of ARC SNP rs35900184 (p = 2.353 × 10⁻⁷; OR [95% CI] = 1.54 [1.310–1.820]). These findings support previously reported associations between EGR3 and schizophrenia. Moreover, this is the first report associating an ARC SNP with schizophrenia and supports recent large-scale GWAS findings implicating the ARC complex in schizophrenia risk. These results support the need for further investigation of the proposed pathway of environmentally responsive, synaptic plasticity-related, schizophrenia genes.

Contributors: Huentelman, Matthew J. (Author) / Muppana, Leela (Author) / Courneveaux, Jason J. (Author) / Dinu, Valentin (Author) / Pruzin, Jeremy J. (Author) / Reiman, Rebecca (Author) / Borish, Cassie N. (Author) / De Both, Matt (Author) / Ahmed, Amber (Author) / Todorov, Alexandre (Author) / Cloninger, C. Robert (Author) / Zhang, Rui (Author) / Ma, Jie (Author) / Gallitano, Amelia L. (Author) / College of Health Solutions (Contributor)
Created: 2015-10-16