This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Description

In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/).
Contributors: Lessios-Damerow, Julia (Contributor) / Peirson, Erick (Contributor) / Laubichler, Manfred (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2017-09-28
Description

A globally integrated carbon observation and analysis system is needed to improve the fundamental understanding of the global carbon cycle, to improve our ability to project future changes, and to verify the effectiveness of policies aiming to reduce greenhouse gas emissions and increase carbon sequestration. Building an integrated carbon observation system requires transformational advances from the existing sparse, exploratory framework towards a dense, robust, and sustained system in all components: anthropogenic emissions, the atmosphere, the ocean, and the terrestrial biosphere. The paper is addressed to scientists, policymakers, and funding agencies who need to have a global picture of the current state of the (diverse) carbon observations.

We identify the current state of carbon observations, and the needs and notional requirements for a global integrated carbon observation system that can be built in the next decade. A key conclusion is the substantial expansion of the ground-based observation networks required to reach the high spatial resolution for CO2 and CH4 fluxes, and for carbon stocks for addressing policy-relevant objectives, and attributing flux changes to underlying processes in each region. In order to establish flux and stock diagnostics over areas such as the southern oceans, tropical forests, and the Arctic, in situ observations will have to be complemented with remote-sensing measurements. Remote sensing offers the advantage of dense spatial coverage and frequent revisit. A key challenge is to bring remote-sensing measurements to a level of long-term consistency and accuracy so that they can be efficiently combined in models to reduce uncertainties, in synergy with ground-based data.

Bringing tight observational constraints on fossil fuel and land use change emissions will be the biggest challenge for deployment of a policy-relevant integrated carbon observation system. This will require in situ and remotely sensed data at much higher resolution and density than currently achieved for natural fluxes, although over a small land area (cities, industrial sites, power plants), as well as the inclusion of fossil fuel CO2 proxy measurements such as radiocarbon in CO2 and carbon-fuel combustion tracers. Additionally, a policy-relevant carbon monitoring system should also provide mechanisms for reconciling regional top-down (atmosphere-based) and bottom-up (surface-based) flux estimates across the range of spatial and temporal scales relevant to mitigation policies. In addition, uncertainties for each observation data-stream should be assessed. The success of the system will rely on long-term commitments to monitoring, on improved international collaboration to fill gaps in the current observations, on sustained efforts to improve access to the different data streams and make databases interoperable, and on the calibration of each component of the system to agreed-upon international scales.

Contributors: Ciais, P. (Author) / Dolman, A. J. (Author) / Bombelli, A. (Author) / Duren, R. (Author) / Peregon, A. (Author) / Rayner, P. J. (Author) / Miller, C. (Author) / Gobron, N. (Author) / Kinderman, G. (Author) / Marland, G. (Author) / Gruber, N. (Author) / Chevallier, F. (Author) / Andres, R. J. (Author) / Balsamo, G. (Author) / Bopp, L. (Author) / Breon, F. -M. (Author) / Broquet, G. (Author) / Dargaville, R. (Author) / Battin, T. J. (Author) / Borges, A. (Author) / Bovensmann, H. (Author) / Buchwitz, M. (Author) / Butler, J. (Author) / Canadell, J. G. (Author) / Cook, R. B. (Author) / DeFries, R. (Author) / Engelen, R. (Author) / Gurney, Kevin (Author) / Heinze, C. (Author) / Heimann, M. (Author) / Held, A. (Author) / Henry, M. (Author) / Law, B. (Author) / Luyssaert, S. (Author) / Miller, J. (Author) / Moriyama, T. (Author) / Moulin, C. (Author) / Myneni, R. (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2013-11-30
Description

The apolipoprotein E (APOE) e4 allele is the most prevalent genetic risk factor for Alzheimer's disease (AD). Hippocampal volumes are generally smaller in AD patients carrying the e4 allele compared to e4 noncarriers. Here we examined the effect of APOE e4 on hippocampal morphometry in a large imaging database, the Alzheimer's Disease Neuroimaging Initiative (ADNI). We automatically segmented and constructed hippocampal surfaces from the baseline MR images of 725 subjects with known APOE genotype information including 167 with AD, 354 with mild cognitive impairment (MCI), and 204 normal controls. High-order correspondences between hippocampal surfaces were enforced across subjects with a novel inverse consistent surface fluid registration method. Multivariate statistics consisting of multivariate tensor-based morphometry (mTBM) and radial distance were computed for surface deformation analysis. Using Hotelling's T² test, we found significant morphological deformation in APOE e4 carriers relative to noncarriers in the entire cohort as well as in the nondemented (pooled MCI and control) subjects, affecting the left hippocampus more than the right, and this effect was more pronounced in e4 homozygotes than heterozygotes. Our findings are consistent with previous studies that showed e4 carriers exhibit accelerated hippocampal atrophy; we extend these findings to a novel measure of hippocampal morphometry. Hippocampal morphometry has significant potential as an imaging biomarker of early stage AD.

Contributors: Shi, Jie (Author) / Lepore, Natasha (Author) / Gutman, Boris A. (Author) / Thompson, Paul M. (Author) / Baxter, Leslie C. (Author) / Caselli, Richard J. (Author) / Wang, Yalin (Author) / Ira A. Fulton Schools of Engineering (Contributor)
Created: 2014-08-01
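
The hippocampal morphometry abstract above names Hotelling's T² test for comparing APOE e4 carriers with noncarriers. As a rough illustration only, the textbook two-sample statistic can be sketched as follows. The abstract does not say how the test was applied (for example, per surface point or with what correction), so the two-feature restriction, the function names, and the toy data are all assumptions made for this sketch and are not the authors' implementation.

```python
def _mean(rows):
    """Column means of a list of 2-feature observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def _scatter(rows, m):
    """2x2 sum of outer products of deviations from the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def hotelling_t2(a, b):
    """Two-sample Hotelling's T^2 for 2-feature data (hypothetical helper).

    Returns (t2, f), where f is the statistic rescaled so that, under the
    usual normality assumptions, it follows F(p, n1 + n2 - p - 1) with p = 2.
    """
    p = 2
    n1, n2 = len(a), len(b)
    m1, m2 = _mean(a), _mean(b)
    s1, s2 = _scatter(a, m1), _scatter(b, m2)
    dof = n1 + n2 - 2
    # Pooled covariance matrix of the two groups.
    S = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    t2 = (n1 * n2) / (n1 + n2) * quad
    f = (n1 + n2 - p - 1) / (p * dof) * t2
    return t2, f


# Toy data (made up): group b is group a shifted by one unit in feature 0.
group_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
group_b = [(1.0, 0.0), (2.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
t2, f = hotelling_t2(group_a, group_b)
print(t2, f)
```

A larger T² indicates that the group mean vectors differ more relative to the pooled within-group covariance; turning the rescaled value into a p-value would require an F distribution, which is left out of this sketch.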
Description

Mild Cognitive Impairment (MCI) is a transitional stage between normal aging and dementia, and people with MCI are at high risk of progression to dementia. MCI is attracting increasing attention, as it offers an opportunity to target the disease process during an early symptomatic stage. Structural magnetic resonance imaging (MRI) measures have been the mainstay of Alzheimer's disease (AD) imaging research; however, ventricular morphometry analysis remains challenging because of its complicated topological structure. Here we describe a novel ventricular morphometry system based on the hyperbolic Ricci flow method and tensor-based morphometry (TBM) statistics. Unlike prior ventricular surface parameterization methods, hyperbolic conformal parameterization is angle-preserving and does not have any singularities. Our system generates a one-to-one diffeomorphic mapping between ventricular surfaces with consistent boundary matching conditions. The TBM statistics encode a great deal of surface deformation information that could be inaccessible or overlooked by other methods. We applied our system to the baseline MRI scans of a set of MCI subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI: 71 MCI converters vs. 62 MCI stable). Although the combined ventricular area and volume features did not differ between the two groups, our fine-grained surface analysis revealed significant differences in the ventricular regions close to the temporal lobe and posterior cingulate, structures that are affected early in AD. Significant correlations were also detected between ventricular morphometry, neuropsychological measures, and a previously described imaging index based on fluorodeoxyglucose positron emission tomography (FDG-PET) scans. This novel ventricular morphometry method may offer a new and more sensitive approach to study preclinical and early symptomatic stage AD.

Contributors: Shi, Jie (Author) / Stonnington, Cynthia M. (Author) / Thompson, Paul M. (Author) / Chen, Kewei (Author) / Gutman, Boris (Author) / Reschke, Cole (Author) / Baxter, Leslie C. (Author) / Reiman, Eric M. (Author) / Caselli, Richard J. (Author) / Wang, Yalin (Author) / Ira A. Fulton Schools of Engineering (Contributor)
Created: 2015-01-01
Description

Errors in the specification or utilization of fossil fuel CO2 emissions within carbon budget or atmospheric CO2 inverse studies can alias the estimation of biospheric and oceanic carbon exchange. A key component in the simulation of CO2 concentrations arising from fossil fuel emissions is the spatial distribution of the emission near coastlines. Regridding of fossil fuel CO2 emissions (FFCO2) from fine to coarse grids to enable atmospheric transport simulations can give rise to mismatches between the emissions and simulated atmospheric dynamics which differ over land or water. For example, emissions originally emanating from the land are emitted from a grid cell for which the vertical mixing reflects the roughness and/or surface energy exchange of an ocean surface. We test this potential "dynamical inconsistency" by examining simulated global atmospheric CO2 concentration driven by two different approaches to regridding fossil fuel CO2 emissions. The two approaches are as follows: (1) a commonly used method that allocates emissions to grid cells with no attempt to ensure dynamical consistency with atmospheric transport and (2) an improved method that reallocates emissions to grid cells to ensure dynamically consistent results. Results show large spatial and temporal differences in the simulated CO2 concentration when comparing these two approaches. The emissions difference ranges from −30.3 TgC grid cell⁻¹ yr⁻¹ (−3.39 kgC m⁻² yr⁻¹) to +30.0 TgC grid cell⁻¹ yr⁻¹ (+2.6 kgC m⁻² yr⁻¹) along coastal margins. Maximum simulated annual mean CO2 concentration differences at the surface exceed ±6 ppm at various locations and times. Examination of the current CO2 monitoring locations during the local afternoon, consistent with inversion modeling system sampling and measurement protocols, finds maximum hourly differences at 38 stations exceed ±0.10 ppm with individual station differences exceeding −32 ppm. The differences implied by not accounting for this dynamical consistency problem are largest at monitoring sites proximal to large coastal urban areas and point sources. These results suggest that studies comparing simulated to observed atmospheric CO2 concentration, such as atmospheric CO2 inversions, must take measures to correct for this potential problem and ensure flux and dynamical consistency.

Contributors: Zhang, X. (Author) / Gurney, Kevin (Author) / Rayner, P. (Author) / Liu, Y. (Author) / Asefi-Najafabady, Salvi (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2013-11-30
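
The regridding abstract above contrasts a method that leaves coarse-grid emissions wherever they fall with one that reallocates them so that land-origin emissions are not released from ocean cells. The idea can be sketched roughly as below. The abstract does not give the reallocation rule, so the nearest-land-cell rule, the index-space distance, the function name, and the toy grid are all assumptions for this sketch and should not be read as the authors' algorithm.

```python
def reallocate_to_land(emissions, is_land):
    """Move emissions found in non-land cells to the nearest land cell.

    emissions: 2D list of floats on the coarse grid (hypothetical units).
    is_land:   2D list of booleans of the same shape.
    Distance is measured in grid-index space for simplicity (an assumption);
    a real grid would call for great-circle distance.
    """
    rows, cols = len(emissions), len(emissions[0])
    land = [(r, c) for r in range(rows) for c in range(cols) if is_land[r][c]]
    if not land:
        raise ValueError("grid contains no land cells")
    # Start from the land emissions; ocean cells are zeroed and refilled below.
    out = [[emissions[r][c] if is_land[r][c] else 0.0 for c in range(cols)]
           for r in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not is_land[r][c] and emissions[r][c] != 0.0:
                # Nearest land cell; ties go to the first cell in row-major order.
                tr, tc = min(land, key=lambda rc: (rc[0] - r) ** 2 + (rc[1] - c) ** 2)
                out[tr][tc] += emissions[r][c]
    return out


# Toy 2x2 coastal grid (made up): the top-right cell is ocean but holds emissions.
coarse = [[1.0, 2.0],
          [0.0, 4.0]]
mask = [[True, False],
        [True, True]]
fixed = reallocate_to_land(coarse, mask)
print(fixed)
```

The point of the sketch is that the global total is conserved while no emissions remain in cells whose mixing would be modeled as ocean.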
Description

Background: Cancer diagnosis in both dogs and humans is complicated by the lack of a non-invasive diagnostic test. To meet this clinical need, we apply the recently developed immunosignature assay to spontaneous canine lymphoma as clinical proof-of-concept. Here we evaluate the immunosignature as a diagnostic for spontaneous canine lymphoma both at initial diagnosis and for evaluating the disease-free interval following treatment.

Methods: Sera from dogs with confirmed lymphoma (B cell n = 38, T cell n = 11) and clinically normal dogs (n = 39) were analyzed. Serum antibody responses were characterized by analyzing the binding pattern, or immunosignature, of serum antibodies on a non-natural sequence peptide microarray. Peptides were selected and tested for the ability to distinguish healthy dogs from those with lymphoma and to distinguish lymphoma subtypes based on immunophenotype. The immunosignatures of dogs with lymphoma were evaluated for individual signatures. Changes in the immunosignatures were evaluated following treatment and eventual relapse.

Results: Despite being a clonal disease, both an individual immunosignature and a generalized lymphoma immunosignature were observed in each dog. The general lymphoma immunosignature identified in the initial set of dogs (n = 32) was able to predict disease status in an independent set of dogs (n = 42, 97% accuracy). A separate immunosignature was able to distinguish the lymphoma based on immunophenotype (n = 25, 88% accuracy). The individual immunosignature was capable of confirming remission three months following diagnosis. Immunosignature at diagnosis was able to predict which dogs with B cell lymphoma would relapse in less than 120 days (n = 33, 97% accuracy).

Conclusion: We conclude that the immunosignature can serve as a multilevel diagnostic for canine, and potentially human, lymphoma.

Contributors: Johnston, Stephen (Author) / Thamm, Douglas H. (Author) / Legutki, Joseph Barten (Author) / Biodesign Institute (Contributor)
Created: 2014-09-08
Description

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession - but above all they are a form of codified self-regulation. While codes can be beneficial, this paper argues that when we scratch below the surface, there are many problems at their root. In terms of efficacy, codes can serve as a form of ethical window dressing, rather than effective rules for behavior. But even more than that, codes can degrade the meaning behind being a good person who acts ethically for the right reasons.

Created: 2013-11-30
Description

High-resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high-resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long-term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India. These three countries, in particular, show different long-term trends, and exploration of the trends in nighttime lights and population reveals a decoupling of population and emissions at the subnational level. Analysis of shorter-term variations reveals the impact of the 2008–2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than the previous version of FFDAS due mainly to an improved nightlight data set.

Contributors: Asefi-Najafabady, Salvi (Author) / Rayner, P. J. (Author) / Gurney, Kevin (Author) / McRobert, A. (Author) / Song, Y. (Author) / Coltin, K. (Author) / Huang, J. (Author) / Elvidge, C. (Author) / Baugh, K. (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2014-09-16
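
The FFDAS abstract above describes a center of mass (CM) calculation as a compact metric for how the spatial pattern of emissions shifts over time. One plausible way to compute such a metric is an emission-weighted centroid on the sphere, sketched below. The abstract does not give the formula, so the spherical averaging, the function name, and the toy cells are assumptions for this sketch and may differ from the paper's definition.

```python
import math


def emission_center_of_mass(cells):
    """Emission-weighted centroid of grid cells, returned as (lat, lon) in degrees.

    cells: iterable of (lat_deg, lon_deg, emission) tuples (hypothetical input
    format). Each cell center is mapped to a unit vector, the vectors are
    averaged with emission weights, and the mean vector is projected back to
    latitude and longitude. Averaging raw longitudes is avoided because it
    breaks across the date line.
    """
    x = y = z = total = 0.0
    for lat, lon, e in cells:
        phi, lam = math.radians(lat), math.radians(lon)
        x += e * math.cos(phi) * math.cos(lam)
        y += e * math.cos(phi) * math.sin(lam)
        z += e * math.sin(phi)
        total += e
    if total <= 0.0:
        raise ValueError("total emissions must be positive")
    x, y, z = x / total, y / total, z / total
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon


# Toy example (made up): two equatorial cells; tripling the eastern cell's
# emissions pulls the centroid eastward.
early = [(0.0, 0.0, 1.0), (0.0, 90.0, 1.0)]
late = [(0.0, 0.0, 1.0), (0.0, 90.0, 3.0)]
print(emission_center_of_mass(early))
print(emission_center_of_mass(late))
```

Computing this centroid for each year of a gridded time series would give a track of points whose drift summarizes where emissions are growing; a per capita variant would weight cells by emissions divided by population instead.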
Description

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the populations of users taking A and B strategies that approximates 2:1 is found to be optimal for the sustainable growth of communities.

Contributors: Wu, Lingfei (Author) / Baggio, Jacopo (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2016-03-02
Description

Antigen-antibody complexes are central players in an effective immune response. However, finding those interactions relevant to a particular disease state can be arduous. Nonetheless, many paths to discovery have been explored, since deciphering these interactions can greatly facilitate the development of new diagnostics, therapeutics, and vaccines. In silico B cell epitope mapping approaches have been widely pursued, though success has not been consistent. Antibody mixtures in immune sera have been used as handles for biologically relevant antigens, but these and other experimental approaches have proven resource-intensive and time-consuming. In addition, these methods are often tailored to individual diseases or a specific proteome, rather than providing a universal platform. Most of these methods are not able to identify a specific antibody’s epitopes from unknown antigens, such as un-annotated neoantigens in cancer. Alternatively, a peptide library comprised of sequences unrestricted by naturally-found protein space provides for a universal search for mimotopes of an antibody’s epitope. Here we present the utility of such a non-natural random sequence library of 10,000 peptides physically addressed on a microarray for mimotope discovery without sequence information of the specific antigen. The peptide arrays were probed with serum from an antigen-immunized rabbit, or alternatively probed with serum pre-absorbed with the same immunizing antigen. With this positive and negative screening scheme, we identified the library peptides as the mimotopes of the antigen. The unique library peptides were successfully used to isolate antigen-specific antibodies from complete immune serum. Sequence analysis of these peptides revealed the epitopes in the immunized antigen. We present this method as an inexpensive, efficient method for identifying mimotopes of any antibody’s targets. These mimotopes should be useful in defining both components of the antigen-antibody complex.

Contributors: Whittemore, Kurt (Author) / Johnston, Stephen (Author) / Sykes, Kathryn (Author) / Shen, Luhui (Author) / Biodesign Institute (Contributor)
Created: 2016-06-14