This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Description
In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/).
Contributors: Lessios-Damerow, Julia (Contributor) / Peirson, Erick (Contributor) / Laubichler, Manfred (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2017-09-28
Description

High proportions of autistic children suffer from gastrointestinal (GI) disorders, implying a link between autism and abnormalities in gut microbial functions. Increasing evidence from recent high-throughput sequencing analyses indicates that disturbances in composition and diversity of gut microbiome are associated with various disease conditions. However, microbiome-level studies on autism are limited and mostly focused on pathogenic bacteria. Therefore, here we aimed to define systemic changes in gut microbiome associated with autism and autism-related GI problems. We recruited 20 neurotypical and 20 autistic children accompanied by a survey of both autistic severity and GI symptoms. By pyrosequencing the V2/V3 regions in bacterial 16S rDNA from fecal DNA samples, we compared gut microbiomes of GI symptom-free neurotypical children with those of autistic children mostly presenting GI symptoms. Unexpectedly, the presence of autistic symptoms, rather than the severity of GI symptoms, was associated with less diverse gut microbiomes. Further, rigorous statistical tests with multiple testing corrections showed significantly lower abundances of the genera Prevotella, Coprococcus, and unclassified Veillonellaceae in autistic samples. These are intriguingly versatile carbohydrate-degrading and/or fermenting bacteria, suggesting a potential influence of unusual diet patterns observed in autistic children. However, multivariate analyses showed that autism-related changes in both overall diversity and individual genus abundances were correlated with the presence of autistic symptoms but not with their diet patterns. Taken together, autism and accompanying GI symptoms were characterized by distinct and less diverse gut microbial compositions with lower levels of Prevotella, Coprococcus, and unclassified Veillonellaceae.

Contributors: Kang, Dae Wook (Author) / Park, Jin (Author) / Ilhan, Zehra (Author) / Wallstrom, Garrick (Author) / LaBaer, Joshua (Author) / Adams, James (Author) / Krajmalnik-Brown, Rosa (Author) / Biodesign Institute (Contributor)
Created: 2013-06-03
Description

Throughout the long history of virus-host co-evolution, viruses have developed delicate strategies to facilitate their invasion and replication of their genome, while silencing the host immune responses through various mechanisms. The systematic characterization of viral protein-host interactions would yield invaluable information in the understanding of viral invasion/evasion, diagnosis and therapeutic treatment of a viral infection, and mechanisms of host biology. With more than 2,000 viral genomes sequenced, only a small percent of them are well investigated. The access of these viral open reading frames (ORFs) in a flexible cloning format would greatly facilitate both in vitro and in vivo virus-host interaction studies. However, the overall progress of viral ORF cloning has been slow. To facilitate viral studies, we are releasing the initiation of our panviral proteome collection of 2,035 ORF clones from 830 viral genes in the Gateway® recombinational cloning system. Here, we demonstrate several uses of our viral collection including highly efficient production of viral proteins using human cell-free expression system in vitro, global identification of host targets for rubella virus using Nucleic Acid Programmable Protein Arrays (NAPPA) containing 10,000 unique human proteins, and detection of host serological responses using micro-fluidic multiplexed immunoassays. The studies presented here begin to elucidate host-viral protein interactions with our systemic utilization of viral ORFs, high-throughput cloning, and proteomic technologies. These valuable plasmid resources will be available to the research community to enable continued viral functional studies.

Contributors: Yu, Xiaobo (Author) / Bian, Xiaofang (Author) / Throop, Andrea (Author) / Song, Lusheng (Author) / del Moral, Lerys (Author) / Park, Jin (Author) / Seiler, Catherine (Author) / Fiacco, Michael (Author) / Steel, Jason (Author) / Hunter, Preston (Author) / Saul, Justin (Author) / Wang, Jie (Author) / Qiu, Ji (Author) / Pipas, James M. (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2013-11-30
Description

At the end of the Dark Ages, anatomy was taught as though everything that could be known was known. Scholars learned about what had been discovered rather than how to make discoveries. This was true even though the body (and the rest of biology) was very poorly understood. The Renaissance eventually brought a revolution in how scholars (and graduate students) were trained and worked. This revolution never occurred in K-12 or university education, such that we now teach young students in much the way that scholars were taught in the Dark Ages: we teach them what is already known rather than the process of knowing. Citizen science offers a way to change K-12 and university education and, in doing so, complete the Renaissance. Here we offer an example of such an approach and call for change in the way students are taught science, change that is more possible than it has ever been and is, nonetheless, five hundred years delayed.

Created: 2016-03-01
Description

Background: Modern advances in sequencing technology have enabled the census of microbial members of many natural ecosystems. Recently, attention is increasingly being paid to the microbial residents of human-made, built ecosystems, both private (homes) and public (subways, office buildings, and hospitals). Here, we report results of the characterization of the microbial ecology of a singular built environment, the International Space Station (ISS). This ISS sampling involved the collection and microbial analysis (via 16S rRNA gene PCR) of 15 surfaces sampled by swabs onboard the ISS. This sampling was a component of Project MERCCURI (Microbial Ecology Research Combining Citizen and University Researchers on ISS). Learning more about the microbial inhabitants of the “buildings” in which we travel through space will take on increasing importance, as plans for human exploration continue, with the possibility of colonization of other planets and moons.

Results: Sterile swabs were used to sample 15 surfaces onboard the ISS. The sites sampled were designed to be analogous to samples collected for (1) the Wildlife of Our Homes project and (2) a study of cell phones and shoes that were concurrently being collected for another component of Project MERCCURI. Sequencing of the 16S rRNA genes amplified from DNA extracted from each swab was used to produce a census of the microbes present on each surface sampled. We compared the microbes found on the ISS swabs to those from both homes on Earth and data from the Human Microbiome Project.

Conclusions: While significantly different from homes on Earth and the Human Microbiome Project samples analyzed here, the microbial community composition on the ISS was more similar to home surfaces than to the human microbiome samples. The ISS surfaces are OTU-rich with 1,036–4,294 operational taxonomic units (OTUs) per sample. There was no discernible biogeography of microbes on the 15 ISS surfaces, although this may be a reflection of the small sample size we were able to obtain.

Contributors: Lang, Jenna M. (Author) / Coil, David A. (Author) / Neches, Russell Y. (Author) / Brown, Wendy E. (Author) / Cavalier, Darlene (Author) / Severance, Mark (Author) / Hampton-Marcell, Jarrad T. (Author) / Gilbert, Jack A. (Author) / Eisen, Jonathan A. (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2017-12-05
Description

Rationale: Cell-free protein microarrays display naturally-folded proteins based on just-in-time in situ synthesis, and have made important contributions to basic and translational research. However, the risk of spot-to-spot cross-talk from protein diffusion during expression has limited the feature density of these arrays.

Methods: In this work, we developed the Multiplexed Nucleic Acid Programmable Protein Array (M-NAPPA), which significantly increases the number of displayed proteins by multiplexing as many as five different gene plasmids within a printed spot.

Results: Even when proteins of different sizes were displayed within the same feature, they were readily detected using protein-specific antibodies. Protein-protein interactions and serological antibody assays using human viral proteome microarrays demonstrated that comparable hits were detected by M-NAPPA and non-multiplexed NAPPA arrays. An ultra-high density proteome microarray displaying > 16k proteins on a single microscope slide was produced by combining M-NAPPA with a photolithography-based silicon nano-well platform. Finally, four new tuberculosis-related antigens in guinea pigs vaccinated with Bacillus Calmette-Guerin (BCG) were identified with M-NAPPA and validated with ELISA.

Conclusion: All data demonstrate that multiplexing features on a protein microarray offer a cost-effective fabrication approach and have the potential to facilitate high throughput translational research.

Contributors: Yu, Xiaobo (Author) / Song, Lusheng (Author) / Petritis, Brianne (Author) / Bian, Xiaofang (Author) / Wang, Haoyu (Author) / Viloria, Jennifer (Author) / Park, Jin (Author) / Bui, Hoang (Author) / Li, Han (Author) / Wang, Jie (Author) / Liu, Lei (Author) / Yang, Liuhui (Author) / Duan, Hu (Author) / McMurray, David N. (Author) / Achkar, Jacqueline M. (Author) / Magee, Mitch (Author) / Qiu, Ji (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2017-09-20
Description

Sera from patients with ovarian cancer contain autoantibodies (AAb) to tumor-derived proteins that are potential biomarkers for early detection. To detect AAb, we probed high-density programmable protein microarrays (NAPPA) expressing 5177 candidate tumor antigens with sera from patients with serous ovarian cancer (n = 34 cases/30 controls) and measured bound IgG. Of these, 741 antigens were selected and probed with an independent set of ovarian cancer sera (n = 60 cases/60 controls). Twelve potential autoantigens were identified with sensitivities ranging from 13 to 22% at >93% specificity. These were retested using a Luminex bead array using 60 cases and 60 controls, with sensitivities ranging from 0 to 31.7% at 95% specificity. Three AAb (p53, PTPRA, and PTGFR) had area under the curve (AUC) levels >60% (p < 0.01), with the partial AUC (SPAUC) over 5 times greater than for a nondiscriminating test (p < 0.01). Using a panel of the top three AAb (p53, PTPRA, and PTGFR), if at least two AAb were positive, then the sensitivity was 23.3% at 98.3% specificity. AAb to at least one of these top three antigens were also detected in 7/20 sera (35%) of patients with low CA 125 levels and 0/15 controls. AAb to p53, PTPRA, and PTGFR are potential biomarkers for the early detection of ovarian cancer.

Contributors: Anderson, Karen (Author) / Cramer, Daniel W. (Author) / Sibani, Sahar (Author) / Wallstrom, Garrick (Author) / Wong, Jessica (Author) / Park, Jin (Author) / Qiu, Ji (Author) / Vitonis, Allison (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2015-01-01
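
The panel rule in the ovarian cancer record above (call a serum positive when at least two of the three autoantibodies p53, PTPRA, and PTGFR are positive) can be illustrated with a minimal sketch. The per-sample marker patterns below are invented for illustration only. The counts of 14 positive cases and 1 positive control are assumptions back-computed from the reported 23.3% sensitivity and 98.3% specificity with 60 cases and 60 controls; they are not data taken from the study.

```python
from typing import Sequence, Tuple


def panel_positive(markers: Sequence[bool], min_positive: int = 2) -> bool:
    """Return True when at least `min_positive` markers in the panel are positive."""
    return sum(bool(m) for m in markers) >= min_positive


def sensitivity_specificity(
    cases: Sequence[Sequence[bool]],
    controls: Sequence[Sequence[bool]],
    min_positive: int = 2,
) -> Tuple[float, float]:
    """Sensitivity = panel-positive cases / all cases;
    specificity = panel-negative controls / all controls."""
    true_pos = sum(panel_positive(c, min_positive) for c in cases)
    true_neg = sum(not panel_positive(c, min_positive) for c in controls)
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical data: each tuple is (p53, PTPRA, PTGFR) for one serum sample.
# 14 of 60 cases and 1 of 60 controls are panel-positive (assumed counts).
cases = [(True, True, False)] * 14 + [(True, False, False)] * 46
controls = [(True, False, True)] * 1 + [(False, False, False)] * 59

sens, spec = sensitivity_specificity(cases, controls)
print(f"sensitivity={sens:.1%} specificity={spec:.1%}")
```

Requiring two positive markers rather than one trades sensitivity for specificity, which matches the low-sensitivity, high-specificity profile reported for the panel.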
Description

To address the need to study frozen clinical specimens using next-generation RNA, DNA, chromatin immunoprecipitation (ChIP) sequencing and protein analyses, we developed a biobank work flow to prospectively collect biospecimens from patients with renal cell carcinoma (RCC). We describe our standard operating procedures and work flow to annotate pathologic results and clinical outcomes. We report quality control outcomes and nucleic acid yields of our RCC submissions (N=16) to The Cancer Genome Atlas (TCGA) project, as well as newer discovery platforms, by describing mass spectrometry analysis of albumin oxidation in plasma and 6 ChIP sequencing libraries generated from nephrectomy specimens after histone H3 lysine 36 trimethylation (H3K36me3) immunoprecipitation. From June 1, 2010, through January 1, 2013, we enrolled 328 patients with RCC. Our mean (SD) TCGA RNA integrity numbers (RINs) were 8.1 (0.8) for papillary RCC, with a 12.5% overall rate of sample disqualification for RIN <7. Banked plasma had significantly less albumin oxidation (by mass spectrometry analysis) than plasma kept at 25°C (P<.001). For ChIP sequencing, the FastQC score for average read quality was at least 30 for 91% to 95% of paired-end reads. In parallel, we analyzed frozen tissue by RNA sequencing; after genome alignment, only 0.2% to 0.4% of total reads failed the default quality check steps of Bowtie2, which was comparable to the disqualification ratio (0.1%) of the 786-O RCC cell line that was prepared under optimal RNA isolation conditions. The overall correlation coefficients for gene expression between Mayo Clinic vs TCGA tissues ranged from 0.75 to 0.82. These data support the generation of high-quality nucleic acids for genomic analyses from banked RCC. Importantly, the protocol does not interfere with routine clinical care. Collections over defined time points during disease treatment further enhance collaborative efforts to integrate genomic information with outcomes.

Contributors: Ho, Thai H. (Author) / Nunez Nateras, Rafael (Author) / Yan, Huihuang (Author) / Park, Jin (Author) / Jensen, Sally (Author) / Borges, Chad (Author) / Lee, Jeong Heon (Author) / Champion, Mia D. (Author) / Tibes, Raoul (Author) / Bryce, Alan H. (Author) / Carballido, Estrella M. (Author) / Todd, Mark A. (Author) / Joseph, Richard W. (Author) / Wong, William W. (Author) / Parker, Alexander S. (Author) / Stanton, Melissa L. (Author) / Castle, Erik P. (Author) / Biodesign Institute (Contributor)
Created: 2015-07-16
Description

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time within a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the populations of users taking strategies A and B that approximates 2:1 is found to be optimal for the sustainable growth of communities.

Contributors: Wu, Lingfei (Author) / Baggio, Jacopo (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2016-03-02
Description

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession - but above all they are a form of codified self-regulation. While codes can be beneficial, this work argues that when we scratch below the surface, there are many problems at their root. In terms of efficacy, codes can serve as a form of ethical window dressing, rather than effective rules for behavior. But even more than that, codes can degrade the meaning behind being a good person who acts ethically for the right reasons.

Created: 2013-11-30