This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Description
In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/).
Contributors: Lessios-Damerow, Julia (Contributor) / Peirson, Erick (Contributor) / Laubichler, Manfred (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2017-09-28
Description

There are many proteomic applications that require large collections of purified protein, but parallel production of large numbers of different proteins remains a very challenging task. To help meet the needs of the scientific community, we have developed a human protein production pipeline. Using high-throughput (HT) methods, we transferred the genes of 31 full-length proteins into three expression vectors, and expressed the collection as N-terminal HaloTag fusion proteins in Escherichia coli and two commercial cell-free (CF) systems, wheat germ extract (WGE) and HeLa cell extract (HCE). Expression was assessed by labeling the fusion proteins specifically and covalently with a fluorescent HaloTag ligand and detecting its fluorescence on a LabChip® GX microfluidic capillary gel electrophoresis instrument. This automated, HT assay provided both qualitative and quantitative assessment of recombinant protein. E. coli was only capable of expressing 20% of the test collection in the supernatant fraction with ≥20 μg yields, whereas CF systems had ≥83% success rates. We purified expressed proteins using an automated HaloTag purification method. We purified 20, 33, and 42% of the test collection from E. coli, WGE, and HCE, respectively, with yields ≥1 μg and ≥90% purity. Based on these observations, we have developed a triage strategy for producing full-length human proteins in these three expression systems.

Contributors: Saul, Justin (Author) / Petritis, Brianne (Author) / Sau, Sujay (Author) / Rauf, Femina (Author) / Gaskin, Michael (Author) / Ober-Reynolds, Benjamin (Author) / Mineyev, Irina (Author) / Magee, Mitch (Author) / Chaput, John (Author) / Qiu, Ji (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2014-08-01
Description

Throughout the long history of virus-host co-evolution, viruses have developed delicate strategies to facilitate their invasion and the replication of their genomes, while silencing the host immune responses through various mechanisms. The systematic characterization of viral protein-host interactions would yield invaluable information for understanding viral invasion/evasion, the diagnosis and therapeutic treatment of viral infections, and mechanisms of host biology. Although more than 2,000 viral genomes have been sequenced, only a small percentage of them have been well investigated. Access to these viral open reading frames (ORFs) in a flexible cloning format would greatly facilitate both in vitro and in vivo virus-host interaction studies. However, the overall progress of viral ORF cloning has been slow. To facilitate viral studies, we are releasing the initial version of our panviral proteome collection of 2,035 ORF clones from 830 viral genes in the Gateway® recombinational cloning system. Here, we demonstrate several uses of our viral collection, including highly efficient production of viral proteins using a human cell-free expression system in vitro, global identification of host targets for rubella virus using Nucleic Acid Programmable Protein Arrays (NAPPA) containing 10,000 unique human proteins, and detection of host serological responses using micro-fluidic multiplexed immunoassays. The studies presented here begin to elucidate host-viral protein interactions through our systematic use of viral ORFs, high-throughput cloning, and proteomic technologies. These valuable plasmid resources will be available to the research community to enable continued viral functional studies.

Contributors: Yu, Xiaobo (Author) / Bian, Xiaofang (Author) / Throop, Andrea (Author) / Song, Lusheng (Author) / del Moral, Lerys (Author) / Park, Jin (Author) / Seiler, Catherine (Author) / Fiacco, Michael (Author) / Steel, Jason (Author) / Hunter, Preston (Author) / Saul, Justin (Author) / Wang, Jie (Author) / Qiu, Ji (Author) / Pipas, James M. (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2013-11-30
Description

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession, but above all they are a form of codified self-regulation. While codes can be beneficial, this paper argues that when we scratch below the surface, there are many problems at their root. In terms of efficacy, codes can serve as a form of ethical window dressing rather than as effective rules for behavior. But even more than that, codes can degrade the meaning behind being a good person who acts ethically for the right reasons.

Created: 2013-11-30
Description

We report a device to fill an array of small chemical reaction chambers (microreactors) with reagent and then seal them using pressurized viscous liquid acting through a flexible membrane. The device enables multiple, independent chemical reactions involving free-floating intermediate molecules without interference from neighboring reactions or external environments. The device is validated by protein expressed in situ directly from DNA in a microarray of ~10,000 spots with no diffusion during a three-hour incubation. Using the device to probe for an autoantibody cancer biomarker in a blood serum sample gave a five times higher signal-to-background ratio than a standard protein microarray expressed on a flat microscope slide. Physical design principles for effectively filling the array of microreactors with reagent and experimental results of alternate methods for sealing the microreactors are presented.

Contributors: Wiktor, Peter (Author) / Brunner, Al (Author) / Kahn, Peter (Author) / Qiu, Ji (Author) / Magee, Mitch (Author) / Bian, Xiaofang (Author) / Karthikeyan, Kailash (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2015-03-04
Description

Sera from patients with ovarian cancer contain autoantibodies (AAb) to tumor-derived proteins that are potential biomarkers for early detection. To detect AAb, we probed high-density programmable protein microarrays (NAPPA) expressing 5177 candidate tumor antigens with sera from patients with serous ovarian cancer (n = 34 cases/30 controls) and measured bound IgG. Of these, 741 antigens were selected and probed with an independent set of ovarian cancer sera (n = 60 cases/60 controls). Twelve potential autoantigens were identified with sensitivities ranging from 13 to 22% at >93% specificity. These were retested using a Luminex bead array using 60 cases and 60 controls, with sensitivities ranging from 0 to 31.7% at 95% specificity. Three AAb (p53, PTPRA, and PTGFR) had area under the curve (AUC) levels >60% (p < 0.01), with the partial AUC (SPAUC) over 5 times greater than for a nondiscriminating test (p < 0.01). Using a panel of the top three AAb (p53, PTPRA, and PTGFR), if at least two AAb were positive, then the sensitivity was 23.3% at 98.3% specificity. AAb to at least one of these top three antigens were also detected in 7/20 sera (35%) of patients with low CA 125 levels and 0/15 controls. AAb to p53, PTPRA, and PTGFR are potential biomarkers for the early detection of ovarian cancer.
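
The panel rule described above (call a serum positive when at least two of the three autoantibodies are positive) can be illustrated with a minimal sketch. The function names and the per-sample marker calls below are hypothetical and synthetic; they are not the study's data or its analysis code, and the positivity cutoffs for individual markers are assumed to have been applied already.

```python
def panel_positive(marker_calls, min_positive=2):
    """Return True when at least `min_positive` markers are positive.

    `marker_calls` is a tuple of booleans, one per autoantibody
    (for example p53, PTPRA, PTGFR), already thresholded.
    """
    return sum(marker_calls) >= min_positive


def sensitivity_specificity(case_calls, control_calls, min_positive=2):
    """Sensitivity = positive cases / all cases.
    Specificity = negative controls / all controls."""
    true_pos = sum(panel_positive(c, min_positive) for c in case_calls)
    true_neg = sum(not panel_positive(c, min_positive) for c in control_calls)
    return true_pos / len(case_calls), true_neg / len(control_calls)


# Synthetic marker calls, for illustration only (not study data).
cases = [
    (True, True, False),
    (True, False, False),
    (False, False, False),
    (True, True, True),
]
controls = [
    (False, False, False),
    (True, False, False),
    (False, False, False),
    (True, True, False),
]

sens, spec = sensitivity_specificity(cases, controls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising `min_positive` generally trades sensitivity for specificity, which matches the direction of the panel result reported in the abstract.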

Contributors: Anderson, Karen (Author) / Cramer, Daniel W. (Author) / Sibani, Sahar (Author) / Wallstrom, Garrick (Author) / Wong, Jessica (Author) / Park, Jin (Author) / Qiu, Ji (Author) / Vitonis, Allison (Author) / LaBaer, Joshua (Author) / Biodesign Institute (Contributor)
Created: 2015-01-01
Description

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time within a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the populations of users taking strategies A and B that approximates 2:1 is found to be optimal for the sustainable growth of communities.

Contributors: Wu, Lingfei (Author) / Baggio, Jacopo (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2016-03-02
Description

Background: While prior studies have quantified the mortality burden of the 1957 H2N2 influenza pandemic at broad geographic regions in the United States, little is known about the pandemic impact at a local level. Here we focus on analyzing the transmissibility and mortality burden of this pandemic in Arizona, a setting where the dry climate was promoted as reducing respiratory illness transmission yet tuberculosis prevalence was high.

Methods: Using archival death certificates from 1954 to 1961, we quantified the age-specific seasonal patterns, excess-mortality rates, and transmissibility patterns of the 1957 H2N2 pandemic in Maricopa County, Arizona. By applying cyclical Serfling linear regression models to weekly mortality rates, the excess-mortality rates due to respiratory and all-causes were estimated for each age group during the pandemic period. The reproduction number was quantified from weekly data using a simple growth rate method and assumed generation intervals of 3 and 4 days. Local newspaper articles published during 1957–1958 were also examined.
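
The growth rate method mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard relations between the exponential growth rate r and the reproduction number R for a mean generation interval T, namely R = 1 + rT when the generation interval is exponentially distributed and R = exp(rT) when it is fixed. The weekly counts are synthetic and the function names are hypothetical.

```python
import math


def fit_growth_rate(weekly_counts):
    """Per-day exponential growth rate r, from a least-squares fit of
    log(count) against time in days (one observation every 7 days)."""
    days = [7.0 * k for k in range(len(weekly_counts))]
    logs = [math.log(c) for c in weekly_counts]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return sxy / sxx


def reproduction_number_exponential(r, generation_interval):
    """R for an exponentially distributed generation interval: R = 1 + r*T."""
    return 1.0 + r * generation_interval


def reproduction_number_fixed(r, generation_interval):
    """R for a fixed (delta-distributed) generation interval: R = exp(r*T)."""
    return math.exp(r * generation_interval)


# Synthetic weekly counts growing at 0.03 per day (illustration only).
counts = [10 * math.exp(0.03 * 7 * k) for k in range(5)]
r = fit_growth_rate(counts)
for T in (3, 4):  # generation intervals in days, as assumed in the abstract
    print(T,
          round(reproduction_number_exponential(r, T), 3),
          round(reproduction_number_fixed(r, T), 3))
```

For small values of rT the two formulas give similar estimates, which is consistent with the narrow range of reproduction numbers the abstract reports across interval assumptions.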

Results: Excess-mortality rates varied between waves, age groups, and causes of death, but overall remained low. From October 1959 to June 1960, the most severe wave of the pandemic, the absolute excess-mortality rate based on respiratory deaths per 10,000 population was 16.59 in the elderly (≥65 years). All other age groups exhibited very low excess mortality, and the typical U-shaped age pattern was absent. However, the standardized mortality ratio was greatest (4.06) among children and young adolescents (5–14 years) from October 1957 to March 1958, based on mortality rates of respiratory deaths. Transmissibility was greatest during the same 1957–1958 period, when the mean reproduction number was estimated at 1.08–1.11, assuming 3- or 4-day generation intervals with exponential or fixed distributions.

Conclusions: Maricopa County exhibited very low mortality impact associated with the 1957 influenza pandemic. Understanding the relatively low excess-mortality rates and transmissibility in Maricopa County during this historic pandemic may help public health officials prepare for and mitigate future outbreaks of influenza.

Contributors: Cobos, April (Author) / Nelson, Clinton (Author) / Jehn, Megan (Author) / Viboud, Cecile (Author) / Chowell-Puente, Gerardo (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2016-08-11
Description

Background: On 31 March 2013, the first human infections with the novel influenza A/H7N9 virus were reported in Eastern China. The outbreak expanded rapidly in geographic scope and size, with a total of 132 laboratory-confirmed cases reported by 3 June 2013, in 10 Chinese provinces and Taiwan. The incidence of A/H7N9 cases has stalled in recent weeks, presumably as a consequence of live bird market closures in the most heavily affected areas. Here we compare the transmission potential of influenza A/H7N9 with that of other emerging pathogens and evaluate the impact of intervention measures in an effort to guide pandemic preparedness.

Methods: We used a Bayesian approach combined with a SEIR (Susceptible-Exposed-Infectious-Removed) transmission model fitted to daily case data to assess the reproduction number (R) of A/H7N9 by province and to evaluate the impact of live bird market closures in April and May 2013. Simulation studies helped quantify the performance of our approach in the context of an emerging pathogen, where human-to-human transmission is limited and most cases arise from spillover events. We also used alternative approaches to estimate R based on individual-level information on prior exposure and compared the transmission potential of influenza A/H7N9 with that of other recent zoonoses.
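
To make the SEIR structure named above concrete, here is a minimal deterministic sketch using forward Euler steps. It is an assumption-laden illustration only: it omits the Bayesian fitting and the spillover (animal-to-human) term that the abstract describes, and all parameter values, the population size, and the function name `simulate_seir` are hypothetical. In this simple form the reproduction number is beta / gamma.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, rem0, days, dt=0.1):
    """Deterministic SEIR model integrated with forward Euler steps.

    beta  : transmission rate per day
    sigma : rate of leaving the exposed class (1 / latent period)
    gamma : rate of removal (1 / infectious period)
    Returns one (S, E, I, Removed) tuple per day, including day 0.
    """
    s, e, i, rem = float(s0), float(e0), float(i0), float(rem0)
    n = s + e + i + rem
    history = [(s, e, i, rem)]
    steps_per_day = int(round(1 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            rem += new_removed
        history.append((s, e, i, rem))
    return history


# Hypothetical parameters: R = beta / gamma = 0.1, below the epidemic threshold.
gamma = 1 / 5          # assumed 5-day infectious period
sigma = 1 / 3          # assumed 3-day latent period
beta = 0.1 * gamma
trajectory = simulate_seir(beta, sigma, gamma,
                           s0=1_000_000, e0=0, i0=10, rem0=0, days=60)
print(trajectory[-1])
```

With the reproduction number below 1, the exposed and infectious compartments shrink over time, so chains of human-to-human transmission die out unless new spillover cases are introduced.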

Results: Estimates of R for the A/H7N9 outbreak were below the epidemic threshold required for sustained human-to-human transmission and remained near 0.1 throughout the study period, with broad 95% credible intervals by the Bayesian method (0.01 to 0.49). The Bayesian estimation approach was dominated by the prior distribution, however, due to relatively little information contained in the case data. We observe a statistically significant deceleration in growth rate after 6 April 2013, which is consistent with a reduction in A/H7N9 transmission associated with the preemptive closure of live bird markets. Although confidence intervals are broad, the estimated transmission potential of A/H7N9 appears lower than that of recent zoonotic threats, including avian influenza A/H5N1, swine influenza H3N2sw and Nipah virus.

Conclusion: Although uncertainty remains high in R estimates for H7N9 due to limited epidemiological information, all available evidence points to a low transmission potential. Continued monitoring of the transmission potential of A/H7N9 is critical in the coming months as intervention measures may be relaxed and seasonal factors could promote disease transmission in colder months.

Created: 2013-10-02
Description

Background: The impact of socio-demographic factors and baseline health on the mortality burden of seasonal and pandemic influenza remains debated. Here we analyzed the spatial-temporal mortality patterns of the 1918 influenza pandemic in Spain, one of the countries of Europe that experienced the highest mortality burden.

Methods: We analyzed monthly death rates from respiratory diseases and all-causes across 49 provinces of Spain, including the Canary and Balearic Islands, during the period January-1915 to June-1919. We estimated the influenza-related excess death rates and risk of death relative to baseline mortality by pandemic wave and province. We then explored the association between pandemic excess mortality rates and health and socio-demographic factors, which included population size and age structure, population density, infant mortality rates, baseline death rates, and urbanization.

Results: Our analysis revealed high geographic heterogeneity in pandemic mortality impact. We identified 3 pandemic waves of varying timing and intensity covering the period from Jan-1918 to Jun-1919, with the highest pandemic-related excess mortality rates occurring during the months of October-November 1918 across all Spanish provinces. Cumulative excess mortality rates followed a south–north gradient after controlling for demographic factors, with the North experiencing highest excess mortality rates. A model that included latitude, population density, and the proportion of children living in provinces explained about 40% of the geographic variability in cumulative excess death rates during 1918–19, but different factors explained mortality variation in each wave.

Conclusions: A substantial fraction of the variability in excess mortality rates across Spanish provinces remained unexplained, which suggests that other unidentified factors such as comorbidities, climate and background immunity may have affected the 1918-19 pandemic mortality rates. Further archeo-epidemiological research should concentrate on identifying settings with combined availability of local historical mortality records and information on the prevalence of underlying risk factors, or patient-level clinical data, to further clarify the drivers of 1918 pandemic influenza mortality.

Created: 2014-07-05