This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Description
In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/).
Contributors: Lessios-Damerow, Julia (Contributor) / Peirson, Erick (Contributor) / Laubichler, Manfred (Contributor) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2017-09-28
Description

MicroRNAs (miRNAs) are short non-coding RNAs that regulate gene output at the post-transcriptional level by targeting degenerate elements primarily in 3′untranslated regions (3′UTRs) of mRNAs. Individual miRNAs can regulate networks of hundreds of genes, yet for the majority of miRNAs few, if any, targets are known. Misexpression of miRNAs is also a major contributor to cancer progression; thus, there is a critical need to validate miRNA targets in high-throughput to understand miRNAs' contribution to tumorigenesis. Here we introduce a novel high-throughput assay to detect miRNA targets in 3′UTRs, called Luminescent Identification of Functional Elements in 3′UTRs (3′LIFE). We demonstrate the feasibility of 3′LIFE using a data set of 275 human 3′UTRs and two cancer-relevant miRNAs, let-7c and miR-10b, and compare our results to alternative methods to detect miRNA targets throughout the genome. We identify a large number of novel gene targets for these miRNAs, with only 32% of hits being bioinformatically predicted and 27% directed by non-canonical interactions. Functional analysis of target genes reveals consistent roles for each miRNA as either a tumor suppressor (let-7c) or oncogenic miRNA (miR-10b), and shows that these miRNAs preferentially target multiple genes within regulatory networks, suggesting 3′LIFE is a rapid and sensitive method to detect miRNA targets in high-throughput.

Contributors: Wolter, Justin (Author) / Kotagama, Kasuen (Author) / Pierre-Bez, Alexandra C. (Author) / Firago, Mari (Author) / Mangone, Marco (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2014-09-29
Description

Adaptation requires genetic variation, but founder populations are generally genetically depleted. Here we sequence two populations of an inbred ant that diverge in phenotype to determine how variability is generated. Cardiocondyla obscurior has the smallest of the sequenced ant genomes and its structure suggests a fundamental role of transposable elements (TEs) in adaptive evolution. Accumulations of TEs (TE islands) comprising 7.18% of the genome evolve faster than other regions with regard to single-nucleotide variants, gene/exon duplications and deletions and gene homology. A non-random distribution of gene families, larvae/adult specific gene expression and signs of differential methylation in TE islands indicate intragenomic differences in regulation, evolutionary rates and coalescent effective population size. Our study reveals a tripartite interplay between TEs, life history and adaptation in an invasive species.

Contributors: Schrader, Lukas (Author) / Kim, Jay W. (Author) / Ence, Daniel (Author) / Zimin, Aleksey (Author) / Klein, Antonia (Author) / Wyschetzki, Katharina (Author) / Weichselgartner, Tobias (Author) / Kemena, Carsten (Author) / Stoekl, Johannes (Author) / Schultner, Eva (Author) / Wurm, Yannick (Author) / Smith, Christopher D. (Author) / Yandell, Mark (Author) / Heinze, Juergen (Author) / Gadau, Juergen (Author) / Oettler, Jan (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2014-12-01
Description

What's a profession without a code of ethics? Being a legitimate profession almost requires drafting a code and, at least nominally, making members follow it. Codes of ethics (henceforth “codes”) exist for a number of reasons, many of which can vary widely from profession to profession - but above all they are a form of codified self-regulation. While codes can be beneficial, this paper argues that when we scratch below the surface, there are many problems at their root. In terms of efficacy, codes can serve as a form of ethical window dressing, rather than effective rules for behavior. But even more than that, codes can degrade the meaning behind being a good person who acts ethically for the right reasons.

Created: 2013-11-30
Description

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills among users. A balance between the populations of users taking strategies A and B that approximates 2:1 is found to be optimal for the sustainable growth of communities.

Contributors: Wu, Lingfei (Author) / Baggio, Jacopo (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2016-03-02
Description

The unicellular microalga Haematococcus pluvialis has emerged as a promising biomass feedstock for the ketocarotenoid astaxanthin and neutral lipid triacylglycerol. Motile flagellates, resting palmella cells, and cysts are the major life cycle stages of H. pluvialis. Fast-growing motile cells are usually used to induce astaxanthin and triacylglycerol biosynthesis under stress conditions (high light or nutrient starvation); however, productivity of biomass and bioproducts is compromised due to the susceptibility of motile cells to stress. This study revealed that the Photosystem II (PSII) reaction center D1 protein, the manganese-stabilizing protein PsbO, and several major membrane glycerolipids (particularly the chloroplast membrane lipids monogalactosyldiacylglycerol and phosphatidylglycerol) decreased dramatically in motile cells under high light (HL). In contrast, palmella cells, which are transformed from motile cells after an extended period of time under favorable growth conditions, have developed multiple protective mechanisms - including reduction in chloroplast membrane lipid content, downplay of linear photosynthetic electron transport, and activation of nonphotochemical quenching mechanisms - while accumulating triacylglycerol. Consequently, the membrane lipids and PSII proteins (D1 and PsbO) remained relatively stable in palmella cells subjected to HL. Introducing palmella instead of motile cells to stress conditions may greatly increase astaxanthin and lipid production in H. pluvialis culture.

Contributors: Wang, Baobei (Author) / Zhang, Zhen (Author) / Hu, Qiang (Author) / Sommerfeld, Milton (Author) / Lu, Yinghua (Author) / Han, Danxiang (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2014-09-15
Description

Background: Lizards are evolutionarily the most closely related vertebrates to humans that can lose and regrow an entire appendage. Regeneration in lizards involves differential expression of hundreds of genes that regulate wound healing, musculoskeletal development, hormonal response, and embryonic morphogenesis. While microRNAs are able to regulate large groups of genes, their role in lizard regeneration has not been investigated.

Results: MicroRNA sequencing of green anole lizard (Anolis carolinensis) regenerating tail and associated tissues revealed 350 putative novel and 196 known microRNA precursors. Eleven microRNAs were differentially expressed between the regenerating tail tip and base during maximum outgrowth (25 days post autotomy), including miR-133a, miR-133b, and miR-206, which have been reported to regulate regeneration and stem cell proliferation in other model systems. Three putative novel differentially expressed microRNAs were identified in the regenerating tail tip.

Conclusions: Differentially expressed microRNAs were identified in the regenerating lizard tail, including known regulators of stem cell proliferation. The identification of 3 putative novel microRNAs suggests that regulatory networks, either conserved in vertebrates and previously uncharacterized or specific to lizards, are involved in regeneration. These findings suggest that differential regulation of microRNAs may play a role in coordinating the timing and expression of hundreds of genes involved in regeneration.

Contributors: Hutchins, Elizabeth (Author) / Eckalbar, Walter (Author) / Wolter, Justin (Author) / Mangone, Marco (Author) / Kusumi, Kenro (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2016-05-05
Description

Background: Tissue-specific RNA plasticity broadly impacts the development, tissue identity and adaptability of all organisms, but changes in composition, expression levels and its impact on gene regulation in different somatic tissues are largely unknown. Here we developed a new method, polyA-tagging and sequencing (PAT-Seq) to isolate high-quality tissue-specific mRNA from Caenorhabditis elegans intestine, pharynx and body muscle tissues and study changes in their tissue-specific transcriptomes and 3’UTRomes.

Results: We have identified thousands of novel genes and isoforms differentially expressed between these three tissues. The intestine transcriptome is expansive, expressing over 30% of C. elegans mRNAs, while muscle transcriptomes are smaller but contain characteristic unique gene signatures. Active promoter regions in all three tissues reveal both known and novel enriched tissue-specific elements, along with putative transcription factors, suggesting novel tissue-specific modes of transcription initiation. We have precisely mapped approximately 20,000 tissue-specific polyadenylation sites and discovered that about 30% of transcripts in somatic cells use alternative polyadenylation in a tissue-specific manner, with their 3’UTR isoforms significantly enriched with microRNA targets.

Conclusions: For the first time, PAT-Seq allowed us to directly study tissue specific gene expression changes in an in vivo setting and compare these changes between three somatic tissues from the same organism at single-base resolution within the same experiment. We pinpoint precise tissue-specific transcriptome rearrangements and for the first time link tissue-specific alternative polyadenylation to miRNA regulation, suggesting novel and unexplored tissue-specific post-transcriptional regulatory networks in somatic cells.

Contributors: Blazie, Stephen (Author) / Babb, Cody (Author) / Wilky, Henry (Author) / Rawls, Alan (Author) / Park, Jin (Author) / Mangone, Marco (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2015-01-20
Description

Background: 3′untranslated regions (3′UTRs) are poorly understood portions of eukaryotic mRNAs essential for post-transcriptional gene regulation. Sequence elements in 3′UTRs can be target sites for regulatory molecules such as RNA binding proteins and microRNAs (miRNAs), and these interactions can exert significant control on gene networks. However, many such interactions remain uncharacterized due to a lack of high-throughput (HT) tools to study 3′UTR biology. HT cloning efforts such as the human ORFeome exemplify the potential benefits of genomic repositories for studying human disease, especially in relation to the discovery of biomarkers and targets for therapeutic agents. Currently there are no publicly available human 3′UTR libraries. To address this we have prepared the first version of the human 3′UTRome (h3′UTRome v1) library. The h3′UTRome is produced to a single high quality standard using the same recombinational cloning technology used for the human ORFeome, enabling universal operating methods and high throughput experimentation. The library is thoroughly sequenced and annotated with simple online access to information, and made publicly available through gene repositories at low cost to all scientists with minimal restriction.

Results: The first release of the h3′UTRome library comprises 1,461 human 3′UTRs cloned into Gateway® entry vectors, ready for downstream analyses. It contains 3′UTRs for 985 transcription factors, 156 kinases, 171 RNA binding proteins, and 186 other genes involved in gene regulation and in disease. We demonstrate the feasibility of the h3′UTRome library by screening a panel of 87 3′UTRs for targeting by two miRNAs: let-7c, which is implicated in tumorigenesis, and miR-221, which is implicated in atherosclerosis and heart disease. The panel is enriched with genes involved in the RAS signaling pathway, putative novel targets for the two miRNAs, as well as genes implicated in tumorigenesis and heart disease.

Conclusions: The h3′UTRome v1 library is a modular resource that can be utilized for high-throughput screens to identify regulatory interactions between trans-acting factors and 3′UTRs. Importantly, the library can be customized based on the specifications of the researcher, allowing the systematic study of human 3′UTR biology.

Contributors: Kotagama, Kasuen (Author) / Babb, Cody (Author) / Wolter, Justin (Author) / Murphy, Ronan P. (Author) / Mangone, Marco (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2015-12-09
Description

On-going efforts to understand the dynamics of coupled social-ecological (or more broadly, coupled infrastructure) systems and common pool resources have led to the generation of numerous datasets based on a large number of case studies. These data have facilitated the identification of important factors and fundamental principles which increase our understanding of such complex systems. However, the data at our disposal are often not easily comparable, have limited scope and scale, and are based on disparate underlying frameworks, inhibiting synthesis, meta-analysis, and the validation of findings. Research efforts are further hampered when case inclusion criteria, variable definitions, coding schema, and inter-coder reliability testing are not made explicit in the presentation of research and shared among the research community. This paper first outlines challenges experienced by researchers engaged in a large-scale coding project; then highlights valuable lessons learned; and finally discusses opportunities for further research on comparative case study analysis focusing on social-ecological systems and common pool resources. Includes supplemental materials and appendices published in the International Journal of the Commons 2016 Special Issue. Volume 10 - Issue 2 - 2016.

Contributors: Ratajczyk, Elicia (Author) / Brady, Ute (Author) / Baggio, Jacopo (Author) / Barnett, Allain J. (Author) / Perez Ibarra, Irene (Author) / Rollins, Nathan (Author) / Rubinos, Cathy (Author) / Shin, Hoon Cheol (Author) / Yu, David (Author) / Aggarwal, Rimjhim (Author) / Anderies, John (Author) / Janssen, Marco (Author) / ASU-SFI Center for Biosocial Complex Systems (Contributor)
Created: 2016-09-09