Linnorm: Improved Statistical Analysis for Single Cell RNA-seq Expression Data

Description

Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm was developed to remove technical noise while preserving biological variation in scRNA-seq data, so that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy.

Contributors: Yip, Shun H. (Author) / Wang, Panwen (Author) / Kocher, Jean-Pierre A. (Author) / Sham, Pak Chung (Author) / Wang, Junwen (Author) / College of Health Solutions (Contributor)

Created: 2017-09-18

Activation of E-Prostanoid 3 Receptor in Macrophages Facilitates Cardiac Healing After Myocardial Infarction

Description

Two distinct monocyte (Mo)/macrophage (Mp) subsets (Ly6C^low and Ly6C^hi) orchestrate the cardiac recovery process following myocardial infarction (MI). Prostaglandin (PG) E₂ is involved in the Mo/Mp-mediated inflammatory response; however, the role of its receptors in Mos/Mps in cardiac healing remains to be determined. Here we show that pharmacological inhibition or gene ablation of the Ep3 receptor in mice suppresses accumulation of Ly6C^low Mos/Mps in infarcted hearts. Ep3 deletion in Mos/Mps markedly attenuates healing after MI by reducing neovascularization in peri-infarct zones. Ep3 deficiency diminishes CX3C chemokine receptor 1 (CX3CR1) expression and vascular endothelial growth factor (VEGF) secretion in Mos/Mps by suppressing TGFβ1 signaling and subsequently inhibits Ly6C^low Mo/Mp migration and angiogenesis. Targeted overexpression of Ep3 receptors in Mos/Mps improves wound healing by enhancing angiogenesis. Thus, the PGE₂/Ep3 axis promotes cardiac healing after MI by activating reparative Ly6C^low Mos/Mps, indicating that Ep3 receptor activation may be a promising therapeutic target for acute MI.

Contributors: Tang, Juan (Author) / Shen, Yujun (Author) / Chen, Guilin (Author) / Wan, Qiangyou (Author) / Wang, Kai (Author) / Zhang, Jian (Author) / Qin, Jing (Author) / Liu, Guizhu (Author) / Zuo, Shengkai (Author) / Tao, Bo (Author) / Yu, Yu (Author) / Wang, Junwen (Author) / Lazarus, Michael (Author) / Yu, Ying (Author) / College of Health Solutions (Contributor)

Created: 2017-03-03

An Integrative Method to Decode Regulatory Logics in Gene Transcription

Description

Modeling of transcriptional regulatory networks (TRNs) has been increasingly used to dissect the nature of gene regulation. Inference of regulatory relationships among transcription factors (TFs) and genes, especially among multiple TFs, is still challenging. In this study, we introduced an integrative method, LogicTRN, to decode TF–TF interactions that form TF logics in regulating target genes. By combining cis-regulatory logics and transcriptional kinetics into a single model framework, LogicTRN can naturally integrate dynamic gene expression data and TF-DNA-binding signals in order to identify the TF logics and to reconstruct the underlying TRNs. We evaluated the newly developed methodology using simulation, comparison and application studies, and the results not only show its consistency with existing knowledge, but also demonstrate its ability to accurately reconstruct TRNs in complex biological systems.

Contributors: Yan, Bin (Author) / Guan, Daogang (Author) / Wang, Chao (Author) / Wang, Junwen (Author) / He, Bing (Author) / Qin, Jing (Author) / Boheler, Kenneth R. (Author) / Lu, Aiping (Author) / Zhang, Ge (Author) / Zhu, Hailong (Author) / College of Health Solutions (Contributor)

Created: 2017-10-19

Next-Generation Sequencing Methylation Profiling of Subjects With Obesity Identifies Novel Gene Changes

Description

Background: Obesity is a metabolic disease caused by environmental and genetic factors. However, the epigenetic mechanisms of obesity are incompletely understood. The aim of our study was to investigate the role of skeletal muscle DNA methylation in combination with transcriptomic changes in obesity.

Results: Muscle biopsies were obtained basally from lean (n = 12; BMI = 23.4 ± 0.7 kg/m²) and obese (n = 10; BMI = 32.9 ± 0.7 kg/m²) participants in combination with euglycemic-hyperinsulinemic clamps to assess insulin sensitivity. We performed reduced representation bisulfite sequencing (RRBS) next-generation methylation and microarray analyses on DNA and RNA isolated from vastus lateralis muscle biopsies. There were 13,130 differentially methylated cytosines (DMC; uncorrected P < 0.05) that were altered in the promoter and untranslated (5' and 3'UTR) regions in the obese versus lean analysis. Microarray analysis revealed 99 probes that were significantly (corrected P < 0.05) altered. Of these, 12 genes (encompassing 22 methylation sites) demonstrated a negative relationship between gene expression and DNA methylation. Specifically, sorbin and SH3 domain containing 3 (SORBS3), which codes for the adapter protein vinexin, was significantly decreased in gene expression (fold change −1.9) and had nine DMCs that were significantly increased in methylation in obesity (methylation differences ranged from 5.0 to 24.4%). Moreover, differentially methylated region (DMR) analysis identified a region in the 5'UTR (Chr.8:22,423,530–22,423,569) of SORBS3 that was increased in methylation by 11.2% in the obese group. The negative relationship observed between DNA methylation and gene expression for SORBS3 was validated by a site-specific sequencing approach, pyrosequencing, and qRT-PCR. Additionally, we performed transcription factor binding analysis and identified a number of transcription factors whose binding to the differentially methylated sites or region may contribute to obesity.

Conclusions: These results demonstrate that obesity alters the epigenome through DNA methylation and highlight novel transcriptomic changes in SORBS3 in skeletal muscle.

Contributors: Day, Samantha (Author) / Coletta, Rich (Author) / Kim, Joon Young (Author) / Campbell, Latoya (Author) / Benjamin, Tonya R. (Author) / Roust, Lori R. (Author) / De Filippis, Elena A. (Author) / Dinu, Valentin (Author) / Shaibi, Gabriel (Author) / Mandarino, Lawrence J. (Author) / Coletta, Dawn (Author) / College of Liberal Arts and Sciences (Contributor)

Created: 2016-07-18

Evaluating β Diversity as a Surrogate for Species Representation at Fine Scale

Description

Species turnover or β diversity is a conceptually attractive surrogate for conservation planning. However, there has been only 1 attempt to determine how well sites selected to maximize β diversity represent species, and that test was done at a scale too coarse (2,500 km² sites) to inform most conservation decisions. We used 8 plant datasets, 3 bird datasets, and 1 mammal dataset to evaluate whether sites selected to span β diversity will efficiently represent species at a finer scale (site sizes < 1 ha to 625 km²). We used ordinations to characterize dissimilarity in species assemblages (β diversity) among plots (inventory data) or among grid cells (atlas data). We then selected sites to maximize β diversity and used the Species Accumulation Index, SAI, to evaluate how efficiently the surrogate (selecting sites for maximum β diversity) represented species in the same taxon. Across all 12 datasets, sites selected for maximum β diversity represented species with a median efficiency of 24% (i.e., the surrogate was 24% more effective than random selection of sites), and an interquartile range of 4% to 41% efficiency. β diversity was a better surrogate for bird datasets than for plant datasets, and for atlas datasets with 10-km to 14-km grid cells than for atlas datasets with 25-km grid cells. We conclude that β diversity is more than a mere descriptor of how species are distributed on the landscape; in particular, β diversity might be useful to maximize the complementarity of a set of sites. Because we tested only within-taxon surrogacy, our results do not prove that β diversity is useful for conservation planning. But our results do justify further investigation to identify the circumstances in which β diversity performs well, and to evaluate it as a cross-taxon surrogate.

Contributors: Beier, Paul (Author) / Albuquerque, Fabio Suzart de (Author) / College of Integrative Sciences and Arts (Contributor)

Created: 2016-03-04

BitTorious Volunteer: Server-Side Extensions for Centrally-Managed Volunteer Storage in BitTorrent Swarms

Description

Background: Our publication of the BitTorious portal [1] demonstrated the ability to create a privatized distributed data warehouse of sufficient magnitude for real-world bioinformatics studies using minimal changes to the standard BitTorrent tracker protocol. In this second phase, we release a new server-side specification to accept anonymous philanthropic storage donations by the general public, wherein a small portion of each user’s local disk may be used for archival of scientific data. We have implemented the server-side announcement and control portions of this BitTorrent extension in v3.0.0 of the BitTorious portal, upon which compatible clients may be built.

Results: Automated test cases for the BitTorious Volunteer extensions have been added to the portal’s v3.0.0 release, supporting validation of the “peer affinity” concept and announcement protocol introduced by this specification. Additionally, a separate reference implementation of affinity calculation has been provided in C++ for informaticians wishing to integrate into libtorrent-based projects.

Conclusions: The BitTorrent “affinity” extensions as provided in the BitTorious portal reference implementation allow data publishers to crowdsource the extreme storage prerequisites for research in “big data” fields. With sufficient awareness and adoption of BitTorious Volunteer-based clients by the general public, the BitTorious portal may be able to provide peta-scale storage resources to the scientific community at relatively insignificant financial cost.

Contributors: Lee, Preston (Author) / Dinu, Valentin (Author) / College of Health Solutions (Contributor)

Created: 2015-11-04

BitTorious: Global Controlled Genomics Data Publication, Research, and Archiving Via BitTorrent Extensions

Description

Background: Centralized silos of genomic data are architecturally easier to initially design, develop and deploy than distributed models. However, as interoperability pains in EHR/EMR, HIE and other collaboration-centric life sciences domains have taught us, the core challenge of networking genomics systems is not in the construction of individual silos, but the interoperability of those deployments in a manner embracing the heterogeneous needs, terms and infrastructure of collaborating parties. This article demonstrates the adaptation of BitTorrent to private collaboration networks in an authenticated, authorized and encrypted manner while retaining the same characteristics of standard BitTorrent.

Results: The BitTorious portal was successfully used to manage many concurrent domestic BitTorrent clients across the United States, exchanging genomics data payloads in excess of 500 GiB using the uTorrent client software on Linux, OS X and Windows platforms. Individual nodes were sporadically interrupted to verify the resilience of the system to outages of a single client node as well as recovery of nodes resuming operation on intermittent Internet connections.

Conclusions: The authorization-based extension of BitTorrent and the accompanying BitTorious reference tracker and user management web portal provide a free, standards-based, general purpose and extensible data distribution system for large ‘omics collaborations.

Contributors: Lee, Preston (Author) / Dinu, Valentin (Author) / College of Health Solutions (Contributor)

Created: 2014-12-21

Cepip: Context-Dependent Epigenomic Weighting for Prioritization of Regulatory Variants and Disease-Associated Genes

Description

It remains challenging to predict regulatory variants in particular tissues or cell types due to highly context-specific gene regulation. By connecting large-scale epigenomic profiles to expression quantitative trait loci (eQTLs) in a wide range of human tissues/cell types, we identify critical chromatin features that predict variant regulatory potential. We present cepip, a joint likelihood framework, for estimating a variant’s regulatory probability in a context-dependent manner. Our method exhibits significant GWAS signal enrichment and is superior to existing cell type-specific methods. Furthermore, using phenotypically relevant epigenomes to weight the GWAS single-nucleotide polymorphisms, we improve the statistical power of the gene-based association test.

Contributors: Li, Mulin Jun (Author) / Li, Miaoxin (Author) / Liu, Zipeng (Author) / Yan, Bin (Author) / Pan, Zhicheng (Author) / Huang, Dandan (Author) / Liang, Qian (Author) / Ying, Dingge (Author) / Xu, Feng (Author) / Yao, Hongcheng (Author) / Wang, Panwen (Author) / Kocher, Jean-Pierre A. (Author) / Xia, Zhengyuan (Author) / Sham, Pak Chung (Author) / Liu, Jun S. (Author) / Wang, Junwen (Author) / College of Health Solutions (Contributor)

Created: 2017-03-16

The Geography of Hotspots of Rarity-Weighted Richness of Birds and Their Coverage by Natura 2000

Description

A major challenge for biogeographers and conservation planners is to identify where to best locate or distribute high-priority areas for conservation and to explore whether these areas are well represented by conservation actions such as protected areas (PAs). We aimed to identify high-priority areas for conservation, expressed as hotspots of rarity-weighted richness (HRR), that is, sites that efficiently represent species, for birds across EU countries, and to explore whether HRR are well represented by the Natura 2000 network. Natura 2000 is an evolving network of PAs that seeks to conserve biodiversity through the persistence of the most patrimonial species and habitats across Europe. This network includes Sites of Community Importance (SCI) and Special Areas of Conservation (SAC), where the latter regulated the designation of Special Protected Areas (SPA). Distribution maps for 416 bird species and complementarity-based approaches were used to map geographical patterns of rarity-weighted richness (RWR) and HRR for birds. We used the species accumulation index to evaluate whether RWR was an efficient surrogate for identifying HRR for birds. The results of our analysis support the proposition that prioritizing sites in order of RWR is a reliable way to identify sites that efficiently represent birds. HRR were concentrated in the Mediterranean Basin and in the alpine and boreal biogeographical regions of northern Europe. The cells with high RWR values did not correspond to cells where Natura 2000 was present. We suggest that patterns of RWR could become a focus for conservation biogeography. Our analysis demonstrates that identifying HRR is a robust approach for prioritizing management actions, and reveals the need for more conservation actions, especially on HRR.

Contributors: Albuquerque, Fabio Suzart de (Author) / Gregory, Andrew (Author) / College of Integrative Sciences and Arts (Contributor)

Created: 2017-04-05

Statistical Methods for Analyzing Immunosignatures

Description

Background: Immunosignaturing is a new peptide microarray based technology for profiling of humoral immune responses. Despite new challenges, immunosignaturing gives us the opportunity to explore new and fundamentally different research questions. In addition to classifying samples based on disease status, the complex patterns and latent factors underlying immunosignatures, which we attempt to model, may have a diverse range of applications.

Methods: We investigate the utility of a number of statistical methods to determine model performance and address challenges inherent in analyzing immunosignatures. Some of these methods include exploratory and confirmatory factor analyses, classical significance testing, structural equation and mixture modeling.

Results: We demonstrate an ability to classify samples based on disease status and show that immunosignaturing is a very promising technology for screening and presymptomatic screening of disease. In addition, we are able to model complex patterns and latent factors underlying immunosignatures. These latent factors may serve as biomarkers for disease and may play a key role in a bioinformatic method for antibody discovery.

Conclusion: Based on this research, we lay out an analytic framework illustrating how immunosignatures may be useful as a general method for screening and presymptomatic screening of disease as well as antibody discovery.

Contributors: Brown, Justin (Author) / Stafford, Phillip (Author) / Johnston, Stephen (Author) / Dinu, Valentin (Author) / College of Health Solutions (Contributor)

Created: 2011-08-19
