Search Content

Linnorm: Improved Statistical Analysis for Single Cell RNA-seq Expression Data

Description

Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noises and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing…

Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noises and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy.

ContributorsYip, Shun H. (Author) / Wang, Panwen (Author) / Kocher, Jean-Pierre A. (Author) / Sham, Pak Chung (Author) / Wang, Junwen (Author) / College of Health Solutions (Contributor)

Created2017-09-18

Scaling Behaviours in the Growth of Networked Systems and Their Geometric Origins

Description

Two classes of scaling behaviours, namely the super-linear scaling of links or activities, and the sub-linear scaling of area, diversity, or time elapsed with respect to size have been found to prevail in the growth of complex networked systems. Despite some pioneering modelling approaches proposed for specific systems, whether there…

Two classes of scaling behaviours, namely the super-linear scaling of links or activities, and the sub-linear scaling of area, diversity, or time elapsed with respect to size have been found to prevail in the growth of complex networked systems. Despite some pioneering modelling approaches proposed for specific systems, whether there exists some general mechanisms that account for the origins of such scaling behaviours in different contexts, especially in socioeconomic systems, remains an open question. We address this problem by introducing a geometric network model without free parameter, finding that both super-linear and sub-linear scaling behaviours can be simultaneously reproduced and that the scaling exponents are exclusively determined by the dimension of the Euclidean space in which the network is embedded. We implement some realistic extensions to the basic model to offer more accurate predictions for cities of various scaling behaviours and the Zipf distribution reported in the literature and observed in our empirical studies. All of the empirical results can be precisely recovered by our model with analytical predictions of all major properties. By virtue of these general findings concerning scaling behaviour, our models with simple mechanisms gain new insights into the evolution and development of complex networked systems.

ContributorsZhang, Jiang (Author) / Li, Xintong (Author) / Wang, Xinran (Author) / Wang, Wen-Xu (Author) / Wu, Lingfei (Author) / College of Liberal Arts and Sciences (Contributor)

Created2015-04-29

Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium Vivax From Colombia

Description

Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and…

Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Cordóba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy to score microsatellite loci that can be used in epidemiological investigations in South America.

ContributorsWinter, David (Author) / Pacheco, Maria Andreina (Author) / Vallejo, Andres F. (Author) / Schwartz, Rachel (Author) / Arevalo-Herrera, Myriam (Author) / Herrera, Socrates (Author) / Cartwright, Reed (Author) / Escalante, Ananias (Author) / Biodesign Institute (Contributor)

Created2015-12-28

Activation of E-Prostanoid 3 Receptor in Macrophages Facilitates Cardiac Healing After Myocardial Infarction

Description

Two distinct monocyte (Mo)/macrophage (Mp) subsets (Ly6C^low and Ly6C^hi) orchestrate cardiac recovery process following myocardial infarction (MI). Prostaglandin (PG) E₂ is involved in the Mo/Mp-mediated inflammatory response, however, the role of its receptors in Mos/Mps in cardiac healing remains to be determined. Here we show that pharmacological inhibition or gene…

Two distinct monocyte (Mo)/macrophage (Mp) subsets (Ly6C^low and Ly6C^hi) orchestrate cardiac recovery process following myocardial infarction (MI). Prostaglandin (PG) E₂ is involved in the Mo/Mp-mediated inflammatory response, however, the role of its receptors in Mos/Mps in cardiac healing remains to be determined. Here we show that pharmacological inhibition or gene ablation of the Ep3 receptor in mice suppresses accumulation of Ly6C^low Mos/Mps in infarcted hearts. Ep3 deletion in Mos/Mps markedly attenuates healing after MI by reducing neovascularization in peri-infarct zones. Ep3 deficiency diminishes CX3C chemokine receptor 1 (CX3CR1) expression and vascular endothelial growth factor (VEGF) secretion in Mos/Mps by suppressing TGFβ1 signaling and subsequently inhibits Ly6C^low Mos/Mps migration and angiogenesis. Targeted overexpression of Ep3 receptors in Mos/Mps improves wound healing by enhancing angiogenesis. Thus, the PGE₂/Ep3 axis promotes cardiac healing after MI by activating reparative Ly6C^low Mos/Mps, indicating that Ep3 receptor activation may be a promising therapeutic target for acute MI.

ContributorsTang, Juan (Author) / Shen, Yujun (Author) / Chen, Guilin (Author) / Wan, Qiangyou (Author) / Wang, Kai (Author) / Zhang, Jian (Author) / Qin, Jing (Author) / Liu, Guizhu (Author) / Zuo, Shengkai (Author) / Tao, Bo (Author) / Yu, Yu (Author) / Wang, Junwen (Author) / Lazarus, Michael (Author) / Yu, Ying (Author) / College of Health Solutions (Contributor)

Created2017-03-03

The Impact of Self-Incompatibility Systems on the Prevention of Biparental Inbreeding

Description

Inbreeding in hermaphroditic plants can occur through two different mechanisms: biparental inbreeding, when a plant mates with a related individual, or self-fertilization, when a plant mates with itself. To avoid inbreeding, many hermaphroditic plants have evolved self-incompatibility (SI) systems which prevent or limit self-fertilization. One particular SI system—homomorphic SI—can also…

Inbreeding in hermaphroditic plants can occur through two different mechanisms: biparental inbreeding, when a plant mates with a related individual, or self-fertilization, when a plant mates with itself. To avoid inbreeding, many hermaphroditic plants have evolved self-incompatibility (SI) systems which prevent or limit self-fertilization. One particular SI system—homomorphic SI—can also reduce biparental inbreeding. Homomorphic SI is found in many angiosperm species, and it is often assumed that the additional benefit of reduced biparental inbreeding may be a factor in the success of this SI system. To test this assumption, we developed a spatially-explicit, individual-based simulation of plant populations that displayed three different types of homomorphic SI. We measured the total level of inbreeding avoidance by comparing each population to a self-compatible population (NSI), and we measured biparental inbreeding avoidance by comparing to a population of self-incompatible plants that were free to mate with any other individual (PSI).

Because biparental inbreeding is more common when offspring dispersal is limited, we examined the levels of biparental inbreeding over a range of dispersal distances. We also tested whether the introduction of inbreeding depression affected the level of biparental inbreeding avoidance. We found that there was a statistically significant decrease in autozygosity in each of the homomorphic SI populations compared to the PSI population and, as expected, this was more pronounced when seed and pollen dispersal was limited. However, levels of homozygosity and inbreeding depression were not reduced. At low dispersal, homomorphic SI populations also suffered reduced female fecundity and had smaller census population sizes. Overall, our simulations showed that the homomorphic SI systems had little impact on the amount of biparental inbreeding in the population especially when compared to the overall reduction in inbreeding compared to the NSI population. With further study, this observation may have important consequences for research into the origin and evolution of homomorphic self-incompatibility systems.

ContributorsFurstenau, Tara (Author) / Cartwright, Reed (Author) / Biodesign Institute (Contributor)

Created2017-11-24

An Integrative Method to Decode Regulatory Logics in Gene Transcription

Description

Modeling of transcriptional regulatory networks (TRNs) has been increasingly used to dissect the nature of gene regulation. Inference of regulatory relationships among transcription factors (TFs) and genes, especially among multiple TFs, is still challenging. In this study, we introduced an integrative method, LogicTRN, to decode TF–TF interactions that form TF…

Modeling of transcriptional regulatory networks (TRNs) has been increasingly used to dissect the nature of gene regulation. Inference of regulatory relationships among transcription factors (TFs) and genes, especially among multiple TFs, is still challenging. In this study, we introduced an integrative method, LogicTRN, to decode TF–TF interactions that form TF logics in regulating target genes. By combining cis-regulatory logics and transcriptional kinetics into one single model framework, LogicTRN can naturally integrate dynamic gene expression data and TF-DNA-binding signals in order to identify the TF logics and to reconstruct the underlying TRNs. We evaluated the newly developed methodology using simulation, comparison and application studies, and the results not only show their consistence with existing knowledge, but also demonstrate its ability to accurately reconstruct TRNs in biological complex systems.

ContributorsYan, Bin (Author) / Guan, Daogang (Author) / Wang, Chao (Author) / Wang, Junwen (Author) / He, Bing (Author) / Qin, Jing (Author) / Boheler, Kenneth R. (Author) / Lu, Aiping (Author) / Zhang, Ge (Author) / Zhu, Hailong (Author) / College of Health Solutions (Contributor)

Created2017-10-19

The Importance of Selection in the Evolution of Blindness in Cavefish

Description

Background: Blindness has evolved repeatedly in cave-dwelling organisms, and many hypotheses have been proposed to explain this observation, including both accumulation of neutral loss-of-function mutations and adaptation to darkness. Investigating the loss of sight in cave dwellers presents an opportunity to understand the operation of fundamental evolutionary processes, including drift, selection,…

Background: Blindness has evolved repeatedly in cave-dwelling organisms, and many hypotheses have been proposed to explain this observation, including both accumulation of neutral loss-of-function mutations and adaptation to darkness. Investigating the loss of sight in cave dwellers presents an opportunity to understand the operation of fundamental evolutionary processes, including drift, selection, mutation, and migration.

Results: Here we model the evolution of blindness in caves. This model captures the interaction of three forces: (1) selection favoring alleles causing blindness, (2) immigration of sightedness alleles from a surface population, and (3) mutations creating blindness alleles. We investigated the dynamics of this model and determined selection-strength thresholds that result in blindness evolving in caves despite immigration of sightedness alleles from the surface. We estimate that the selection coefficient for blindness would need to be at least 0.005 (and maybe as high as 0.5) for blindness to evolve in the model cave-organism, Astyanax mexicanus.

Conclusions: Our results indicate that strong selection is required for the evolution of blindness in cave-dwelling organisms, which is consistent with recent work suggesting a high metabolic cost of eye development.

ContributorsCartwright, Reed (Author) / Schwartz, Rachel (Author) / Merry, Alexandra (Author) / Howell, Megan (Author) / Biodesign Institute (Contributor)

Created2017-02-07

Cepip: Context-Dependent Epigenomic Weighting for Prioritization of Regulatory Variants and Disease-Associated Genes

Description

It remains challenging to predict regulatory variants in particular tissues or cell types due to highly context-specific gene regulation. By connecting large-scale epigenomic profiles to expression quantitative trait loci (eQTLs) in a wide range of human tissues/cell types, we identify critical chromatin features that predict variant regulatory potential. We present…

It remains challenging to predict regulatory variants in particular tissues or cell types due to highly context-specific gene regulation. By connecting large-scale epigenomic profiles to expression quantitative trait loci (eQTLs) in a wide range of human tissues/cell types, we identify critical chromatin features that predict variant regulatory potential. We present cepip, a joint likelihood framework, for estimating a variant’s regulatory probability in a context-dependent manner. Our method exhibits significant GWAS signal enrichment and is superior to existing cell type-specific methods. Furthermore, using phenotypically relevant epigenomes to weight the GWAS single-nucleotide polymorphisms, we improve the statistical power of the gene-based association test.

ContributorsLi, Mulin Jun (Author) / Li, Miaoxin (Author) / Liu, Zipeng (Author) / Yan, Bin (Author) / Pan, Zhicheng (Author) / Huang, Dandan (Author) / Liang, Qian (Author) / Ying, Dingge (Author) / Xu, Feng (Author) / Yao, Hongcheng (Author) / Wang, Panwen (Author) / Kocher, Jean-Pierre A. (Author) / Xia, Zhengyuan (Author) / Sham, Pak Chung (Author) / Liu, Jun S. (Author) / Wang, Junwen (Author) / College of Health Solutions (Contributor)

Created2017-03-16

The Effect of the Dispersal Kernel on Isolation-By-Distance in a Continuous Population

Description

Under models of isolation-by-distance, population structure is determined by the probability of identity-by-descent between pairs of genes according to the geographic distance between them. Well established analytical results indicate that the relationship between geographical and genetic distance depends mostly on the neighborhood size of the population which represents a standardized…

Under models of isolation-by-distance, population structure is determined by the probability of identity-by-descent between pairs of genes according to the geographic distance between them. Well established analytical results indicate that the relationship between geographical and genetic distance depends mostly on the neighborhood size of the population which represents a standardized measure of gene flow. To test this prediction, we model local dispersal of haploid individuals on a two-dimensional landscape using seven dispersal kernels: Rayleigh, exponential, half-normal, triangular, gamma, Lomax and Pareto. When neighborhood size is held constant, the distributions produce similar patterns of isolation-by-distance, confirming predictions. Considering this, we propose that the triangular distribution is the appropriate null distribution for isolation-by-distance studies. Under the triangular distribution, dispersal is uniform over the neighborhood area which suggests that the common description of neighborhood size as a measure of an effective, local panmictic population is valid for popular families of dispersal distributions. We further show how to draw random variables from the triangular distribution efficiently and argue that it should be utilized in other studies in which computational efficiency is important.

ContributorsFurstenau, Tara (Author) / Cartwright, Reed (Author) / College of Liberal Arts and Sciences (Contributor)

Created2016-03-29

Data-Based Reconstruction of Complex Geospatial Networks, Nodal Positioning, and Detection of Hidden Nodes

Description

Given a complex geospatial network with nodes distributed in a two-dimensional region of physical space, can the locations of the nodes be determined and their connection patterns be uncovered based solely on data? We consider the realistic situation where time series/signals can be collected from a single location. A key…

Given a complex geospatial network with nodes distributed in a two-dimensional region of physical space, can the locations of the nodes be determined and their connection patterns be uncovered based solely on data? We consider the realistic situation where time series/signals can be collected from a single location. A key challenge is that the signals collected are necessarily time delayed, due to the varying physical distances from the nodes to the data collection centre. To meet this challenge, we develop a compressive-sensing-based approach enabling reconstruction of the full topology of the underlying geospatial network and more importantly, accurate estimate of the time delays. A standard triangularization algorithm can then be employed to find the physical locations of the nodes in the network. We further demonstrate successful detection of a hidden node (or a hidden source or threat), from which no signal can be obtained, through accurate detection of all its neighbouring nodes. As a geospatial network has the feature that a node tends to connect with geophysically nearby nodes, the localized region that contains the hidden node can be identified.

ContributorsSu, Riqi (Author) / Wang, Wen-Xu (Author) / Wang, Xiao (Author) / Lai, Ying-Cheng (Author) / Ira A. Fulton Schools of Engineering (Contributor)

Created2016-01-06

ASU Scholarship Showcase

Filtering by

Linnorm: Improved Statistical Analysis for Single Cell RNA-seq Expression Data

Scaling Behaviours in the Growth of Networked Systems and Their Geometric Origins

Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium Vivax From Colombia

Activation of E-Prostanoid 3 Receptor in Macrophages Facilitates Cardiac Healing After Myocardial Infarction

The Impact of Self-Incompatibility Systems on the Prevention of Biparental Inbreeding

An Integrative Method to Decode Regulatory Logics in Gene Transcription

The Importance of Selection in the Evolution of Blindness in Cavefish

Cepip: Context-Dependent Epigenomic Weighting for Prioritization of Regulatory Variants and Disease-Associated Genes

The Effect of the Dispersal Kernel on Isolation-By-Distance in a Continuous Population

Data-Based Reconstruction of Complex Geospatial Networks, Nodal Positioning, and Detection of Hidden Nodes