Exploring Genetic Associations With ceRNA Regulation in the Human Genome

Description

Competing endogenous RNAs (ceRNAs) are RNA molecules that sequester shared microRNAs (miRNAs), thereby affecting the expression of other targets of those miRNAs. Whether genetic variants in a ceRNA can affect its biological function and disease development is still an open question. Here we identified a large number of genetic variants that are associated with ceRNA function using Geuvadis RNA-seq data for 462 individuals from the 1000 Genomes Project. We call these loci competing endogenous RNA expression quantitative trait loci, or ‘cerQTL’, and found that a large number of them were unexplored in conventional eQTL mapping. We identified many cerQTLs that have undergone recent positive selection in different human populations, and showed that single nucleotide polymorphisms in gene 3′ UTRs at the miRNA seed binding regions can simultaneously regulate gene expression changes in both cis and trans by the ceRNA mechanism. We also discovered that cerQTLs are significantly enriched in trait- and disease-associated variants reported from genome-wide association studies in the miRNA binding sites, suggesting that disease susceptibilities could be attributed to ceRNA regulation. Further in vitro functional experiments demonstrated that a cerQTL, rs11540855, can regulate ceRNA function. These results provide a comprehensive catalog of functional non-coding regulatory variants that may be responsible for ceRNA crosstalk at the post-transcriptional level.

Contributors: Li, Mulin Jun (Author) / Zhang, Jian (Author) / Liang, Qian (Author) / Xuan, Chenghao (Author) / Wu, Jiexing (Author) / Jiang, Peng (Author) / Li, Wei (Author) / Zhu, Yun (Author) / Wang, Panwen (Author) / Fernandez, Daniel (Author) / Shen, Yujun (Author) / Chen, Yiwen (Author) / Kocher, Jean-Pierre A. (Author) / Yu, Ying (Author) / Sham, Pak Chung (Author) / Wang, Junwen (Author) / Liu, Jun S. (Author) / Liu, X. Shirley (Author) / College of Health Solutions (Contributor)

Created: 2017-05-02

Evolution of Drug-Resistant Acinetobacter Baumannii After DCD Renal Transplantation

Description

Infection after renal transplantation remains a major cause of morbidity and death, especially infection with the extensively drug-resistant bacterium A. baumannii. A total of fourteen A. baumannii isolates were obtained from the donors’ preserved fluid from DCD (donation after cardiac death) renal transplantation and four isolates from the recipients’ draining liquid at the Kidney Disease Center, The First Affiliated Hospital, College of Medicine, Zhejiang University, from March 2013 to November 2014. An outbreak of A. baumannii emerging after DCD renal transplantation was tracked to understand the transmission of the pathogen. PFGE displayed similar DNA patterns between isolates from the same hospital. Susceptibility to thirteen antimicrobial agents was determined using the K-B diffusion method and eTest. Whole-genome sequencing was applied to investigate the genetic relationship of the isolates. With the clinical data and research results, we concluded that the A. baumannii isolates 3R1 and 3R2 were probably transmitted from the donor, who acquired the bacteria during his stay in the ICU, while isolate 4R1 was transmitted from 3R1 and 3R2 via medical manipulation. This study demonstrated the value of integrating clinical profiles with molecular methods in outbreak investigation and their importance in controlling infection and preventing serious complications after DCD transplantation.

Contributors: Jiang, Hong (Author) / Cao, Luxi (Author) / Qu, Lihui (Author) / Qu, Tingting (Author) / Liu, Guangjun (Author) / Wang, Rending (Author) / Li, Bingjue (Author) / Wang, Yuchen (Author) / Ying, Chaoqun (Author) / Chen, Miao (Author) / Lu, Yingying (Author) / Feng, Shi (Author) / Xiao, Yonghong (Author) / Wang, Junwen (Author) / Wu, Jianyong (Author) / Chen, Jianghua (Author) / College of Health Solutions (Contributor)

Created: 2017-05-16

Robust and Rapid Algorithms Facilitate Large-Scale Whole Genome Sequencing Downstream Analysis in an Integrative Framework

Description

Whole genome sequencing (WGS) is a promising strategy to unravel variants or genes responsible for human diseases and traits. However, there is a lack of robust platforms for comprehensive downstream analysis. In the present study, we first proposed three novel algorithms (sequence gap-filled gene feature annotation, bit-block encoded genotypes, and sectional fast access to text lines) to address three fundamental problems. The three algorithms then formed the infrastructure of a robust parallel computing framework, KGGSeq, for integrating downstream analysis functions for whole genome sequencing data. KGGSeq has been equipped with a comprehensive set of analysis functions for quality control, filtration, annotation, pathogenic prediction and statistical tests. In tests with whole genome sequencing data from the 1000 Genomes Project, KGGSeq annotated several thousand more reliable non-synonymous variants than other widely used tools (e.g. ANNOVAR and SnpEff). It took only around half an hour on a small server with 10 CPUs to access genotypes of ∼60 million variants of 2504 subjects, while a popular alternative tool required around one day. KGGSeq's bit-block genotype format used 1.5% or less space to flexibly represent phased or unphased genotypes with multiple alleles, and calculated genotypic correlation over 1000 times faster.
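The idea of storing genotypes in bits rather than text can be illustrated with a generic 2-bit packing scheme. This is only a sketch of the general technique, not KGGSeq's actual bit-block format, which the description does not specify; the function names and the encoding (0, 1, 2 alternate-allele copies, 3 for missing, unphased biallelic sites only) are assumptions made for illustration.

```python
def pack_genotypes(genotypes):
    """Pack genotype codes (0, 1, 2, or 3 for missing) into 2 bits each.

    Hypothetical encoding for illustration; four genotypes fit in one byte.
    """
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError(f"genotype code out of range: {g}")
        # Place the 2-bit code at its slot within the byte.
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotype(packed, i):
    """Return the 2-bit genotype code stored at index i."""
    return (packed[i // 4] >> (2 * (i % 4))) & 0b11


codes = [0, 1, 2, 3, 2]
blob = pack_genotypes(codes)
restored = [unpack_genotype(blob, i) for i in range(len(codes))]
```

Five genotypes occupy two bytes here, compared with at least five bytes as single text characters, which gives a rough sense of why bit-level encodings save space.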

Contributors: Li, Miaoxin (Author) / Li, Jiang (Author) / Li, Mulin Jun (Author) / Pan, Zhicheng (Author) / Hsu, Jacob Shujui (Author) / Liu, Dajiang J. (Author) / Zhan, Xiaowei (Author) / Wang, Junwen (Author) / Song, Youqiang (Author) / Sham, Pak Chung (Author) / College of Health Solutions (Contributor)

Created: 2017-01-23

An Integrative Method to Decode Regulatory Logics in Gene Transcription

Description

Modeling of transcriptional regulatory networks (TRNs) has been increasingly used to dissect the nature of gene regulation. Inference of regulatory relationships among transcription factors (TFs) and genes, especially among multiple TFs, is still challenging. In this study, we introduced an integrative method, LogicTRN, to decode TF–TF interactions that form TF logics in regulating target genes. By combining cis-regulatory logics and transcriptional kinetics into a single model framework, LogicTRN can naturally integrate dynamic gene expression data and TF-DNA-binding signals to identify the TF logics and reconstruct the underlying TRNs. We evaluated the newly developed methodology using simulation, comparison and application studies; the results are consistent with existing knowledge and demonstrate the method's ability to accurately reconstruct TRNs in complex biological systems.

Contributors: Yan, Bin (Author) / Guan, Daogang (Author) / Wang, Chao (Author) / Wang, Junwen (Author) / He, Bing (Author) / Qin, Jing (Author) / Boheler, Kenneth R. (Author) / Lu, Aiping (Author) / Zhang, Ge (Author) / Zhu, Hailong (Author) / College of Health Solutions (Contributor)

Created: 2017-10-19

Entropy is a Simple Measure of the Antibody Profile and is an Indicator of Health Status: A Proof of Concept

Description

We have previously shown that the diversity of antibodies in an individual can be displayed on chips on which 130,000 peptides chosen from random sequence space have been synthesized. This immunosignature technology is unbiased in displaying antibody diversity relative to natural sequence space, and has been shown to have diagnostic and prognostic potential for a wide variety of diseases and vaccines. Here we show that a global measure such as Shannon’s entropy can be calculated for each immunosignature. The immune entropy was measured across a diverse set of 800 people and in 5 individuals over 3 months. The immune entropy is affected by some population characteristics and varies widely across individuals. We find that people with infections or breast cancer generally have higher entropy values than non-diseased individuals. We propose that the immune entropy as measured from immunosignatures may be a simple method to monitor health in individuals and populations.
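As a rough illustration of how a single entropy value can summarize an array of peptide signals, the sketch below computes Shannon's entropy, H = -Σ p_i log2 p_i, from a list of non-negative intensities. It assumes the intensities are normalized to sum to one before the entropy is taken; the description does not state how the authors preprocess or normalize the array data, so that step and the function name are assumptions.

```python
import math


def shannon_entropy(intensities):
    """Shannon entropy (in bits) of non-negative intensities.

    Assumption for illustration: intensities are converted to proportions
    by dividing by their sum; zero entries contribute nothing.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must have a positive sum")
    entropy = 0.0
    for value in intensities:
        if value > 0:
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


flat_profile = shannon_entropy([1, 1, 1, 1])    # signal spread evenly
peaked_profile = shannon_entropy([5, 0, 0, 0])  # signal on one peptide
```

A profile spread evenly over many peptides gives the maximum entropy (log2 of the number of peptides), while one dominated by a few peptides gives a value near zero.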

Contributors: Wang, Lu (Author) / Whittemore, K. (Author) / Johnston, Stephen (Author) / Stafford, Phillip (Author) / Biodesign Institute (Contributor)

Created: 2017-12-22

Peptide Sequencing Directly on Solid Surfaces Using MALDI Mass Spectrometry

Description

There is an increasing variety of applications in which peptides are both synthesized and used while attached to solid surfaces. This has created a need for high-throughput sequence analysis directly on surfaces. However, common sequencing approaches that can be adapted to surface-bound peptides lack the throughput often needed in library-based applications. Here we describe a simple approach for sequence analysis directly on solid surfaces that is both high speed and high throughput, utilizing equipment available in most protein analysis facilities. In this approach, surface-bound peptides, selectively labeled at their N-termini with a positive charge-bearing group, are subjected to controlled degradation in ammonia gas, resulting in a set of fragments that differ by a single amino acid and remain spatially confined on the surface to which they were bound. These fragments can then be analyzed by MALDI mass spectrometry, and the peptide sequences read directly from the resulting spectra.
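Reading a sequence from a ladder of fragments that differ by one amino acid amounts to matching the mass differences between neighboring peaks to residue masses. The following is a minimal sketch of that general idea, not the authors' analysis procedure: the function name, the tolerance, and the deliberately small residue table (standard monoisotopic residue masses for six amino acids) are choices made for illustration.

```python
# Monoisotopic residue masses in daltons; a deliberately partial table.
RESIDUE_MASSES = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "F": 147.06841,
}


def read_ladder(fragment_masses, tolerance=0.01):
    """Translate gaps between successive fragment masses into residues.

    Residues are reported from the lightest fragment upward; a gap that
    matches no residue, or more than one, is reported as '?'.
    """
    ordered = sorted(fragment_masses)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = [aa for aa, mass in RESIDUE_MASSES.items()
                   if abs(delta - mass) <= tolerance]
        residues.append(matches[0] if len(matches) == 1 else "?")
    return "".join(residues)


# Hypothetical ladder: an arbitrary 500.0 Da base plus G, A, then F.
ladder = [500.0, 557.02146, 628.05857, 775.12698]
sequence = read_ladder(ladder)
```

Whether the string reads N-to-C or C-to-N depends on which terminus the fragments retain, so the sketch leaves the orientation to the caller. Leucine and isoleucine have identical residue masses and cannot be separated by this kind of mass difference alone.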

Contributors: Zhao, Zhan-Gong (Author) / Cordovez, Lalaine Anne (Author) / Johnston, Stephen (Author) / Woodbury, Neal (Author) / Biodesign Institute (Contributor)

Created: 2017-12-19

Evidence-Based Transit and Land Use Sketch Planning Using Interactive Accessibility Methods on Combined Schedule and Headway-Based Networks

Description

There is a need for indicators of transportation-land use system quality that are understandable to a wide range of stakeholders, and which can provide immediate feedback on the quality of interactively designed scenarios. Location-based accessibility indicators are promising candidates, but indicator values can vary strongly depending on time of day and transfer wait times. Capturing this variation increases complexity, slowing down calculations. We present new methods for rapid yet rigorous computation of accessibility metrics, allowing immediate feedback during early-stage transit planning, while being rigorous enough for final analyses. Our approach is statistical, characterizing the uncertainty and variability in accessibility metrics due to differences in departure time and headway-based scenario specification. The analysis is carried out on a detailed multi-modal network model including both public transportation and streets. Land use data are represented at high resolution. These methods have been implemented as open-source software running on commodity cloud infrastructure. Networks are constructed from standard open data sources, and scenarios are built in a map-based web interface. We conclude with a case study, describing how these methods were applied in a long-term transportation planning process for metropolitan Amsterdam.

Contributors: Conway, Matthew Wigginton (Author) / Byrd, Andrew (Author) / van der Linden, Marco (Author)

Created: 2017

Accounting for Uncertainty and Variation in Accessibility Metrics for Public Transport Sketch Planning

Description

Accessibility is increasingly used as a metric when evaluating changes to public transport systems. Transit travel times contain variation depending on when one departs relative to when a transit vehicle arrives, and how well transfers are coordinated given a particular timetable. In addition, there is necessarily uncertainty in the value of the accessibility metric during sketch planning processes, due to scenarios which are underspecified because detailed schedule information is not yet available. This article presents a method to extend the concept of "reliable" accessibility to transit to address the first issue, and create confidence intervals and hypothesis tests to address the second.
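The dependence of accessibility on departure time can be made concrete with a small cumulative-opportunities calculation repeated over several sampled departure times. This is a sketch under stated assumptions rather than the article's method: the function name, the 60-minute cutoff, and the toy travel times are hypothetical, and the article's definition of "reliable" accessibility and its confidence-interval construction are not reproduced here.

```python
from statistics import median


def accessibility_by_departure(travel_times, opportunities, cutoff_minutes):
    """Opportunities reachable within the cutoff, per sampled departure.

    travel_times[d][j] is the travel time in minutes from one origin to
    destination j when departing at sample d (hypothetical input layout).
    """
    return [
        sum(count for minutes, count in zip(row, opportunities)
            if minutes <= cutoff_minutes)
        for row in travel_times
    ]


# Three sampled departures, two destinations with 100 and 300 jobs.
times = [[10, 50], [20, 70], [10, 55]]
jobs = [100, 300]
per_departure = accessibility_by_departure(times, jobs, cutoff_minutes=60)
summary = (min(per_departure), median(per_departure), max(per_departure))
```

Summarizing the per-departure values by their minimum, median, and maximum shows how a single-departure calculation could overstate or understate what a traveler experiences across the time window.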

Contributors: Conway, Matthew Wigginton (Author) / Byrd, Andrew (Author) / van Eggermond, Michael (Author)

Created: 2018-07-23

A Simple Platform for the Rapid Development of Antimicrobials

Description

Recent infectious outbreaks highlight the need for platform technologies that can be quickly deployed to develop therapeutics needed to contain the outbreak. We present a simple concept for rapid development of new antimicrobials. The goal was to produce in as little as one week thousands of doses of an intervention for a new pathogen. We tested the feasibility of a system based on antimicrobial synbodies. The system involves creating an array of 100 peptides that have been selected for broad capability to bind and/or kill viruses and bacteria. The peptides are pre-screened for low cell toxicity prior to large scale synthesis. Any pathogen is then assayed on the chip to find peptides that bind or kill it. Peptides are combined in pairs as synbodies and further screened for activity and toxicity. The lead synbody can be quickly produced in large scale, with completion of the entire process in one week.

Contributors: Johnston, Stephen (Author) / Domenyuk, Valeriy (Author) / Gupta, Nidhi (Author) / Tavares Batista, Milene (Author) / Lainson, John (Author) / Zhao, Zhan-Gong (Author) / Lusk, Joel (Author) / Loskutov, Andrey (Author) / Cichacz, Zbigniew (Author) / Stafford, Phillip (Author) / Legutki, Joseph Barten (Author) / Diehnelt, Chris (Author) / Biodesign Institute (Contributor)

Created: 2017-12-14

Linnorm: Improved Statistical Analysis for Single Cell RNA-seq Expression Data

Description

Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm was developed to remove technical noise while preserving biological variation in scRNA-seq data, so that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy.

Contributors: Yip, Shun H. (Author) / Wang, Panwen (Author) / Kocher, Jean-Pierre A. (Author) / Sham, Pak Chung (Author) / Wang, Junwen (Author) / College of Health Solutions (Contributor)

Created: 2017-09-18