Matching Items (8)
Description

Cancer is sometimes depicted as a reversion to single cell behavior in cells adapted to live in a multicellular assembly. If this is the case, one would expect that mutation in cancer disrupts functional mechanisms that suppress cell-level traits detrimental to multicellularity. Such mechanisms should have evolved with or after the emergence of multicellularity. This leads to two related, but distinct hypotheses: 1) Somatic mutations in cancer will occur in genes that are younger than the emergence of multicellularity (1000 million years [MY]); and 2) genes that are frequently mutated in cancer and whose mutations are functionally important for the emergence of the cancer phenotype evolved within the past 1000 million years, and thus would exhibit an age distribution that is skewed to younger genes. In order to investigate these hypotheses we estimated the evolutionary ages of all human genes and then studied the probability of mutation and their biological function in relation to their age and genomic location for both normal germline and cancer contexts.

We observed that under a model of uniform random mutation across the genome, controlled for gene size, genes younger than 500 MY were more frequently mutated in both cases. Paradoxically, causal genes, defined in the COSMIC Cancer Gene Census, were depleted in this age group. When we used functional enrichment analysis to explain this unexpected result we discovered that COSMIC genes with recessive disease phenotypes were enriched for DNA repair and cell cycle control. The non-mutated genes in these pathways are orthologous to those underlying stress-induced mutation in bacteria, which results in the clustering of single nucleotide variations. COSMIC genes were less common in regions where the probability of observing mutational clusters is high, although they are approximately 2-fold more likely to harbor mutational clusters compared to other human genes. Our results suggest this ancient mutational response to stress that evolved among prokaryotes was co-opted to maintain diversity in the germline and immune system, while the original phenotype is restored in cancer. Reversion to a stress-induced mutational response is a hallmark of cancer that allows for effectively searching “protected” genome space where genes causally implicated in cancer are located and underlies the high adaptive potential and concomitant therapeutic resistance that is characteristic of cancer.

Created: 2017-04-25
Description
Gene expression patterns assayed across development can offer key clues about a gene’s function and regulatory role. Drosophila melanogaster is ideal for such investigations as multiple individual and high-throughput efforts have captured the spatiotemporal patterns of thousands of embryonic expressed genes in the form of in situ images. FlyExpress (www.flyexpress.net), a knowledgebase based on a massive and unique digital library of standardized images and a simple search engine to find coexpressed genes, was created to facilitate the analytical and visual mining of these patterns. Here, we introduce the next generation of FlyExpress resources to facilitate the integrative analysis of sequence data and spatiotemporal patterns of expression from images. FlyExpress 7 now includes over 100,000 standardized in situ images and implements a more efficient, user-defined search algorithm to identify coexpressed genes via Genomewide Expression Maps (GEMs). Shared motifs found in the upstream 5′ regions of any pair of coexpressed genes can be visualized in an interactive dotplot. Additional webtools and link-outs to assist in the downstream validation of candidate motifs are also provided. Together, FlyExpress 7 represents our largest effort yet to accelerate discovery via the development and dispersal of new webtools that allow researchers to perform data-driven analyses of coexpression (image) and genomic (sequence) data.
Contributors: Kumar, Sudhir (Author) / Konikoff, Charlotte (Author) / Sanderford, Maxwell (Author) / Liu, Li (Author) / Newfeld, Stuart (Author) / Ye, Jieping (Author) / Kulathinal, Rob J. (Author) / College of Health Solutions (Contributor) / Department of Biomedical Informatics (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)
Created: 2017-06-30
Description

Vesta is a unique, intermediate class of rocky body in the Solar System, between terrestrial planets and small asteroids, because of its size (average radius of ∼263 km) and differentiation, with a crust, mantle and core. Vesta’s low surface gravity (0.25 m/s²) has led to the continual absence of a protective atmosphere and consequently impact cratering and impact-related processes are prevalent. Previous work has shown that the formation of the Rheasilvia impact basin induced the equatorial Divalia Fossae, whereas the formation of the Veneneia impact basin induced the northern Saturnalia Fossae. Expanding upon this earlier work, we conducted photogeologic mapping of the Saturnalia Fossae, adjacent structures and geomorphic units in two of Vesta’s northern quadrangles: Caparronia and Domitia. Our work indicates that impact processes created and/or modified all mapped structures and geomorphic units. The mapped units, ordered from oldest to youngest age based mainly on cross-cutting relationships, are: (1) Vestalia Terra unit, (2) cratered highlands unit, (3) Saturnalia Fossae trough unit, (4) Saturnalia Fossae cratered unit, (5) undifferentiated ejecta unit, (6) dark lobate unit, (7) dark crater ray unit and (8) lobate crater unit. The Saturnalia Fossae consist of five separate structures: Saturnalia Fossa A is the largest (maximum width of ∼43 km) and is interpreted as a graben, whereas Saturnalia Fossa B-E are smaller (maximum width of ∼15 km) and are interpreted as half grabens formed by synthetic faults. Smaller, second-order structures (maximum width of <1 km) are distinguished from the Saturnalia Fossae, a first-order structure, by the use of the general descriptive term ‘adjacent structures’, which encompasses minor ridges, grooves and crater chains. For classification purposes, the general descriptive term ‘minor ridges’ characterizes ridges that are not part of the Saturnalia Fossae and are an order of magnitude smaller (maximum width of <1 km vs. maximum width of ∼43 km). Shear deformation resulting from the large-scale (diameter of >100 km) Rheasilvia impact is proposed to form minor ridges (∼2 km to ∼25 km in length), which are interpreted as the surface expression of thrust faults, as well as grooves (∼3 km to ∼25 km in length) and pit crater chains (∼1 km to ∼25 km in length), which are interpreted as the surface expression of extension fractures and/or dilational normal faults. Secondary crater material, ejected from small-scale and medium-scale impacts (diameters of <100 km), is interpreted to form ejecta ray systems of grooves and crater chains by bouncing and scouring across the surface. Furthermore, seismic shaking, also resulting from small-scale and medium-scale impacts, is interpreted to form minor ridges because seismic shaking induces flow of regolith, which subsequently accumulates as minor ridges that are roughly parallel to the regional slope. In this work we expand upon the link between impact processes and structural features on Vesta by presenting findings of a photogeologic, structural mapping study which highlights how impact cratering and impact-related processes are expressed on this unique, intermediate Solar System body.

Contributors: Scully, Jennifer E. C. (Author) / Yin, A. (Author) / Russell, C. T. (Author) / Buczkowski, D. L. (Author) / Williams, David (Author) / Blewett, D. T. (Author) / Ruesch, O. (Author) / Hiesinger, H. (Author) / Le Corre, L. (Author) / Mercer, Cameron (Author) / Yingst, R. A. (Author) / Garry, W. B. (Author) / Jaumann, R. (Author) / Roatsch, T. (Author) / Preusker, F. (Author) / Gaskell, R.W. (Author) / Schroder, S.E. (Author) / Ammannito, E. (Author) / Pieters, C. M. (Author) / Raymond, C. A. (Author) / DREAM 9 AML-OPC Consortium (Contributor)
Created: 2014-01-29
Description

In cognitive science, the rational analysis framework allows modelling of how physical and social environments impose information-processing demands onto cognitive systems. In humans, for example, past social contact among individuals predicts their future contact with linear and power functions. These features of the human environment constrain the optimal way to remember information and probably shape how memory records are retained and retrieved. We offer a primer on how biologists can apply rational analysis to study animal behaviour. Using chimpanzees (Pan troglodytes) as a case study, we modelled 19 years of observational data on their social contact patterns. Much like humans, the frequency of past encounters in chimpanzees linearly predicted future encounters, and the recency of past encounters predicted future encounters with a power function. Consistent with the rational analyses carried out for human memory, these findings suggest that chimpanzee memory performance should reflect those environmental regularities. In re-analysing existing chimpanzee memory data, we found that chimpanzee memory patterns mirrored their social contact patterns. Our findings hint that human and chimpanzee memory systems may have evolved to solve similar information-processing problems. Overall, rational analysis offers novel theoretical and methodological avenues for the comparative study of cognition.
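The power-function relationship between recency and future encounters described above can be illustrated with a short sketch. It fits y = a * x**b by ordinary least squares on log-transformed values. The function name and the data are hypothetical and purely illustrative; they are not the study's data or its fitting procedure.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by linear least squares on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


# Synthetic illustration only: days since the last encounter versus the
# probability of meeting again, generated from a known power function.
days = [1, 2, 4, 8, 16, 32]
prob = [0.5 * d ** -0.7 for d in days]

a, b = fit_power_law(days, prob)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A negative exponent b corresponds to the probability of a future encounter falling off as the last encounter recedes into the past; real observational data would also need handling of zeros and noise, which this sketch omits.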

Contributors: Stevens, Jeffrey R. (Author) / Marewski, Julian N. (Author) / Gilby, Ian (Author) / DREAM 9 AML-OPC Consortium (Contributor)
Created: 2016-08-03
Description

Objectives: Prediabetes is a major epidemic and is associated with adverse cardio-cerebrovascular outcomes. Early identification of patients who will develop rapid progression of atherosclerosis could be beneficial for improved risk stratification. In this paper, we investigate important factors impacting the prediction, using several machine learning methods, of rapid progression of carotid intima-media thickness in impaired glucose tolerance (IGT) participants.

Methods: In the Actos Now for Prevention of Diabetes (ACT NOW) study, 382 participants with IGT underwent carotid intima-media thickness (CIMT) ultrasound evaluation at baseline and at 15–18 months, and were divided into rapid progressors (RP, n = 39, 58 ± 17.5 μm change) and non-rapid progressors (NRP, n = 343, 5.8 ± 20 μm change, p < 0.001 versus RP). To deal with complex multi-modal data consisting of demographic, clinical, and laboratory variables, we propose a general data-driven framework to investigate the ACT NOW dataset. In particular, we first employed a Fisher Score-based feature selection method to identify the most effective variables and then proposed a probabilistic Bayes-based learning method for the prediction. Comparison of the methods and factors was conducted using area under the receiver operating characteristic curve (AUC) analyses and Brier score.

Results: The experimental results show that the proposed learning methods performed well in identifying or predicting RP. Among the methods, the performance of Naïve Bayes was the best (AUC 0.797, Brier score 0.085) compared to multilayer perceptron (0.729, 0.086) and random forest (0.642, 0.10). The results also show that feature selection has a significant positive impact on the data prediction performance.

Conclusions: By dealing with multi-modal data, the proposed learning methods show effectiveness in predicting prediabetics at risk for rapid atherosclerosis progression. The proposed framework demonstrated utility in outcome prediction in a typical multidimensional clinical dataset with a relatively small number of subjects, extending the potential utility of machine learning approaches beyond extremely large-scale datasets.
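As a rough illustration of two quantities named in the Methods above, the following sketch computes a per-feature Fisher score and a Brier score using only the Python standard library. It assumes one common formulation of the Fisher score (between-class scatter divided by within-class scatter); the abstract does not specify the exact variant used, and the function names and toy values here are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_score(values, labels):
    """Between-class scatter over within-class scatter for a single feature."""
    overall_mean = fmean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        between += len(group) * (fmean(group) - overall_mean) ** 2
        within += len(group) * pvariance(group)
    # A feature with no within-class spread separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return fmean((p - o) ** 2 for p, o in zip(probs, outcomes))


# Toy values for illustration only (not from the ACT NOW dataset).
labels = [1, 1, 0, 0]                    # 1 = rapid progressor, 0 = non-rapid
informative = [5.0, 6.0, 1.0, 2.0]       # class means far apart
uninformative = [1.0, 5.0, 2.0, 6.0]     # class means nearly equal

print(fisher_score(informative, labels))
print(fisher_score(uninformative, labels))
print(brier_score([0.9, 0.8, 0.2, 0.1], labels))
```

Features with higher Fisher scores would be ranked first during selection, and a lower Brier score indicates better-calibrated probability predictions.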

Contributors: Hu, Xia (Author) / Reaven, Peter (Author) / Saremi, Aramesh (Author) / Liu, Ninghao (Author) / Abbasi, Mohammad (Author) / Liu, Huan (Author) / Migrino, Raymond Q. (Author) / DREAM 9 AML-OPC Consortium (Contributor)
Created: 2016-09-05
Description

Childhood apraxia of speech (CAS) is a severe and socially debilitating form of speech sound disorder with suspected genetic involvement, but the genetic etiology is not yet well understood. Very few known or putative causal genes have been identified to date, e.g., FOXP2 and BCL11A. Building a knowledge base of the genetic etiology of CAS will make it possible to identify infants at genetic risk and motivate the development of effective very early intervention programs. We investigated the genetic etiology of CAS in two large multigenerational families with familial CAS. Complementary genomic methods included Markov chain Monte Carlo linkage analysis, copy-number analysis, identity-by-descent sharing, and exome sequencing with variant filtering. No overlaps in regions with positive evidence of linkage between the two families were found. In one family, linkage analysis detected two chromosomal regions of interest, 5p15.1-p14.1, and 17p13.1-q11.1, inherited separately from the two founders. Single-point linkage analysis of selected variants identified CDH18 as a primary gene of interest and additionally, MYO10, NIPBL, GLP2R, NCOR1, FLCN, SMCR8, NEK8, and ANKRD12, possibly with additive effects. Linkage analysis in the second family detected five regions with LOD scores approaching the highest values possible in the family. A gene of interest was C4orf21 (ZGRF1) on 4q25-q28.2. Evidence for previously described causal copy-number variations and validated or suspected genes was not found. Results are consistent with a heterogeneous CAS etiology, as is expected in many neurogenic disorders. Future studies will investigate genome variants in these and other families with CAS.

Contributors: Peter, Beate (Author) / Wijsman, Ellen M. (Author) / Nato, Alejandro Q. (Author) / Matsushita, Mark M. (Author) / Chapman, Kathy L. (Author) / Stanaway, Ian B. (Author) / Wolff, John (Author) / Oda, Kaori (Author) / Gabo, Virginia B. (Author) / Raskind, Wendy H. (Author) / DREAM 9 AML-OPC Consortium (Contributor)
Created: 2016-04-27
Description

Temporal transcriptions of genes are achieved by different mechanisms such as dynamic interaction of activator and repressor proteins with promoters, and accumulation and/or degradation of key regulators as a function of cell cycle. We find that the TorR protein localizes to the old poles of the Escherichia coli cells, forming a functional focus. The TorR focus co-localizes with the nucleoid in a cell-cycle-dependent manner, and consequently regulates transcription of a number of genes. Formation of one TorR focus at the old poles of cells requires interaction with the MreB and DnaK proteins, and ATP, suggesting that TorR delivery requires cytoskeleton organization and ATP. Further, absence of the protein–protein interactions and ATP leads to loss in function of TorR as a transcription factor. We propose a mechanism for timing of cell-cycle-dependent gene transcription, where a transcription factor interacts with its target genes during a specific period of the cell cycle by limiting its own spatial distribution.

Contributors: Yao, Yuan (Author) / Fan, Lifei (Author) / Shi, Yixin (Author) / Odsbu, Ingvild (Author) / DREAM 9 AML-OPC Consortium (Contributor)
Created: 2016-12-23
Description

Acute Myeloid Leukemia (AML) is a fatal hematological cancer. The genetic abnormalities underlying AML are extremely heterogeneous among patients, making prognosis and treatment selection very difficult. While clinical proteomics data has the potential to improve prognosis accuracy, thus far, the quantitative means to do so have yet to be developed. Here we report the results and insights gained from the DREAM 9 Acute Myeloid Leukemia Outcome Prediction Challenge (AML-OPC), a crowdsourcing effort designed to promote the development of quantitative methods for AML prognosis prediction. We identify the most accurate and robust models in predicting patient response to therapy, remission duration, and overall survival. We further investigate patient response to therapy, a clinically actionable prediction, and find that patients who are classified as resistant to therapy are harder to predict than responsive patients across the 31 models submitted to the challenge. The top two performing models, which held a high sensitivity to these patients, substantially utilized the proteomics data to make predictions. Using these models, we also identify which signaling proteins were useful in predicting patient therapeutic response.

Contributors: Noren, David P. (Author) / Long, Byron L. (Author) / Norel, Raquel (Author) / Rrhissorrakrai, Kahn (Author) / Hess, Kenneth (Author) / Hu, Chenyue Wendy (Author) / Bisberg, Alex J. (Author) / Schultz, Andre (Author) / Engquist, Erik (Author) / Lin, Xihui (Author) / Chen, Gregory M. (Author) / Xie, Honglei (Author) / Hunter, Geoffrey A. M. (Author) / Boutros, Paul C. (Author) / Stepanov, Oleg (Author) / Norman, Thea (Author) / Friend, Stephen H. (Author) / Stolovitzky, Gustavo (Author) / Qutub, Amina A. (Author) / DREAM 9 AML-OPC Consortium (Author) / College of Health Solutions (Contributor) / Department of Biomedical Informatics (Contributor)
Created: 2016-06-28