Filtering by
- Member of: Programs and Communities
Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results
For simulations SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
Conclusions
SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.
Serial femtosecond crystallography requires reliable and efficient delivery of fresh crystals across the beam of an X-ray free-electron laser over the course of an experiment. We introduce a double-flow focusing nozzle to meet this challenge, with significantly reduced sample consumption, while improving jet stability over previous generations of nozzles. We demonstrate its use to determine the first room-temperature structure of RNA polymerase II at high resolution, revealing new structural details. Moreover, the double flow-focusing nozzles were successfully tested with three other protein samples and the first room temperature structure of an extradiol ring-cleaving dioxygenase was solved by utilizing the improved operation and characteristics of these devices.
X-ray free-electron lasers provide novel opportunities to conduct single particle analysis on nanoscale particles. Coherent diffractive imaging experiments were performed at the Linac Coherent Light Source (LCLS), SLAC National Laboratory, exposing single inorganic core-shell nanoparticles to femtosecond hard-X-ray pulses. Each facetted nanoparticle consisted of a crystalline gold core and a differently shaped palladium shell. Scattered intensities were observed up to about 7 nm resolution. Analysis of the scattering patterns revealed the size distribution of the samples, which is consistent with that obtained from direct real-space imaging by electron microscopy. Scattering patterns resulting from single particles were selected and compiled into a dataset which can be valuable for algorithm developments in single particle scattering research.
Single particle diffractive imaging data from Rice Dwarf Virus (RDV) were recorded using the Coherent X-ray Imaging (CXI) instrument at the Linac Coherent Light Source (LCLS). RDV was chosen as it is a well-characterized model system, useful for proof-of-principle experiments, system optimization and algorithm development. RDV, an icosahedral virus of about 70 nm in diameter, was aerosolized and injected into the approximately 0.1 μm diameter focused hard X-ray beam at the CXI instrument of LCLS. Diffraction patterns from RDV with signal to 5.9 Ångström were recorded. The diffraction data are available through the Coherent X-ray Imaging Data Bank (CXIDB) as a resource for algorithm development, the contents of which are described here.
In the weeks following the first imported case of Ebola in the U. S. on September 29, 2014, coverage of the very limited outbreak dominated the news media, in a manner quite disproportionate to the actual threat to national public health; by the end of October, 2014, there were only four laboratory confirmed cases of Ebola in the entire nation. Public interest in these events was high, as reflected in the millions of Ebola-related Internet searches and tweets performed in the month following the first confirmed case. Use of trending Internet searches and tweets has been proposed in the past for real-time prediction of outbreaks (a field referred to as “digital epidemiology”), but accounting for the biases of public panic has been problematic. In the case of the limited U. S. Ebola outbreak, we know that the Ebola-related searches and tweets originating the U. S. during the outbreak were due only to public interest or panic, providing an unprecedented means to determine how these dynamics affect such data, and how news media may be driving these trends.
Methodology
We examine daily Ebola-related Internet search and Twitter data in the U. S. during the six week period ending Oct 31, 2014. TV news coverage data were obtained from the daily number of Ebola-related news videos appearing on two major news networks. We fit the parameters of a mathematical contagion model to the data to determine if the news coverage was a significant factor in the temporal patterns in Ebola-related Internet and Twitter data.
Conclusions
We find significant evidence of contagion, with each Ebola-related news video inspiring tens of thousands of Ebola-related tweets and Internet searches. Between 65% to 76% of the variance in all samples is described by the news media contagion model.
Seroepidemiological studies before and after the epidemic wave of H1N1-2009 are useful for estimating population attack rates with a potential to validate early estimates of the reproduction number, R, in modeling studies.
Methodology/Principal Findings
Since the final epidemic size, the proportion of individuals in a population who become infected during an epidemic, is not the result of a binomial sampling process because infection events are not independent of each other, we propose the use of an asymptotic distribution of the final size to compute approximate 95% confidence intervals of the observed final size. This allows the comparison of the observed final sizes against predictions based on the modeling study (R = 1.15, 1.40 and 1.90), which also yields simple formulae for determining sample sizes for future seroepidemiological studies. We examine a total of eleven published seroepidemiological studies of H1N1-2009 that took place after observing the peak incidence in a number of countries. Observed seropositive proportions in six studies appear to be smaller than that predicted from R = 1.40; four of the six studies sampled serum less than one month after the reported peak incidence. The comparison of the observed final sizes against R = 1.15 and 1.90 reveals that all eleven studies appear not to be significantly deviating from the prediction with R = 1.15, but final sizes in nine studies indicate overestimation if the value R = 1.90 is used.
Conclusions
Sample sizes of published seroepidemiological studies were too small to assess the validity of model predictions except when R = 1.90 was used. We recommend the use of the proposed approach in determining the sample size of post-epidemic seroepidemiological studies, calculating the 95% confidence interval of observed final size, and conducting relevant hypothesis testing instead of the use of methods that rely on a binomial proportion.