This growing collection consists of scholarly works authored by ASU-affiliated faculty, staff, and community members, and it contains many open access articles. ASU-affiliated authors are encouraged to Share Your Work in KEEP.

Description

Background: The shift from solitary to social behavior is one of the major evolutionary transitions. Primitively eusocial bumblebees are uniquely placed to illuminate the evolution of highly eusocial insect societies. Bumblebees are also invaluable natural and agricultural pollinators, and there is widespread concern over recent population declines in some species. High-quality genomic data will inform key aspects of bumblebee biology, including susceptibility to implicated population viability threats.

Results: We report the high-quality draft genome sequences of Bombus terrestris and Bombus impatiens, two ecologically dominant bumblebees and widely utilized study species. Comparing these new genomes to those of the highly eusocial honeybee Apis mellifera and other Hymenoptera, we identify deeply conserved similarities, as well as novelties key to the biology of these organisms. Some honeybee genome features thought to underpin advanced eusociality are also present in bumblebees, indicating an earlier evolution in the bee lineage. Xenobiotic detoxification and immune genes are similarly depauperate in bumblebees and honeybees, and multiple categories of genes linked to social organization, including development and behavior, show high conservation. Key differences identified include a bias in bumblebee chemoreception toward gustation rather than olfaction, and striking differences in microRNAs, potentially responsible for gene regulation underlying social and other traits.

Conclusions: These two bumblebee genomes provide a foundation for post-genomic research on these key pollinators and insect societies. Overall, gene repertoires suggest that the route to advanced eusociality in bees was mediated by many small changes in many genes and processes, and not by notable expansion or depauperation.

Contributors: Sadd, Ben M. (Author) / Barribeau, Seth M. (Author) / Bloch, Guy (Author) / de Graaf, Dirk C. (Author) / Dearden, Peter (Author) / Elsik, Christine G. (Author) / Gadau, Juergen (Author) / Grimmelikhuijzen, Cornelis J. P. (Author) / Hasselmann, Martin (Author) / Lozier, Jeffrey D. (Author) / Robertson, Hugh M. (Author) / Smagghe, Guy (Author) / Stolle, Eckart (Author) / Van Vaerenbergh, Matthias (Author) / Waterhouse, Robert M. (Author) / Bornberg-Bauer, Erich (Author) / Klasberg, Steffen (Author) / Bennett, Anna K. (Author) / Camara, Francisco (Author) / Guigo, Roderic (Author) / Hoff, Katharina (Author) / Mariotti, Marco (Author) / Munoz-Torres, Monica (Author) / Murphy, Terence (Author) / Santesmasses, Didac (Author) / Amdam, Gro (Author) / Beckers, Matthew (Author) / Beye, Martin (Author) / Biewer, Matthias (Author) / Bitondi, Marcia MG (Author) / Blaxter, Mark L. (Author) / Bourke, Andrew FG (Author) / Brown, Mark JF (Author) / Buechel, Severine D. (Author) / Cameron, Rossanah (Author) / Cappelle, Kaat (Author) / Carolan, James C. (Author) / Christiaens, Olivier (Author) / Ciborowski, Kate L. (Author) / Clarke, David F. (Author) / Colgan, Thomas J. (Author) / Collins, David H. (Author) / Cridge, Andrew G. (Author) / Dalmay, Tamas (Author) / Dreier, Stephanie (Author) / du Plessis, Louis (Author) / Duncan, Elizabeth (Author) / Erler, Silvio (Author) / Evans, Jay (Author) / Falcon, Talgo (Author) / Flores, Kevin (Author) / Freitas, Flavia CP (Author) / Fuchikawa, Taro (Author) / Gempe, Tanja (Author) / Hartfelder, Klaus (Author) / Hauser, Frank (Author) / Helbing, Sophie (Author) / Humann, Fernanda (Author) / Irvine, Frano (Author) / Jermiin, Lars S (Author) / Johnson, Claire E. (Author) / Johnson, Reed M (Author) / Jones, Andrew K. (Author) / Kadowaki, Tatsuhiko (Author) / Kidner, Jonathan H. (Author) / Koch, Vasco (Author) / Kohler, Arian (Author) / Kraus, F. Bernhard (Author) / Lattorff, H. Michael G. (Author) / Leask, Megan (Author) / Lockett, Gabrielle A. (Author) / Mallon, Eamonn B. (Author) / Marco Antonio, David S. (Author) / Marxer, Monika (Author) / Meeus, Ivan (Author) / Moritz, Robin FA (Author) / Nair, Ajay (Author) / Napflin, Kathrin (Author) / Nissen, Inga (Author) / Niu, Jinzhi (Author) / Nunes, Francis MF (Author) / Oakeshott, John G. (Author) / Osborne, Amy (Author) / Otte, Marianne (Author) / Pinheiro, Daniel G. (Author) / Rossie, Nina (Author) / Rueppell, Olav (Author) / Santos, Carolina G (Author) / Schmid-Hempel, Regula (Author) / Schmitt, Bjorn D. (Author) / Schulte, Christina (Author) / Simoes, Zila LP (Author) / Soares, Michelle PM (Author) / Swevers, Luc (Author) / Winnebeck, Eva C. (Author) / Wolschin, Florian (Author) / Yu, Na (Author) / Zdobnov, Evgeny M (Author) / Aqrawi, Peshtewani K (Author) / Blakenburg, Kerstin P (Author) / Coyle, Marcus (Author) / Francisco, Liezl (Author) / Hernandez, Alvaro G. (Author) / Holder, Michael (Author) / Hudson, Matthew E. (Author) / Jackson, LaRonda (Author) / Jayaseelan, Joy (Author) / Joshi, Vandita (Author) / Kovar, Christie (Author) / Lee, Sandra L. (Author) / Mata, Robert (Author) / Mathew, Tittu (Author) / Newsham, Irene F. (Author) / Ngo, Robin (Author) / Okwuonu, Geoffrey (Author) / Pham, Christopher (Author) / Pu, Ling-Ling (Author) / Saada, Nehad (Author) / Santibanez, Jireh (Author) / Simmons, DeNard (Author) / Thornton, Rebecca (Author) / Venkat, Aarti (Author) / Walden, Kimberly KO (Author) / Wu, Yuan-Qing (Author) / Debyser, Griet (Author) / Devreese, Bart (Author) / Asher, Claire (Author) / Blommaert, Julie (Author) / Chipman, Ariel D. (Author) / Chittka, Lars (Author) / Fouks, Bertrand (Author) / Liu, Jisheng (Author) / O'Neill, Meaghan P (Author) / Sumner, Seirian (Author) / Puiu, Daniela (Author) / Qu, Jiaxin (Author) / Salzberg, Steven L (Author) / Scherer, Steven E (Author) / Muzny, Donna M. (Author) / Richards, Stephen (Author) / Robinson, Gene E (Author) / Gibbs, Richard A. (Author) / Schmid-Hempel, Paul (Author) / Worley, Kim C (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2015-04-24
Description

We present a phylogeographic study of at least six reproductively isolated lineages of New World harvester ants within the Pogonomyrmex barbatus and P. rugosus species group. The genetic and geographic relationships within this clade are complex: Four of the identified lineages show genetic caste determination (GCD) and are divided into two pairs. Each pair has evolved under a mutualistic system that necessitates sympatry. These paired lineages are dependent upon one another because their GCD requires interlineage matings for the production of F1 hybrid workers, and intralineage matings are required to produce queens. This GCD system maintains genetic isolation among these interdependent lineages, while simultaneously requiring co-expansion and emigration as their distributions have changed over time. It has also been demonstrated that three of these four GCD lineages have undergone historical hybridization, but the narrower sampling range of previous studies has left open questions about the hybrid parentage, breadth, and age of these groups. Thus, reconstructing the phylogenetic and geographic history of this group allows us to evaluate past insights and hypotheses and to plan future inquiries in a more complete historical biogeographic context. Using mitochondrial DNA sequences sampled across most of the morphospecies’ ranges in the U.S.A. and Mexico, we conducted a detailed phylogeographic study. Remarkably, our results indicate that one of the GCD lineage pairs has experienced a dramatic range expansion, despite the genetic load and fitness costs of the GCD system. Our analyses also reveal a complex pattern of vicariance and dispersal in Pogonomyrmex harvester ants that is largely concordant with models of late Miocene, Pliocene, and Pleistocene range shifts among various arid-adapted taxa in North America.

Contributors: Mott, Brendon (Author) / Gadau, Juergen (Author) / Anderson, Kirk E. (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2015-07-01
Description

The number and variety of connectivity estimation methods is likely to continue to grow over the coming decade. Comparisons between methods are necessary to prune this growth to only the most accurate and robust methods. However, the nature of connectivity is elusive, with different methods potentially attempting to identify different aspects of connectivity. Commonalities of connectivity definitions across methods, upon which to base direct comparisons, can be difficult to derive. Here, we explicitly define “effective connectivity” using a common set of observation and state equations that are appropriate for three connectivity methods: dynamic causal modeling (DCM), multivariate autoregressive modeling (MAR), and switching linear dynamic systems for fMRI (sLDSf). In addition, while deriving this set, we show how many other popular functional and effective connectivity methods are actually simplifications of these equations. We discuss implications of these connections for the practice of using one method to simulate data for another method. After mathematically connecting the three effective connectivity methods, simulated fMRI data with varying numbers of regions and task conditions are generated from the common equation. These simulated data explicitly contain the type of connectivity that the three models were intended to identify. Each method is applied to the simulated data sets and the accuracy of parameter identification is analyzed. All methods perform above chance levels at identifying correct connectivity parameters. The sLDSf method was superior in parameter estimation accuracy to both DCM and MAR for all types of comparisons.
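
For readers unfamiliar with multivariate autoregressive modeling, the sketch below simulates a generic first-order MAR process, x_t = A x_{t-1} + e_t, in which the matrix A plays the role of the connectivity parameters. It is an illustration only and does not reproduce the common observation and state equations derived in the article; the function name `simulate_mar1`, the two-region matrix `A`, and the noise level are all assumptions chosen for the example.

```python
import random


def simulate_mar1(A, n_steps, noise_sd=0.1, seed=0):
    """Simulate a first-order MAR process x_t = A x_{t-1} + e_t.

    A[i][j] is the (hypothetical) influence of region j on region i.
    Returns a list of n_steps state vectors, starting from x_0 = 0.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    n = len(A)
    x = [0.0] * n
    series = []
    for _ in range(n_steps):
        # Each region is a weighted sum of all regions at the previous
        # step plus independent Gaussian noise.
        x = [
            sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, noise_sd)
            for i in range(n)
        ]
        series.append(x)
    return series


# Hypothetical two-region system: region 0 drives region 1, not vice versa.
A = [[0.5, 0.0],
     [0.4, 0.5]]
series = simulate_mar1(A, n_steps=200)
```

Data simulated this way contain a known connectivity matrix, which is what makes it possible to score how accurately an estimation method recovers the parameters.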

Contributors: Smith, Jason F. (Author) / Chen, Kewei (Author) / Pillai, Ajay S. (Author) / Horwitz, Barry (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2013-05-14
Description

A central goal of biology is to uncover the genetic basis for the origin of new phenotypes. A particularly effective approach is to examine the genomic architecture of species that have secondarily lost a phenotype with respect to their close relatives. In the eusocial Hymenoptera, queens and workers have divergent phenotypes that may be produced via either expression of alternative sets of caste-specific genes and pathways or differences in expression patterns of a shared set of multifunctional genes. To distinguish between these two hypotheses, we investigated how secondary loss of the worker phenotype in workerless ant social parasites impacted genome evolution across two independent origins of social parasitism in the ant genera Pogonomyrmex and Vollenhovia. We sequenced the genomes of three social parasites and their most closely related eusocial host species and compared gene losses in social parasites with gene expression differences between host queens and workers. Virtually all annotated genes were expressed to some degree in both castes of the host, with most shifting in queen-worker bias across developmental stages. As a result, despite >1 My of divergence from the last common ancestor that had workers, the social parasites showed strikingly little evidence of gene loss, damaging mutations, or shifts in selection regime resulting from loss of the worker caste. This suggests that regulatory changes within a multifunctional genome, rather than sequence differences, have played a predominant role in the evolution of social parasitism, and perhaps also in the many gains and losses of phenotypes in the social insects.

Contributors: Smith, Chris R. (Author) / Helms Cahan, Sara (Author) / Kemena, Carsten (Author) / Brady, Sean G. (Author) / Yang, Wei (Author) / Bornberg-Bauer, Erich (Author) / Eriksson, Ti (Author) / Gadau, Juergen (Author) / Helmkampf, Martin (Author) / Gotzek, Dietrich (Author) / Okamoto Miyakawa, Misato (Author) / Suarez, Andrew V. (Author) / Mikheyev, Alexander (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2015-07-29
Description

Background: The discovery of genetic associations is an important factor in understanding human illness and deriving disease pathways. Identifying multiple interacting genetic mutations associated with disease remains challenging in studying the etiology of complex diseases. Although new single nucleotide polymorphisms (SNPs) at genes implicated in immune response, cholesterol/lipid metabolism, and cell membrane processes have recently been confirmed by genome-wide association studies (GWAS) to be associated with late-onset Alzheimer's disease (LOAD), a percentage of AD heritability continues to be unexplained. We use data mining methods to search for other genetic variants that may influence LOAD risk.

Methods: Two different approaches were devised to select SNPs associated with LOAD in a publicly available GWAS data set consisting of three cohorts. In both approaches, single-locus analysis (logistic regression) was conducted to filter the data with a less conservative p-value than the Bonferroni threshold; this resulted in a subset of SNPs used next in multi-locus analysis (random forest (RF)). In the second approach, we took into account prior biological knowledge, and performed sample stratification and linkage disequilibrium (LD) in addition to logistic regression analysis to preselect loci to input into the RF classifier construction step.

Results: The first approach gave 199 SNPs mostly associated with genes in calcium signaling, cell adhesion, endocytosis, immune response, and synaptic function. These SNPs together with APOE and GAB2 SNPs formed a predictive subset for LOAD status with an average error of 9.8% using 10-fold cross validation (CV) in RF modeling. Nineteen variants in LD with ST5, TRPC1, ATG10, ANO3, NDUFA12, and NISCH respectively, genes linked directly or indirectly with neurobiology, were identified with the second approach. These variants were part of a model that included APOE and GAB2 SNPs to predict LOAD risk which produced a 10-fold CV average error of 17.5% in the classification modeling.

Conclusions: With the two proposed approaches, we identified a large subset of SNPs in genes mostly clustered around specific pathways/functions and a smaller set of SNPs, within or in proximity to five genes not previously reported, that may be relevant for the prediction/understanding of AD.
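
As a rough illustration of the single-locus filtering step described in the Methods above, the sketch below contrasts a Bonferroni cutoff (the significance level divided by the number of tests) with a less conservative cutoff used only to preselect SNPs for a later multi-locus step. The SNP identifiers, p-values, number of tests, and relaxed cutoff are all hypothetical; the abstract does not report the values the authors used, and the random forest step is not shown.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def prefilter_snps(p_values, cutoff):
    """Return the SNP identifiers whose single-locus p-value is below cutoff."""
    return sorted(snp for snp, p in p_values.items() if p < cutoff)


# Hypothetical single-locus p-values (e.g. from per-SNP logistic regression).
p_values = {"rs_a": 1e-9, "rs_b": 3e-5, "rs_c": 0.2, "rs_d": 8e-4}

# Assumed genome-wide setting: 500,000 tests at alpha = 0.05.
strict = bonferroni_threshold(0.05, 500_000)
# Assumed relaxed cutoff for preselection only.
relaxed = 1e-3

strict_hits = prefilter_snps(p_values, strict)
relaxed_hits = prefilter_snps(p_values, relaxed)
```

With these made-up numbers the Bonferroni cutoff keeps one SNP while the relaxed cutoff keeps three, which shows why a looser filter leaves more candidates for the multi-locus analysis.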

Contributors: Briones, Natalia (Author) / Dinu, Valentin (Author) / College of Health Solutions (Contributor)
Created: 2012-01-25
Description

Background: Glioblastoma is the most aggressive primary central nervous system tumor and carries a very poor prognosis. Invasion precludes effective treatment and virtually assures tumor recurrence. In the current study, we applied analytical and bioinformatics approaches to identify a set of microRNAs (miRs) from several different human glioblastoma cell lines that exhibit significant differential expression between migratory (edge) and migration-restricted (core) cell populations. The hypothesis of the study is that differential expression of miRs provides an epigenetic mechanism to drive cell migration and invasion.

Results: Our research data comprise gene expression values for a set of 805 human miRs collected from matched pairs of migratory and migration-restricted cell populations from seven different glioblastoma cell lines. We identified 62 down-regulated and 2 up-regulated miRs that exhibit significant differential expression in the migratory (edge) cell population compared to matched migration-restricted (core) cells. We then conducted target prediction and pathway enrichment analysis with these miRs to investigate potential associated gene and pathway targets. Several miRs in the list appear to directly target apoptosis related genes. The analysis identifies a set of genes that are predicted by 3 different algorithms, further emphasizing the potential validity of these miRs to promote glioblastoma.

Conclusions: The results of this study identify a set of miRs with potential for decreased expression in invasive glioblastoma cells. The verification of these miRs and their associated targeted proteins provides new insights for further investigation into therapeutic interventions. The methodological approaches employed here could be applied to the study of other diseases to provide biomedical researchers and clinicians with increased opportunities for therapeutic interventions.

Contributors: Bradley, Barrie (Author) / Loftus, Joseph C. (Author) / Mielke, Clinton (Author) / Dinu, Valentin (Author) / College of Health Solutions (Contributor)
Created: 2014-01-18
Description

Purpose: PET (positron emission tomography) imaging studies of functional metabolism in animal brain using fluorodeoxyglucose (¹⁸F-FDG) are important in neuroscience. FDG-PET imaging studies are often performed on groups of rats, so it is desirable to establish an objective voxel-based statistical methodology for group data analysis.

Material and Methods: This study establishes a statistical parametric mapping (SPM) toolbox (plug-in) named spmratIHEP for voxel-wise analysis of FDG-PET images of rat brain, in which an FDG-PET template and an intracranial mask image of rat brain in Paxinos & Watson space were constructed, and the default settings were modified according to features of rat brain. Compared to previous studies, our constructed rat brain template comprises not only the cerebrum and cerebellum but also the whole olfactory bulb, which makes later cognitive studies more comprehensive. With an intracranial mask image in the template space, the brain tissues of individuals can be extracted automatically. Moreover, an atlas space is used for anatomically labeling the functional findings in the Paxinos & Watson space. In order to standardize the template image with the atlas accurately, a synthetic FDG-PET image with six main anatomical structures is constructed from the atlas and serves as the target image in the co-registration.

Results: The spatial normalization procedure was evaluated; individual rat brain images could be standardized into the Paxinos & Watson space successfully, and the intracranial tissues could be extracted accurately. The practical usability of the toolbox was evaluated using FDG-PET functional images from rats with left-side middle cerebral artery occlusion (MCAO) in comparison to normal control rats. The two-sample t-test result corresponds largely to the region of the left-side MCA.

Conclusion: We established a toolbox of SPM8 named spmratIHEP for voxel-wise analysis of FDG-PET images of rat brain.

Contributors: Nie, Binbin (Author) / Liu, Hua (Author) / Chen, Kewei (Author) / Jiang, Xiaofeng (Author) / Shan, Baoci (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2014-09-26
Description

Background: Obesity is a metabolic disease caused by environmental and genetic factors. However, the epigenetic mechanisms of obesity are incompletely understood. The aim of our study was to investigate the role of skeletal muscle DNA methylation in combination with transcriptomic changes in obesity.

Results: Muscle biopsies were obtained basally from lean (n = 12; BMI = 23.4 ± 0.7 kg/m²) and obese (n = 10; BMI = 32.9 ± 0.7 kg/m²) participants in combination with euglycemic-hyperinsulinemic clamps to assess insulin sensitivity. We performed reduced representation bisulfite sequencing (RRBS) next-generation methylation and microarray analyses on DNA and RNA isolated from vastus lateralis muscle biopsies. There were 13,130 differentially methylated cytosines (DMC; uncorrected P < 0.05) that were altered in the promoter and untranslated (5' and 3'UTR) regions in the obese versus lean analysis. Microarray analysis revealed 99 probes that were significantly (corrected P < 0.05) altered. Of these, 12 genes (encompassing 22 methylation sites) demonstrated a negative relationship between gene expression and DNA methylation. Specifically, sorbin and SH3 domain containing 3 (SORBS3) which codes for the adapter protein vinexin was significantly decreased in gene expression (fold change −1.9) and had nine DMCs that were significantly increased in methylation in obesity (methylation differences ranged from 5.0 to 24.4%). Moreover, differentially methylated region (DMR) analysis identified a region in the 5'UTR (Chr.8:22,423,530–22,423,569) of SORBS3 that was increased in methylation by 11.2% in the obese group. The negative relationship observed between DNA methylation and gene expression for SORBS3 was validated by a site-specific sequencing approach, pyrosequencing, and qRT-PCR. Additionally, we performed transcription factor binding analysis and identified a number of transcription factors whose binding to the differentially methylated sites or region may contribute to obesity.

Conclusions: These results demonstrate that obesity alters the epigenome through DNA methylation and highlight novel transcriptomic changes in SORBS3 in skeletal muscle.

Contributors: Day, Samantha (Author) / Coletta, Rich (Author) / Kim, Joon Young (Author) / Campbell, Latoya (Author) / Benjamin, Tonya R. (Author) / Roust, Lori R. (Author) / De Filippis, Elena A. (Author) / Dinu, Valentin (Author) / Shaibi, Gabriel (Author) / Mandarino, Lawrence J. (Author) / Coletta, Dawn (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2016-07-18
Description

Background: Carriers of the APOE ε4 allele are at increased risk of developing Alzheimer’s disease (AD), and have been shown to have reduced cerebral metabolic rate of glucose (CMRgl) in the same brain areas frequently affected in AD. These individuals also exhibit reduced plasma levels of apolipoprotein E (apoE) attributed to a specific decrease in the apoE4 isoform as determined by quantification of individual apoE isoforms in APOE ε4 heterozygotes. Whether low plasma apoE levels are associated with structural and functional brain measurements and cognitive performance remains to be investigated.

Methods: Using quantitative mass spectrometry we quantified the plasma levels of total apoE and the individual apoE3 and apoE4 isoforms in 128 cognitively normal APOE ε3/ε4 individuals included in the Arizona APOE cohort. All included individuals had undergone extensive neuropsychological testing and 25 had in addition undergone FDG-PET and MRI to determine CMRgl and regional gray matter volume (GMV).

Results: Our results demonstrated higher apoE4 levels in females versus males and an age-dependent increase in the apoE3 isoform levels in females only. Importantly, a higher relative ratio of apoE4 over apoE3 was associated with GMV loss in the right posterior cingulate and with reduced CMRgl bilaterally in the anterior cingulate and in the right hippocampal area. Additional exploratory analysis revealed several negative associations between total plasma apoE, individual apoE isoform levels, GMV and CMRgl predominantly in the frontal, occipital and temporal areas. Finally, our results indicated only weak associations between apoE plasma levels and cognitive performance which further appear to be affected by sex.

Conclusions: Our study proposes a sex-dependent and age-dependent variation in plasma apoE isoform levels and concludes that peripheral apoE levels are associated with GMV, CMRgl and possibly cognitive performance in cognitively healthy individuals with a genetic predisposition to AD.

Contributors: Nielsen, Henrietta M. (Author) / Chen, Kewei (Author) / Lee, Wendy (Author) / Chen, Yinghua (Author) / Bauer, Robert (Author) / Reiman, Eric (Author) / Caselli, Richard (Author) / Bu, Guojun (Author) / College of Liberal Arts and Sciences (Contributor)
Created: 2016-12-21
Description

Background: Our publication of the BitTorious portal [1] demonstrated the ability to create a privatized distributed data warehouse of sufficient magnitude for real-world bioinformatics studies using minimal changes to the standard BitTorrent tracker protocol. In this second phase, we release a new server-side specification to accept anonymous philanthropic storage donations by the general public, wherein a small portion of each user’s local disk may be used for archival of scientific data. We have implemented the server-side announcement and control portions of this BitTorrent extension into v3.0.0 of the BitTorious portal, upon which compatible clients may be built.

Results: Automated test cases for the BitTorious Volunteer extensions have been added to the portal’s v3.0.0 release, supporting validation of the “peer affinity” concept and announcement protocol introduced by this specification. Additionally, a separate reference implementation of affinity calculation has been provided in C++ for informaticians wishing to integrate into libtorrent-based projects.

Conclusions: The BitTorrent “affinity” extensions as provided in the BitTorious portal reference implementation allow data publishers to crowdsource the extreme storage prerequisites for research in “big data” fields. With sufficient awareness and adoption of BitTorious Volunteer-based clients by the general public, the BitTorious portal may be able to provide peta-scale storage resources to the scientific community at relatively insignificant financial cost.

Contributors: Lee, Preston (Author) / Dinu, Valentin (Author) / College of Health Solutions (Contributor)
Created: 2015-11-04