Review: BioTorrents. A File Sharing Service for Sharing Scientific Data

The article reviewed here is ‘BioTorrents: A File Sharing Service for Sharing Scientific Data’, which is available at PLoS ONE as an open access article. It is a brief article, written by the authors of the software, which outlines the features and potential benefits of BioTorrents.

There is a clear introduction which outlines why BioTorrents is needed. Essentially, relying on a single node (a server or other computer) to distribute data to all of the other nodes in a network is less efficient than using many nodes in the network to distribute that data. The many-node approach works because there is a successful protocol which labels the data so that nodes in the network ‘know’ what data the other nodes have. This protocol was originally developed for BitTorrent, where it was used to distribute generic data. However, because of copyright issues with the material distributed on these generic networks, servers were shut down. These networks were therefore to some extent unreliable because of the pragmatics of their operation rather than the underlying technology. Hence the BioTorrents variation on the theme, in which only legitimate scientific data is transferred across the network, thus avoiding the reliability issues described above. The result is that BioTorrents is a useful approach for sharing large scientific datasets across networks, and the authors cite large genomic datasets as an example of the data that can be usefully transferred in this way.
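To make the ‘labelling’ idea a little more concrete, here is a minimal sketch in Python. It is not taken from the paper or from the BioTorrents software. It only illustrates the general BitTorrent idea that a file is cut into pieces, each piece is identified by its SHA-1 hash, and each node advertises which pieces it holds. The tiny piece size, the function names and the peer names in the usage note are all assumptions made for illustration.

```python
import hashlib

# Assumption: an unrealistically small piece size, chosen only so the
# example is easy to follow. Real torrents use much larger pieces.
PIECE_SIZE = 4


def split_into_pieces(data, piece_size=PIECE_SIZE):
    """Cut the data into consecutive pieces; the last piece may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Label every piece with its SHA-1 hash (hex string)."""
    return [hashlib.sha1(piece).hexdigest()
            for piece in split_into_pieces(data, piece_size)]


def verify_piece(piece, expected_hash):
    """Check a received piece against the label published for it."""
    return hashlib.sha1(piece).hexdigest() == expected_hash


def missing_pieces(have, total):
    """Indices of the pieces this node still needs."""
    return [i for i in range(total) if i not in have]


def peers_with_piece(index, peer_pieces):
    """Names of the peers that advertise the piece with this index."""
    return sorted(name for name, held in peer_pieces.items() if index in held)
```

As a usage example, `piece_hashes(b"ACGTACGTAC")` gives three labels, one per piece, and `peers_with_piece(2, {"lab_a": {0, 1}, "lab_b": {2}})` reports that only the hypothetical peer `lab_b` can supply the last piece. A node can therefore fetch different pieces from different peers and check each one against its label, rather than relying on a single server.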

I wasn’t clear on the aims of the paper, and this is perhaps reflected in the absence of a methodology section. In effect, I think the article is to some extent a description of the authors’ journey from identifying the need, through constructing BioTorrents, to their subsequent observations of BioTorrents in action. However, the structure of the paper is a fairly minor point; the technology being presented is fascinating and extremely useful.

Screenshot from the website (http://www.biotorrents.net/)

I navigated to the site (see address above) and found it was well organised and easy to understand. There is a FAQ section, and the data is hosted on a server at the laboratory of Jonathan Eisen, a coauthor of the paper. At the time of writing I could identify 26 datasets. There were a few points I was unclear about. The first was about the ethical aspects of hosting data on these servers. For human data, research should pass through a research ethics committee, and data storage is a point for clarification. There is usually an endpoint after which the data must be destroyed. Having data distributed in this way means that potentially the data cannot be destroyed, as it may end up on a server somewhere indefinitely. Thus this approach might have implications for ethics protocols. There were two associations that sprang to mind when thinking about BioTorrents. The first was the Alzheimer’s Disease Neuroimaging Initiative, which involves a large dataset that can be analysed by researchers from around the world and which might be well suited to BioTorrents. The second was the Wayback Machine, which stores a small percentage of the ‘internet’ indefinitely, and the two questions I had were whether BioTorrents data would be included in the archived material*

In any case, this has potential applications for research in psychiatry, provided that such data storage methods have been given ethics clearance and that collaborators are located at multiple sites, either within LANs or at different geographical sites.

* or whether the BioTorrents approach was a useful alternative and distributed method for archiving material of historical interest – a kind of distributed system for storing the internet equivalent of world heritage sites.


8 thoughts on “Review: BioTorrents. A File Sharing Service for Sharing Scientific Data”

  1. Morgan Langille

    Thanks for your review of BioTorrents. With respect to your comments about certain human data having to be destroyed after a certain period of time, this type of data would not likely be shared on BioTorrents in the first place. BioTorrents is meant as a resource for sharing completely open data and not for datasets that have to be protected due to ethical constraints. Usually any scientific data that has to remain private is not openly available for download by the public and remains on secure local networks.


  4. Dr Justin Marley Post author

    Hi, thanks for your comments. BioTorrents is a great resource. My comments about ethics related to the process of initially creating datasets for use with BioTorrents. Clinical researchers could consider this option during the planning stages of their studies as this opens up many possibilities and would need to be reflected in the arguments presented to the ethics committees. There is a need for discussion of this medium and open science in general in the wider clinical research community. Regards Justin
