Demonstrating Big Data Transfer at Supercomputing 2015

Modern research often requires specialized scientific instruments, which may be located thousands of miles away from the researcher. Those instruments can generate large volumes of data that must then be sent back to the researcher, which is easier said than done. For very large files, the fastest way of getting the data to the researcher may be to physically ship a hard drive. Long-distance digital file-sharing tools do exist, of course, but they operate over standard Internet protocols (TCP/IP), which can hinder their performance. In contrast, InfiniBand is a technology that allows for very high-speed data transfer, but it was developed for use over very short distances (within a rack or, at most, within a data centre). But what if you want the same kind of high-speed transfer over large distances? Obsidian, an Edmonton-based company, has developed a product called the Longbow that makes this possible using InfiniBand technology.

High speed file transfer in progress: 928MB/s!

Two weeks ago at the Supercomputing 2015 (SC15) conference, Obsidian collaborated with the University of Alberta to demonstrate its technology using Canada’s National Research and Education Network (NREN). The conference, attended by 10,000 experts from around the world, showcased work in high performance computing, networking, storage and analysis by the international community.

For its demonstration, Obsidian installed one Longbow device at the University of Alberta’s data centre in Edmonton and another on the SC15 showroom floor in Austin, Texas. Meanwhile, Cybera, CANARIE, Compute Canada and iCAIR set up a high-speed (10 Gbit/s) network connection between Edmonton and the Obsidian booth at SC15. Large volumes of genetics research data from the University (see below) were then transferred over the research network to the showroom floor at a rate of 928 MB/s (7.4 Gbit/s), around 250 times faster than Canada’s average internet speed. At that rate, 350 GB of data could be transferred thousands of miles in a little over six minutes, which is pretty impressive.
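
For readers who want to check the numbers, the arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: it assumes decimal units (1 GB = 1000 MB, 1 Gbit = 1000 Mbit), and the function and variable names are made up for this example rather than taken from any tool used in the demonstration.

```python
# Back-of-the-envelope check of the transfer figures quoted in the article.
# Assumption: decimal units throughout (1 GB = 1000 MB, 1 Gbit = 1000 Mbit).

def mb_per_s_to_gbit_per_s(rate_mb_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return rate_mb_s * 8 / 1000


def transfer_seconds(size_gb: float, rate_mb_s: float) -> float:
    """Time in seconds to move size_gb gigabytes at rate_mb_s megabytes per second."""
    return size_gb * 1000 / rate_mb_s


RATE_MB_S = 928        # observed transfer rate from the demonstration
LINK_GBIT_S = 10       # capacity of the Edmonton-to-Austin connection
DATASET_GB = 350       # example dataset size used in the article
SPEEDUP = 250          # "around 250 times faster" than the average connection

rate_gbit_s = mb_per_s_to_gbit_per_s(RATE_MB_S)
link_utilization = rate_gbit_s / LINK_GBIT_S
seconds = transfer_seconds(DATASET_GB, RATE_MB_S)
# Average speed implied by the 250x comparison (derived, not a quoted figure).
implied_average_mbit_s = rate_gbit_s * 1000 / SPEEDUP

print(f"Rate: {rate_gbit_s:.2f} Gbit/s ({link_utilization:.0%} of the link)")
print(f"{DATASET_GB} GB takes {seconds:.0f} s ({seconds / 60:.1f} minutes)")
print(f"Implied average speed: {implied_average_mbit_s:.1f} Mbit/s")
```

Under these assumptions, the observed rate uses roughly three quarters of the 10 Gbit/s link, and the 350 GB example takes a bit more than six minutes.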

This demonstration shows the potential of using InfiniBand over high-speed Wide Area Networks as one possible solution to the data transfer problem faced by an increasing number of researchers.

BACKGROUND

Dr. Paul Lu and Dr. Juan Jovel explain the genetics research that was transferred at SC15:

“Scientists study bacteria populations in the guts of patients (i.e., the gut microbiome) because specific bacteria are associated with a variety of physiological and medical conditions, including obesity, inflammatory bowel disease, and type II diabetes.

Next-generation sequencing machines, in which DNA or cDNA copies of RNA are sequenced with ever-increasing speed and throughput, and ever-decreasing costs, generate sequence data of the specific bacteria from a patient's sample.

For large datasets and multiple patients, the sequence files can be large (around 50 GB). 

Frequently we have to transfer the sequence data from where it originates to a compute centre. The datasets of known sequences, often larger than 500 Gb, may have to be transferred as well.

Characterizing the gut microbiome of many patients, and associating them with physiological and medical data, can increase our understanding of the role of the microbiome in fundamental processes related to health and disease.

Ultimately, the goal is to develop the scientific data and theories surrounding gut microbiomes to the point where sequencing samples from a particular patient, like various forms of spectroscopy in physics and chemistry, can be used to aid in diagnosis.”