CONNECT Magazine Issue 18

OPEN CALL – SOFTWARE DEFINED NETWORKING

Fig. 1. ARES: a big picture.

Processing services, implemented in VMs, are usually organized in "pipelines". The main novelty of ARES is its network data management, which is optimized for the needs of genomic processing. It is based on caching both the VMs and the large auxiliary files (such as reference human genomes) that genomic processing requires. To select the data center that will execute a computation, the GCM collects information about data center availability and cached contents through specific NSIS signaling. The policies for allocating computing resources are implemented in the GCM, which solves ad hoc optimization problems that take into account the location of contents (both original and cached copies), the data centers with sufficient resources, and the location of the genomes to be processed. The objective is to minimize either the exchanged network traffic or the processing time. The resulting solution indicates the locations from which contents are downloaded and the data center that will host the desired processing. The GCM can thus orchestrate data transfer and computation services, which makes it possible to tightly control the impact of the genomic services on network resources.
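The selection step can be illustrated with a minimal sketch. It is not the GCM's actual optimization, whose formulation is not given here. It assumes a hypothetical cost model in which traffic is measured as item size times hop count from the nearest copy, and it simply evaluates every data center with a free slot. All names (Item, transfer_cost, select_data_center), the hop counts and the item sizes are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    size_gb: float  # illustrative size, not a measured value


def transfer_cost(item, dc, copies, hops):
    """Traffic (GB x hops) needed to bring `item` to data center `dc`."""
    if dc in copies[item.name]:
        return 0.0  # already stored or cached locally
    return item.size_gb * min(hops[(src, dc)] for src in copies[item.name])


def select_data_center(items, free_slots, copies, hops):
    """Return (data center, traffic) with the lowest total transfer cost,
    or None if no data center has a free processing slot."""
    candidates = [dc for dc, slots in free_slots.items() if slots > 0]
    if not candidates:
        return None
    cost = {dc: sum(transfer_cost(i, dc, copies, hops) for i in items)
            for dc in candidates}
    best = min(cost, key=cost.get)
    return best, cost[best]


# Toy scenario: three nodes with hypothetical hop counts between them.
hops = {("A", "B"): 1, ("B", "A"): 1, ("B", "C"): 1,
        ("C", "B"): 1, ("A", "C"): 2, ("C", "A"): 2}
items = [Item("vm_image", 4.0), Item("reference_genome", 3.0),
         Item("patient_genome", 100.0)]
# Where each item can be fetched from (original location or cache).
copies = {"vm_image": {"A", "C"}, "reference_genome": {"C"},
          "patient_genome": {"B"}}

busy_b = select_data_center(items, {"A": 2, "B": 0, "C": 1}, copies, hops)
free_b = select_data_center(items, {"A": 2, "B": 1, "C": 1}, copies, hops)
print(busy_b, free_b)
```

In this toy scenario, node C is chosen while node B (which holds the patient genome) has no free slot, because C already caches the VM image and the reference genome. As soon as B has a free slot it becomes the cheapest choice, since only the small VM image and auxiliary file need to be moved.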

III. EXPERIMENTAL RESULTS

The ARES experiments refer to the GÉANT topology, which comprises 40 points of presence (PoPs, as of January 2014), 32 of which include a co-located data center that is a candidate for hosting a processing VM. In the experiments reported below, up to five concurrent pipelines can be executed in each data center. In the legacy approach, requests are distributed randomly over the nodes. In the ARES strategy, service requests are allocated following the approach described in Sec. II, for example by minimizing the exchanged traffic, which includes not only the patient genomes but also the VM images and auxiliary files that can be stored in the ARES network caches. The sample results shown in Fig. 2 demonstrate a marked decrease, by a factor of about 6, in the aggregate traffic.

Fig. 2. Benefit introduced by the ARES approach versus a randomized data center selection, as a function of service request rate.
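The kind of comparison summarized in Fig. 2 can be mimicked on a toy scale. The sketch below does not reproduce the ARES experiments or the GÉANT topology. It assumes a hypothetical table of traffic values per (request, data center) pair, treats all requests as concurrent, and contrasts a random policy with a traffic-minimizing one under the limit of five pipelines per data center. The function names and the numbers are illustrative assumptions.

```python
import random

MAX_PIPELINES = 5  # concurrent pipelines per data center, as in the experiments


def allocate(requests, data_centers, traffic, policy, capacity=MAX_PIPELINES):
    """Assign each request to a data center; return (assignment, total traffic).

    traffic[(request, dc)] is the hypothetical traffic generated if
    `request` is served at data center `dc`.
    """
    load = {dc: 0 for dc in data_centers}
    assignment, total = {}, 0.0
    for req in requests:
        available = [dc for dc in data_centers if load[dc] < capacity]
        if not available:
            break  # every data center is full; remaining requests are not served
        dc = policy(req, available, traffic)
        load[dc] += 1
        assignment[req] = dc
        total += traffic[(req, dc)]
    return assignment, total


def random_policy(rng):
    """Legacy-style policy: pick any data center that still has room."""
    def pick(req, available, traffic):
        return rng.choice(available)
    return pick


def min_traffic_policy(req, available, traffic):
    """Pick the available data center that generates the least traffic."""
    return min(available, key=lambda dc: traffic[(req, dc)])


# Toy input: six requests, two data centers, made-up traffic values.
data_centers = ["dc1", "dc2"]
requests = ["r1", "r2", "r3", "r4", "r5", "r6"]
traffic = {(r, "dc1"): 1.0 for r in requests}
traffic.update({(r, "dc2"): 3.0 for r in requests})

greedy_assignment, greedy_total = allocate(
    requests, data_centers, traffic, min_traffic_policy)
random_assignment, random_total = allocate(
    requests, data_centers, traffic, random_policy(random.Random(0)))
print(greedy_total, random_total)
```

With these made-up values, the traffic-minimizing policy fills dc1 with five requests and sends the sixth to dc2. The random policy can never do better than that in this scenario, and how much worse it does depends on the draw.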

ACKNOWLEDGMENT

ARES is supported by GÉANT/GN3plus in the framework of the first GÉANT open call.


For more information on the GÉANT Open Call, see www.geant.net/opencall

