Genome centric networking: A network function virtualization solution for genomic applications

Femminella, Mauro (Methodology); Reali, Gianluca (Writing – Review & Editing)
2017

Abstract

Experts have warned that the processing of genetic data will soon exceed the computing needs of Twitter and YouTube. This is due to the falling cost of sequencing the DNA of any living creature and the huge impact of sequencing in many application areas. Designing suitable network architectures for distributing such data is therefore of paramount importance. Management of genomic data sets is a typical big data problem, characterized not only by a huge overall volume, but also by the large size of each genomic file. Since it is unrealistic to expect every professional who needs to process genomes to own the infrastructure for massive genome analysis, cloud-based access to genomic services is envisaged. This will have a significant impact on the underlying networks, which could become the system bottleneck. In this paper, we propose Genome Centric Networking (GCN), a novel network function virtualization framework for cloud-based genomic data management, designed to limit the exchanged traffic through distributed caching. The key element of GCN is a novel signaling protocol, which supports both the discovery of network resources and the management of caches. We evaluated GCN on a real testbed. GCN makes it possible to halve the exchanged traffic and to significantly reduce the transfer time of genomic datasets.
ISBN: 9781509060085

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1423036
Citations
  • PMC: not available
  • Scopus: 6
  • ISI: 3