Fermilab Computing Division

CS Document 5430-v8

Engineering the CernVM-FileSystem as a High Bandwidth Distributed Filesystem for Auxiliary Physics Data

Document #: CS-doc-5430-v8
Document type: Conference
Submitted by: Gabriele Garzoglio
Updated by: Gabriele Garzoglio
Document Created: 16 Oct 2014, 09:09
Contents Revised: 15 May 2015, 15:26
Metadata Revised: 15 May 2015, 15:26
Viewable by: Public document
Abstract:
A common use pattern in the computing models of particle physics experiments is running many distributed applications that read from a shared set of data files. We refer to this data as auxiliary data, to distinguish it from (a) event data from the detector (which tends to be different for every job), and (b) conditions data about the detector (which tends to be the same for each job in a batch of jobs). Conditions data also tends to be relatively small per job, whereas both event data and auxiliary data are larger per job. Unlike event data, auxiliary data comes from a limited working set of shared files. Since there is spatial locality in the auxiliary data access, the use case appears to be identical to that of the CernVM-FileSystem (CVMFS). However, we show that distributing auxiliary data through CVMFS causes the existing CVMFS infrastructure to perform poorly. We utilize a CVMFS client feature called "alien cache" to cache data on existing local high-bandwidth data servers that were engineered for storing event data. This cache is shared between the worker nodes at a site and replaces the caching of CVMFS files on both the worker-node local disks and the site's local squids. We have tested this alien cache with the dCache NFSv4.1 interface, Lustre, and the Hadoop Distributed File System (HDFS) FUSE interface, and measured performance. In addition, we use high-bandwidth data servers at central sites to perform the CVMFS Stratum 1 function instead of the low-bandwidth web servers deployed for the CVMFS software distribution function. We have tested this using the dCache HTTP interface. As a result, we have a design for an end-to-end high-bandwidth distributed caching read-only filesystem, using existing client software already widely deployed to grid worker nodes and existing file servers already widely installed at grid sites. Files are published in a central place, are soon available on demand throughout the grid, are cached locally at the site, and are accessible through a convenient POSIX interface. This paper discusses the details of the architecture and reports performance measurements.
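The client-side architecture described in the abstract maps onto standard CVMFS configuration parameters. As a rough illustration only (not the configuration used in the paper; the repository name, shared-filesystem path, and dCache door URL below are placeholders), a worker node could be pointed at a site-wide alien cache and at a high-bandwidth Stratum 1 with settings along these lines in /etc/cvmfs/default.local:

    # Minimal sketch of a CVMFS client configuration for auxiliary data.
    # Repository name, cache path, and server URL are illustrative assumptions.
    CVMFS_REPOSITORIES=aux.example.opensciencegrid.org

    # "Alien cache": keep the CVMFS cache on a shared site filesystem
    # (e.g. a Lustre, HDFS-FUSE, or dCache NFSv4.1 mount) instead of
    # the worker-node local disk.
    CVMFS_ALIEN_CACHE=/mnt/shared-data/cvmfs-alien-cache

    # The alien cache requires the client's own quota management and
    # shared local cache to be disabled; the shared filesystem manages space.
    CVMFS_QUOTA_LIMIT=-1
    CVMFS_SHARED_CACHE=no

    # Fetch directly from a high-bandwidth data server (e.g. a dCache HTTP
    # door) acting as Stratum 1, bypassing the site squids.
    CVMFS_SERVER_URL="http://dcache-door.example.edu:8000/cvmfs/@fqrn@"
    CVMFS_HTTP_PROXY=DIRECT

With such a configuration the repository is mounted read-only under /cvmfs/ on each worker node, so jobs read auxiliary files through an ordinary POSIX path while the underlying data blocks are fetched once into the shared site cache and reused by all nodes.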
Files in Document:
  • PDF (CHEP15_Paper_CVMFSAuxData.pdf, 515.4 kB)
  • Word (CHEP15_Paper_CVMFSAuxData.doc, 368.5 kB)
Keywords: CHEP15 CVMFS GCS
Associated with Events: CHEP 2015 (held on 13 Apr 2015)