Fermilab Computing Division

CS Document 2259-v1

CDF RunII Data Handling Design

Document #:
CS-doc-2259-v1
Document type:
Presentation
Submitted by:
Selitha Raja
Updated by:
Selitha Raja
Document Created:
27 Jun 2007, 12:53
Contents Revised:
27 Jun 2007, 12:53
Metadata Revised:
27 Jun 2007, 12:53
Viewable by:
  • Public document

Abstract:
CDF is evolving its Run II Data Handling design to better manage multi-petabyte data samples in a global experiment. Starting from a largely centralized Data Handling system running on an SMP with well-established Data Handling interfaces, we have adapted the system to use the Enstore mass storage system, enabling reliable network access to petabytes of data files. We have integrated the dCache product, running on commodity file servers, to provide optimal distributed access to CDF data samples in a variety of formats. To set the scale of their use, I/O rates in December 2002 varied between 3 and 7 TB/day for the CDF Enstore system and between 5 and 15 TB/day for the CDF dCache system. We are also adapting our Data Handling system to use the SAM product, which supports the generation of and access to globally distributed data samples in a single consistent framework. We describe the Data Handling design built on these products, which supports a variety of access and generation patterns by globally distributed experimenters and multi-petabyte Run II data volumes.
Associated with Events:
CHEP2003 held on 24 Mar 2003 in La Jolla, California