The CMS Dataset Bookkeeping Service

Lee Lueking
27 Mar 2007, 13:54
27 Mar 2007, 13:58
  • Public document
The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS
event data from Monte Carlo and Detector sources. It includes the ability to identify MC
or trigger source, track data provenance, construct datasets for analysis, and discover
interesting data. CMS requires processing and analysis activities at various service levels
and the system provides support for localized processing or private analysis, as well as
global access for CMS users at large. Catalog entries can be moved among the various
service levels with a simple set of migration tools, thus forming a loose federation of
databases. DBS is available to CMS users via a Python API, a command-line interface, and a
Discovery web page. The system is built as a multi-tier web application with
Java servlets running under Tomcat, connecting via JDBC to Oracle or MySQL
database backends. Clients connect to the service through HTTP or HTTPS, with
authentication provided by Grid certificates and authorization through VOMS. DBS is
an integral part of the overall CMS Data Management and Workflow Management
systems. The system has been in operation since March 2007. An overview of the schema,
functionality, deployment details, operational statistics, and experience will be presented.
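
As an illustration of the client side of the architecture described above, the following is a minimal sketch of how a Python client might call an HTTPS service using a Grid certificate for authentication. It is not the DBS Python API: the endpoint URL, the `api` query parameter, the call name `listDatasets`, and the helper names `build_query_url`, `make_grid_context`, and `fetch` are all assumptions made for the example. Only standard-library calls are used.

```python
import ssl
import urllib.parse
import urllib.request

# Hypothetical endpoint, used only for illustration (not a real DBS URL).
BASE_URL = "https://dbs.example.org/servlet/DBSServlet"


def build_query_url(base_url, api, **params):
    """Build a GET URL for a hypothetical server call named `api`.

    Extra keyword arguments become query parameters, sorted by name
    so that the resulting URL is deterministic.
    """
    query = {"api": api}
    for name in sorted(params):
        query[name] = params[name]
    return base_url + "?" + urllib.parse.urlencode(query)


def make_grid_context(cert_file, key_file):
    """Create a TLS context that presents a client certificate.

    `cert_file` and `key_file` are paths to a PEM certificate and its
    private key, for example a user's Grid certificate or proxy.
    """
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def fetch(url, context, timeout=30):
    """Send the request over HTTPS and return the response body as text."""
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only the URL is built here; no network call is made.
    print(build_query_url(BASE_URL, "listDatasets", pattern="Cosmics"))
```

In this sketch the certificate is presented during the TLS handshake, so authentication happens before the request reaches the server. Authorization, such as the VOMS-based checks mentioned above, would be applied on the server side and is not modeled here.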
