Fermilab Computing Division

CS Document 6170-v1

High Energy Physics Data Science Toolkit Development

Document #: 6170-v1
Document type: Technical Note
Submitted by: Marc Paterno
Updated by: Marc Paterno
Document Created: 11 Sep 2017, 11:00
Contents Revised: 11 Sep 2017, 11:00
Metadata Revised: 11 Sep 2017, 11:00
Viewable by: Public document

A typical High Energy Physics (HEP) experiment has several categories of computing workflow: production of simulated data, collection of real data, physics-object identification, and many distinct end-user analyses. Even the smaller currently running and imminent experiments will accumulate collected and simulated data volumes in the 10 TiB to ~1 PiB range, with the attendant challenges in CPU demand, I/O, networking, and storage. Simulated data are usually at least ten times the volume of collected data, and both are processed several times to produce publication-quality physics objects. Over the lifetime of an experiment, end-user analyses are run hundreds to thousands of times over these physics objects.

By applying HPC facilities and technologies that are new to the field to specific, representative HEP tasks, we intend to develop tools and libraries that allow these tasks to be carried out with maximally efficient use of HPC resources and state-of-the-art data science technologies, and thereby to evolve the field's traditional high-throughput computing (HTC) model going forward.
