"SAMGrid Experiences with the Condor Technology in Run II Computing"

JoAnn Larson
05 Aug 2004
23 Sep 2004
20 Jan 2005
SAMGrid is a globally distributed system for data handling and job management, developed at Fermilab for the D0 and CDF experiments in Run II. The Condor system is being developed at the University of Wisconsin for management
of distributed resources, computational and otherwise. We briefly review the SAMGrid architecture and its interaction with Condor, which was presented earlier. We then present our experiences using the system in production,
which have two distinct aspects.

At the global level, we deployed Condor-G, the Grid-extended Condor, for resource brokering and global scheduling of our jobs. At the heart of the system is Condor's Matchmaking Service. More recently,
at the computing element level, we have been benefiting from the large computing cluster on the University of Wisconsin campus. The architecture of the computing facility and the philosophy of Condor's resource management have prompted us to improve the application infrastructure for D0 and CDF,
for example by removing its dependence on a shared file system and on dedicated resources. As a result, we have increased productivity and made our applications more portable and Grid-ready. We include some statistics gathered from our experience. Our fruitful collaboration
with the Condor team has been made possible by the Particle Physics Data Grid.

Fermilab Publication number CONF-04-470-CD
Associated with Events:
CHEP2004 held from 27 Sep 2004 to 01 Oct 2004 in Interlaken, Switzerland
