CS Document 2483-v1
Making Science in the Grid World: Using Glideins to Maximize Scientific Output
- Public document
- Many modern scientific collaborations need a lot of computing power. However, hosting and operating huge numbers of processing units is beyond the capability of most institutes, so the computers are being distributed among many locations and organized in a computer grid. While this allows sites to operate in an optimal mode, it does make the life of the average scientist much harder. Quite a bit of logic is necessary to optimally distribute the computing load among all the available sites, while avoiding resources that do not match the minimum requirements of the jobs.
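The kind of site-selection logic described above can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration, not part of any real WMS: the `Site` and `Job` fields, the memory and wall-time requirements, and the greedy "most free slots" policy are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Site:
    name: str
    free_slots: int       # currently unused job slots (assumed known)
    memory_mb: int        # memory available per slot
    max_walltime_h: int   # longest job the site accepts


@dataclass
class Job:
    name: str
    min_memory_mb: int
    walltime_h: int


def meets_requirements(job: Job, site: Site) -> bool:
    """True if the site satisfies the job's minimum requirements."""
    return (site.memory_mb >= job.min_memory_mb
            and site.max_walltime_h >= job.walltime_h)


def assign(jobs: List[Job], sites: List[Site]) -> Dict[str, Optional[str]]:
    """Greedily place each job on the suitable site with the most free slots.

    Jobs with no suitable free site are mapped to None.
    """
    free = {s.name: s.free_slots for s in sites}
    placement: Dict[str, Optional[str]] = {}
    for job in jobs:
        candidates = [s for s in sites
                      if free[s.name] > 0 and meets_requirements(job, s)]
        if not candidates:
            placement[job.name] = None
            continue
        best = max(candidates, key=lambda s: free[s.name])
        free[best.name] -= 1
        placement[job.name] = best.name
    return placement
```

Even this toy version has to track per-site state and per-job constraints, which is the bookkeeping a scientist would otherwise do by hand.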
Most virtual organizations (VOs) have thus rolled out some sort of Workload Management System (WMS) to maximize scientific output while requiring minimal computer literacy from their users. Condor glidein-based WMSs are presented here.
The Condor batch system is composed of several processes distributed among many machines: start daemons for the compute resources, schedulers for managing user jobs, and a central manager that glues everything together.
Glideins are regular grid jobs, each starting up a properly configured start daemon that connects back to a central manager, expanding the Condor batch pool. The WMS is responsible for glidein submission, with the goal of maximizing useful computing cycles while minimizing wasted ones.
The glideinWMS is a general-purpose WMS developed by CMS, but based on an idea pioneered by CDF: keeping a steady pressure of glideins on all suitable sites. It follows the Condor philosophy, splitting the WMS into two logical pieces: a set of glidein factories and a set of VO frontends that drive the factories. Condor tools are used to glue everything together.
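One way to picture the "pressure" idea and the frontend/factory split is the sketch below. It is an illustrative assumption, not the glideinWMS implementation: the function names, the `max_pending_per_site` cap, and the representation of jobs as sets of acceptable site names are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


def frontend_requests(idle_jobs: List[Set[str]],
                      sites: Iterable[str],
                      max_pending_per_site: int = 10) -> Dict[str, int]:
    """VO frontend side (hypothetical): decide how many glideins should be
    kept pending at each site.

    idle_jobs holds, for each idle user job, the set of site names it could
    run on. The request for a site is the number of idle jobs that match it,
    capped so that no site is flooded.
    """
    requests = {}
    for site in sites:
        matching = sum(1 for allowed in idle_jobs if site in allowed)
        requests[site] = min(matching, max_pending_per_site)
    return requests


def factory_submissions(requests: Dict[str, int],
                        pending: Dict[str, int]) -> Dict[str, int]:
    """Glidein factory side (hypothetical): submit only enough new glideins
    to bring the pending count at each site up to the requested level."""
    return {site: max(0, wanted - pending.get(site, 0))
            for site, wanted in requests.items()}


# Example: four idle jobs, three sites, at most two pending glideins per site.
idle = [{"A", "B"}, {"A"}, {"B"}, {"A"}]
wanted = frontend_requests(idle, ["A", "B", "C"], max_pending_per_site=2)
to_submit = factory_submissions(wanted, pending={"A": 1, "B": 5})
```

In this sketch the frontend only expresses demand and the factory only reacts to it, so sites with no matching idle jobs receive no glideins, and sites that already have enough pending glideins receive no new ones.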
The glideinWMS is now used by two HEP experiments, CDF and CMS, for both event simulation and data analysis. Other scientific communities could benefit similarly from a glidein-based WMS, either by implementing their own services or by using the glideinWMS.
- Publication Information:
- IEEE NSS 2007 conference