Grid2003 Monitoring, Metrics, and Information Systems

Marcia A Teckenbrock
20 Sep 2004, 15:08
28 Sep 2004, 02:07
20 Jan 2005, 09:32
  • Public document
20 Sep 2004, 18:34
This paper describes the design and implementation of the Grid3 monitoring infrastructure. The Grid3 monitoring architecture follows a user-oriented design that combines several underlying monitoring tools into a diversified framework. We use both existing tools and extensions developed as part of the Grid2003 project. The main tools are ACDC Job Monitoring from the University of Buffalo, Ganglia, a Grid Catalog developed as part of Grid2003, Globus MDS, the University of Chicago Grid telemetry MDViewer, and US CMS MonALISA. ACDC Job Monitoring collects job information such as running and queued jobs and CPU usage. Ganglia collects host information such as CPU, disk, and network load. The Grid Catalog summarizes the status of each site and periodically tests whether its main resources are working. Globus MDS stores mainly configuration information. MDViewer allows analysis and plotting of the historical information collected. MonALISA provides a framework for collecting information from different sources, supports distributed queries to the system, and provides a central repository.

From the collected data we extract information of interest to the VOs participating in the Grid, such as the resources provided and used by all VOs and the status of those resources. The paper also points out issues solved during deployment, such as scalability and limited control over the software installed at the sites.

Fermilab Publication number CONF-04-458-CD
CHEP2004 held from 27 Sep 2004 to 01 Oct 2004 in Interlaken, Switzerland
