CS Document 1342-v1
CHEP06: SAMGRID Peer to Peer Information Service
- Document #:
- CS-doc-1342-v1
- Document type:
- Conference
- Submitted by:
- Stuart C. Fuess
- Updated by:
- Stuart C. Fuess
- Document Created:
- 09 Feb 2006, 13:40
- Contents Revised:
- 09 Feb 2006, 13:40
- Metadata Revised:
- 19 Oct 2006, 15:53
- Abstract:
- SAMGrid presently relies on a centralized database to provide several services vital to system operation. These services are all encapsulated in the SAMGrid Database Server, and include access to file metadata and replica catalogs, dataset and processing bookkeeping, as well as runtime support for the SAMGrid station services. Access to the centralized database and DB Servers represents a single point of failure in the system and limits its scalability. To address this issue, we have created a prototype of a peer-to-peer information service. It allows the system to operate when access to the central DB is unavailable for any reason (e.g., network failures or scheduled downtimes), and it improves system performance during periods of extremely high load, when central DB access is slow and/or has a high failure rate. Our prototype uses Distributed Hash Tables to create a fault-tolerant and self-healing service. We believe that this is the first peer-to-peer information service designed to become part of an in-use grid system. We describe the prototype architecture and its existing and planned functionality, and show how it can be integrated into the SAMGrid system. We also present a performance study of our new service under different circumstances. Our results strongly demonstrate the feasibility and usefulness of the proposed architecture.
- Authors:
- Associated with Events:
- CHEP2006 held from 13 Feb 2006 to 17 Feb 2006 in Mumbai, India
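
The abstract names Distributed Hash Tables as the basis for fault tolerance but gives no implementation detail. The following is a minimal illustrative sketch of the general technique (consistent hashing with replication), not the SAMGrid prototype itself. The class `ToyDHT`, its methods, the node names, and the replica count are all hypothetical and chosen only for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (SHA-1 digest as an integer)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class ToyDHT:
    """Hypothetical in-process DHT: consistent hashing plus replication.

    This is an assumption-laden teaching sketch, not the SAMGrid design.
    """

    def __init__(self, node_names, replicas=2):
        self.replicas = replicas
        # Each node keeps its own local key/value store.
        self.stores = {name: {} for name in node_names}
        self.alive = set(node_names)
        # Nodes are placed on the ring at the hash of their name.
        self.ring = sorted((_hash(name), name) for name in node_names)

    def _successors(self, key):
        """Yield live nodes clockwise from the key's position on the ring."""
        start = bisect.bisect_left(self.ring, (_hash(key), ""))
        for i in range(len(self.ring)):
            _, name = self.ring[(start + i) % len(self.ring)]
            if name in self.alive:
                yield name

    def put(self, key, value):
        """Store the value on the first `replicas` live successor nodes."""
        owners = []
        for name in self._successors(key):
            self.stores[name][key] = value
            owners.append(name)
            if len(owners) == self.replicas:
                break
        return owners

    def get(self, key):
        """Return the value from the first live node holding it, else None."""
        for name in self._successors(key):
            if key in self.stores[name]:
                return self.stores[name][key]
        return None

    def fail(self, name):
        """Simulate a node becoming unreachable."""
        self.alive.discard(name)


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"], replicas=2)
owners = dht.put("file-0001", "replica at station X")
dht.fail(owners[0])
value_after_one_failure = dht.get("file-0001")
```

In this sketch a lookup still succeeds after the first owner fails, because the next live node on the ring holds a replica. A real service would also need to re-replicate data when nodes leave or join, which is omitted here.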