CS Document 4501-v3
Supporting Shared Resource Usage for a Diverse User Community: the OSG experience and lessons learned
- Public document
- The Open Science Grid (OSG) supports a diverse community of new and existing users in adopting and making effective use of the Distributed High Throughput Computing (DHTC) model. The LHC user community has deep local support within the experiments. For other, smaller communities and individual users, the OSG provides a suite of consulting and technical services through the User Support organization. We describe these experiences, some successful and some less so, and analyze the lessons learned that are helping us improve our services. The services offered include forums to enable shared learning and mutual support, tutorials and documentation for new technology, and troubleshooting of problematic or systemic failure modes. For new communities and users, we bootstrap their use of the DHTC technologies and resources available on the OSG by following a phased approach. We first adapt the application and run a small production campaign on a subset of "friendly" sites. Only then do we move the user to full production campaigns across the many remote sites on the OSG, where they face new hindrances, including nondeterministic job completion times, diverse errors due to the heterogeneity of site configurations and environments, and the lack of direct login access for troubleshooting application crashes. We cover recent experiences with image simulation for the Large Synoptic Survey Telescope (LSST), small-file, large-volume data movement for the Dark Energy Survey (DES), civil engineering simulation with the Network for Earthquake Engineering Simulation (NEES), and accelerator modeling with the Electron Ion Collider group at BNL. We categorize and analyze the use cases and describe how our processes are evolving based on lessons learned.
- Associated with Events:
- CHEP 2012 held on 21 May 2012 in New York, New York