CS Document 3061-v2
Benchmark comparison for glidein factory monitoring between v1_6_beta_1 and v2_0_beta_2 (RRD files and locking)
- Public document
- The glidein factory monitoring in the v1_X branch is known to be very resource intensive, particularly in terms of disk IO. The scalability tests performed in spring 2008 showed that it could not sustain the tested 100 entry points without some kind of resource handling. The solution adopted in the recent v1_X releases was an internal locking mechanism that reduced the number of disk IO requests, but also made the glidein factory less responsive.
A different approach is being pursued in the current development branch. The excessive IO was assumed to be caused by the large number of RRD files, since the v1_X branch stores a single value per RRD file. The development branch instead packs many variables into each RRD, drastically reducing the number of RRD files (from 131 per client in v1_X to 9 per client in the development branch) while preserving the same information.
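The difference between the two layouts can be illustrated with a minimal sketch that builds `rrdtool create` command lines, once with one data source (DS) per file and once with all data sources packed into a single file. This is not the glideinWMS code: the variable names, file prefix, step, heartbeat, and archive size are all hypothetical values chosen for illustration, and the commands are only constructed, not executed.

```python
def rrd_create_args(path, ds_names, step=300, heartbeat=1800, rows=740):
    """Build the argument list for one `rrdtool create` call.

    step, heartbeat and rows are illustrative defaults, not values
    taken from glideinWMS.
    """
    args = ["rrdtool", "create", path, "--step", str(step)]
    # One DS definition per monitored variable stored in this file.
    args += [f"DS:{name}:GAUGE:{heartbeat}:U:U" for name in ds_names]
    # A single archive keeping `rows` unconsolidated samples.
    args.append(f"RRA:AVERAGE:0.5:1:{rows}")
    return args


def one_value_per_file(prefix, variables):
    """v1_X-style layout: one RRD file for every variable."""
    return [rrd_create_args(f"{prefix}_{v}.rrd", [v]) for v in variables]


def packed(prefix, variables):
    """Development-branch-style layout: one RRD file holding all variables."""
    return [rrd_create_args(f"{prefix}.rrd", variables)]


if __name__ == "__main__":
    # Hypothetical variable names, used only for this example.
    variables = ["StatusIdle", "StatusRunning", "StatusHeld"]
    for cmd in one_value_per_file("client1", variables):
        print(" ".join(cmd))
    for cmd in packed("client1", variables):
        print(" ".join(cmd))
```

In the packed layout, each update touches one file instead of one file per variable. With the figures quoted above, going from 131 to 9 files per client is a reduction of roughly a factor of 14.6 in the number of files.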
To validate the above assumptions, I benchmarked the v1_6_beta_1 and v2_0_beta_2 CVS tags of glideinWMS. The results are presented in this document.