Minutes of July 14, 2003 CD Operations
- No new lost work days. A couple of walkthroughs.
- ITNAs need to be redone during the performance appraisal process.
- Data Handling: Last week dCache moved an average of 17 TB/day (85 million
events/day) with a peak of 23 TB (115 million events) on Friday.
- CAF: over the last week the fraction of CPUs utilized has averaged
around 80%, hitting flat-topped peaks of 90%. It is hard to get above 90%
because of the way the CAF is operated. There is a time delay between
the starting of multiple jobs in a single CAF submission, necessary to
avoid overwhelming the DB with simultaneous requests. At the beginning of
June only 55% of the CPU was being used; usage has since increased to 80% due
to activity for the summer conferences.
- The farms completed reprocessing 40 million events with new alignments and
returned to processing recent data. The farms processed about 20 million
events in the last 3 days, and are 30 million events behind the raw data
taking, which is coming in steadily at about 2.5 million events per day.
At a processing rate of about 7 million events/day, we should be caught up
again in about a week if we do not hit any operational problems.
- Procurement should be out by end of week. Has spoken to Phil Lutz.
- Large production should begin soon.
- No metrics today.
- Bad NIC on D0bbin. Bad power supply on TKA. Could be related to cooling.
- 100% of goals have been delivered.
- Reprocessing 12% of data per week.
- Starting to put together Minos control room on WH12NW.
- 1st draft of plan for underground Near Hall.
- Strange problems with dCache reported. Permission denied message.
- Problems with LSF on linux batch nodes, again Permission denied.
- Busy on performance appraisals.
- At 800 TB in Enstore. Moving data at teens of TB per day.
- Beginning migration of 1340 tapes for CDF.
- Discussion of when migration from A to B for D0 will start.
- Worked with operations on flipping tabs for tape recycling.
- D0's T9940B drives may be ready today.
- Theory out of tapes.
- Some networking downtimes scheduled for Thursday at 6 a.m.
- Met with CDF liaison on CAF farm expansion for networking.
- Installed hub in 12th floor Minos control room.
- CDF B0 conference room work.
- FNALU preparing for 16 node expansion.
- Accomplishment reports.
- Default deny for web servers coming. Web servers that need to be visible
off-site will need an exception. An announcement is due out Wednesday.
- DOE has been doing scans of laboratories since January and will
get around to Fermilab soon. Recommend we scan our critical systems.
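
As a rough check of the farms catch-up estimate reported above (30 million
events behind, processing about 7 million events/day, raw data arriving at
about 2.5 million events/day), the sketch below computes the time needed to
clear the backlog. It assumes both rates stay constant and that no
operational problems occur; the function name and parameters are
illustrative only and are not part of any farm tooling.

```python
def days_to_catch_up(backlog_events, processing_per_day, incoming_per_day):
    """Days until the backlog is cleared, assuming constant rates."""
    # Only the processing capacity left over after keeping up with
    # new raw data goes towards reducing the backlog.
    net_rate = processing_per_day - incoming_per_day
    if net_rate <= 0:
        raise ValueError("backlog never clears: processing does not outpace intake")
    return backlog_events / net_rate


# Figures taken from the farms item in these minutes.
days = days_to_catch_up(
    backlog_events=30e6,
    processing_per_day=7e6,
    incoming_per_day=2.5e6,
)
print(f"{days:.1f} days")  # prints "6.7 days"
```

The net rate is 4.5 million events/day, so the backlog clears in just under
7 days, consistent with the "about a week" quoted by the farms.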
1: CSI - Core Support & Infrastructure
- OpenAFS migration is complete & accomplished with no noticeable
downtime for users!
- FNPRNT hung on Thursday, July 10th. Had to reboot the system.
- Jack worked with Public Affairs on new daily email message to be sent
to all employees starting July 21st.
2: SCS - Scientific Computing Support
- 16 Seagate 80 GB drives were replaced on D0 farm nodes (replaced with
Hitachi 80 GB drives). 16 more will be replaced tomorrow morning. This is
a proactive/systematic replacement of drives due to excessive failures.
6 more of the Seagate drives failed over the weekend.
- CMS will be running production this week on hotdogs. This may
generate some helpdesk tickets.
3: ESS - Electronic Support Services
- ESS discovered a number of cracked (open) solder connections on
Pre-Amp daughter cards (calibration boards). This had the potential to
cause intermittent problems with calibration of the BPM
system. The cracked solder connections likely occurred during the
installation of the pressure-fitted pin sockets by the board assembly
house. BD was notified of our findings and all calibration daughter
cards currently in-house (FCC) have been inspected for this defect.
4: TOC - Technical & Office Computing Support
TOC scanned web systems as suggested by computer security. TOC removed
FrontPage Extensions from 3 of the 4 systems it was on. On the 4th
system, FESSserver, where it is necessary, Ken Fidler made sure it was
John Urish is joining Experiment Support (Minos) August 1. He will
continue his responsibilities with Projects. As part of the transition
from TOC, he will continue his effort towards the two upcoming conferences,
Lepton-Photon and WINS.
5: CSS General
The CDO/APS and CSG 'helpdesk' groups are joining the CSS department. The
helpdesk functions intersect with many business processes in CSS so
there is clearly a natural fit. Rich Thompson's group and Julie Trumbo's
group will work very closely on the support of the Remedy product.
The effective date is meant to be August 1.
- 88% done with Goals. Rest this week.
- Working with visual media on Lepton-Photon posters. Soliciting input
from the departments.
- Simulation report by Panagiotis Spentzouris:
Jim Simone is at Lattice 2003, presenting physics simulation and
preliminary design of metadata format for lattice QCD. Steve Mrenna
at simulation workshop for LHC. Lynn Garren talking about generators
for LHC. Panagiotis and Daniel organizing workshop on GEANT4.
Panagiotis finished report from SCALES workshop: Scientific Case for
Large Scale Simulation, to provide Ray Orbach with input about next
round of funding. Discussion about relationship between SCALES & SCIDAC.
- Setting up to do radiation testing for SNAP.
- People going out to LBL for Run 2B workshop.
- Asked for prioritization on beams profile monitoring system.
- Testing of prototypes for the lattice gauge links is complete.
Planning and Customer Support
- Help Desk
- Problem resolution times of tickets, current open tickets and
degree of automation shown.
- 144 out of 251 employees have goal forms in the division office.
- List of systems in CD presented. Process accounting should be run
on each soon so that we can monitor CPU usage division-wide.
- Converting to CVS. Working on getting scripts from Xenomedia working.
- Looking for pictures from departments for CD home page.
- Mike Stoltz and computer operators going to CCF August 1.
- Requisitions going through the system.
- Looking at facilities outside of FCC for computers for future years.
- Tours of FCC for Lepton-Photon coming up. Department representatives needed.