ES&H 496 days w/o lost workday
(ballpark calc: 500 days * 200 staff * 10 hrs/day = 1,000,000 hours)
CDF working on power info request
Many workers on online rate problem to final storage. Not clear
who has lead. Expect it somewhere in CDF.
CMS working procurement plans. Power estimates w/in 2 weeks.
D0 Last week farm yield was lower than expected. Why? Investigating
luminosity ? p14 going in and should have better
performance for high lum events.
50% utilization of CAB with significant med queue load. Suspect
it is spot MC use. Likely to be conference driven and
to ramp up soon.
Expect large use of pick events due to various problems (code,
data, ...) Will stress system, particularly the db
servers. Expect stress to be driven by lepton/photon.
EAG Windows Server 2003 install went well. Database guy here this week
Working plans to move file servers to FCC.
ES Minos instructed on how to use dCache. May see burst of activity.
Minos specific FNALU batch instructions also newly made. May see
increase of use.
CCF 17TB tape movement day sets new record
10TB/day sustained rate in dCache pilot also good news
Large SAM volume seen in CDF shows that it is ramping up (2TB/day).
At least one 5-minute interval of 300+ Mbps offsite network usage
(50% of link ?). Raises question of steady to peak specs.
Working updates to CDF drives. No new issues.
D0 fiber to desktop plans being developed.
Most material for CDF conference rooms has arrived. Beginning
Blocked on miniBoone WH control room install pending current conf.
room gear removal.
NetBIOS attacks have ramped up as expected. 2 compromises of
unblocked machines. Followup with inspections of the rest.
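
For the steady-to-peak question, it can help to put the daily volumes above
into the same units as the network peak. The sketch below converts TB/day to
an average rate in Mbps. It assumes decimal terabytes (10^12 bytes) and a
24-hour day. The function names are illustrative only, and the implied link
size simply takes the "50% of link ?" note at face value.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mbps(tb_per_day: float) -> float:
    """Average rate in Mbps for a daily volume in decimal terabytes."""
    bits_per_day = tb_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


def implied_link_mbps(peak_mbps: float, fraction_of_link: float) -> float:
    """Link capacity implied by a peak rate and the fraction of link it used."""
    return peak_mbps / fraction_of_link


if __name__ == "__main__":
    # Daily volumes quoted in the CCF notes.
    for label, tb in [("tape movement record", 17),
                      ("dCache pilot sustained", 10),
                      ("CDF SAM volume", 2)]:
        print(f"{label}: {tb} TB/day ~ {tb_per_day_to_mbps(tb):.0f} Mbps average")
    # Assumption: the 300 Mbps peak really was 50% of the offsite link.
    print(f"implied offsite link: {implied_link_mbps(300, 0.5):.0f} Mbps")
```

Note these are averages over a day. The 300+ Mbps figure is a 5-minute peak,
so the two are not directly comparable without the underlying rate history.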
CSS OPS Report 2003-05-19
PC Managers meeting - Wednesday, May 28th, 9-10am, FCC2A/B
Operations meeting with SGI - Tuesday, May 20, 9:30am, FCC2A
1: ES&H - Environmental, Safety & Health
1.1 APC power strips
Bruce Karrels is inspecting and repairing (as necessary) the APC power
strips used in some of the Atipa Farms racks. He has found two line cord
connection assemblies that have overheated (d0cs017-176).
2: ELS - Equipment Logistics Services
2.1 SGI O2000 hw/sw maintenance contract renewal
The requisitions for the renewal are in the MISER. With the
decommissioning of part of d0mino (128cpus & 1 meta router), the
division has reduced the maintenance charges ~ $70k off what was
3: ESS - Electronics Support Services
3.1 Beams - Beam Position Monitor
209 Pre-Amps are modified, excluding installation of daughter cards
(calibration board). 66 additional pre-amps will need to be modified to
meet the total quantity required of 275. 50 EchoTek boards received
today for initial testing by ESS.
4: DSG - Database Systems Group
4.1 CDF oracle machines
N. Stanfield - cdfrep01 (fcdflnx1) was upgraded to 9.2.0.3 per the last
security alert. cdfonprd (b0dau36) is scheduled for upgrade on Tuesday,
May 20. This is the last database that needs the patch. The OEM (Oracle
Enterprise Manager) PCs were also upgraded to 9203.
N. Stanfield rebuilt a new composite index on cdfonprd (b0dau35) that
interrupted Icicle but it solved their query problem.
4.2 MMS interface
S. Jones fixed a bug in the mmsInterface.pl code that was sticking
carriage returns and spaces into requisition lines. This prevented a
purchase order from being created on the mms side.
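
As an illustration of this class of bug only: the actual change to
mmsInterface.pl is not described here, and that script is Perl. The Python
sketch below shows one way stray carriage returns and padding spaces could
be stripped from a requisition line before it is handed to another system.
The function name and the sample line are hypothetical.

```python
def clean_requisition_line(line: str) -> str:
    """Drop carriage returns and newlines, then trim surrounding spaces.

    Hypothetical helper; not the actual mmsInterface.pl fix.
    """
    return line.replace("\r", "").replace("\n", "").strip()


if __name__ == "__main__":
    raw = "  REQ-001 widget\r\n "  # made-up requisition line
    print(repr(clean_requisition_line(raw)))
```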
5: SCS - Scientific Computing Support
5.1 D0 Farms
SCS is seeing an increase of disk errors on new D0 farms after upgrade.
They think this is because they changed the partitions on data disk and are
using areas not previously used. They are actively scanning these areas
for bad block errors, but marginal ones are not being detected by the
diagnostic activity. They are investigating better methods of testing
5.2 SUN Technology Meeting
Sun has asked to come out and update us on new technology. They
want to present a roadmap and also talk about throughput computing. The
presentation will be held in FCC2B on Wednesday, May 21st at 1:30.
Mail Stan for details or questions.
6: CSI - Core Server and Infrastructure
Info4 and info1 had problems replicating newsgroups Monday and Tuesday.
Rebooting BOTH servers fixed the problem.
6.2 Mail Lists
Robert Kennedy attempted to post the same message to many CDF lists. His
message was marked as SPAM by Listserv and he was added to the LISTSERV
internal restricted sender group. A workaround was used so Robert could
7: TOC - Technical and Office Computing Support
7.1 Computer Security
TOC checked their exempted servers for files that would have confirmed a
similar break-in to the Beams Div server. No files found and Computer
TOC was also advised of unusual offsite traffic to CSDDEV over last
weekend. The system had been hacked. The system was taken offline and
Computer Security advised. The system was rebuilt from scratch and all
SP and patches applied and now will be put into the Fermi W2K domain.
The CSDserver1 upgrade to W2K is scheduled for 5-31.
Did we get SGI support to resolve D0 fileserver trouble?
D0 happy. Problem has not recurred since last maint.
Has D0 L3 been alerted to the power dist. problems seen in
D0 offline farm? Could have same trouble. Amber will follow up.
Lots of dbserver activity and going well.
Looking for pilot users for disk (?) side caching.
JIM deployment started last week and tests between
ClueD0 and CAB are underway.
Failure mode of farms power distribution understood.
Effort reports due next Tuesday.
Photos solicited for new CD homepage
Zeno media working on rollover problem.
Wed 1pm project status meeting will focus on new webpages.
Crating of CDF ADIC underway and ahead of schedule.
UPS upgrade request is in to DOE. Broken into 2 years.
Power upgrade in FY03 request.
Phase 2 capacity planning report given last Friday. Will
make available to interested CD parties. Ask Gerry.
The official number on the Lab Safety Committee is 660,701 hours; PPD is
leading with 1,082,344. Looks like we've got another 250 or so days to go
to break 1,000,000, by their calculation.
-- Mark K.
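
A quick cross-check of the figures in the note above, as a sketch. It uses
only numbers quoted in these notes: the 660,701 official hours, the "250 or
so days" estimate, and the ballpark of 200 staff at 10 hrs/day. The function
names are illustrative, and the ballpark staffing figures are rough
assumptions, not official counts.

```python
TARGET_HOURS = 1_000_000


def hours_remaining(official_hours: int, target: int = TARGET_HOURS) -> int:
    """Hours still needed to reach the target."""
    return target - official_hours


def implied_hours_per_day(official_hours: int, days_to_go: float) -> float:
    """Daily accrual rate implied by a quoted days-to-go estimate."""
    return hours_remaining(official_hours) / days_to_go


def days_at_rate(official_hours: int, staff: int, hours_per_day: float) -> float:
    """Days to reach the target at a flat staff * hours/day accrual rate."""
    return hours_remaining(official_hours) / (staff * hours_per_day)


if __name__ == "__main__":
    official = 660_701
    print(f"remaining: {hours_remaining(official):,} hours")
    print(f"rate implied by 250 days: {implied_hours_per_day(official, 250):.0f} hours/day")
    # Ballpark assumption from the ES&H item: 200 staff * 10 hrs/day.
    print(f"days at ballpark rate: {days_at_rate(official, 200, 10):.0f}")
```

The committee's estimate implies a lower daily accrual than the ballpark
figure, which is why the two give different day counts.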