ICALEPCS 2009
TUP001
Monitoring the LHCb Experiment Computing Infrastructure with NAGIOS
E. Bonaccorsi*, N. Neufeld (CERN)
LHCb has a large and complex infrastructure consisting of thousands of servers and embedded computers, hundreds of network devices, and many shared infrastructure services such as shared storage, login and time services, and databases. All operationally critical aspects are integrated into the standard Experiment Control System based on PVSSII. This enables non-expert operators to carry out first-line reactions. At the lower level, and in particular for monitoring the infrastructure on which the Control System itself depends, a secondary infrastructure based on the industry-standard NAGIOS has been put in place. We present the design and implementation of the fabric management based on NAGIOS. Care has been taken to complement rather than duplicate functionality available in the Experiment Control System.