Quarterly Report for the Distributed Monitoring Framework (DMF) project, July 2002

Progress: April 2002 to July 2002

The focus of the DMF project for FY02 was to lead efforts in the HEP and GGF communities to define requirements for a Grid monitoring system, to finish a prototype GMA implementation, to improve the performance and fault tolerance of NetLogger, and to design and implement a prototype monitoring event archive. More details on each of these topics follow.

Brian Tierney continued co-leading the "Glue Schema Work Group", which is tasked with defining common schemas for interoperability between the EU physics Grid projects (focusing on EDG and DataTag) and the US physics Grid projects (focusing on PPDG, GriPhyN and iVDGL). The web page for this project is http://www.hicb.org/glue/glue-schema/schema.htm. This work is part of the Grid Laboratory Uniform Environment (GLUE) Phase I task (http://www.hicb.org/glue/GLUE-v0.04.doc).

The Glue Schema Group has now completed the schema for the Compute Element (CE) and has started work on the Storage Element (SE), to be followed by the Network Element (NE). The Globus MDS and R-GMA Grid Information Services will use these schemas. The goal is to have common schemas defined, deployed, and tested in time for the EU DataGrid Testbed 2 release in September 2002. Common schemas for monitoring and notification events are being addressed by the Global Grid Forum DAMED working group, and will be addressed later by iVDGL and DataTag. The GLUE work will likely be moved into the GGF to get feedback from a wider community. For more information, see the PPDG quarterly report.

We continued implementing a prototype event archive for monitoring data, based on an open source relational database (MySQL). This is described in a paper submitted to and accepted for the IEEE Supercomputing conference (see http://www-didc.lbl.gov/papers/Monitoring-archive-SC02-extended-abstract.pdf). The archive also supports the GMA consumer and producer interfaces. We also continued development of a web-based interface to the event archive.

We have now released "NetLogger2", which includes many improvements to NetLogger. Details are in the last quarterly report, and are also described in a paper accepted to the 2002 High Performance Distributed Computing Conference (see http://www-didc.lbl.gov/papers/HPDC02-HP-monitoring.pdf).

We are continuing to work on adding a "trigger interface" to NetLogger, and have written a GMA-based "activation service" to set the trigger. This works as follows: a consumer sends a request to an activation service to start monitoring a particular type of event. The activation service creates an entry in a "trigger file". The application, via NetLogger library calls, periodically checks for updated trigger files and starts logging the specified events. The activation service buffers, filters, and forwards the requested event data back to the consumer. We are working with the Atlas software group to add this to their Athena framework.
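The trigger mechanism described above can be illustrated with a minimal Python sketch. This is not the NetLogger or GMA API: the class names (ActivationService, TriggeredLogger), the JSON trigger-file format, and the use of an in-process callback in place of a network connection are all assumptions made for illustration, and buffering is omitted.

```python
import json
import os


def read_trigger(path):
    """Return the set of event types listed in the trigger file (empty if absent)."""
    try:
        with open(path) as f:
            return set(json.load(f))
    except FileNotFoundError:
        return set()


class ActivationService:
    """Hypothetical activation service: records requests in a trigger file
    and forwards matching events back to the requesting consumers."""

    def __init__(self, trigger_path):
        self.trigger_path = trigger_path
        self.consumers = {}  # event type -> list of consumer callbacks

    def request_monitoring(self, event_type, consumer):
        """A consumer asks for a given event type to be monitored."""
        self.consumers.setdefault(event_type, []).append(consumer)
        # Write to a temporary file and rename it, so the application
        # never reads a partially written trigger file.
        tmp = self.trigger_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(self.consumers), f)
        os.replace(tmp, self.trigger_path)

    def receive(self, event):
        """Filter by event type and forward to the interested consumers."""
        for consumer in self.consumers.get(event["event"], []):
            consumer(event)


class TriggeredLogger:
    """Hypothetical application-side logger that only emits triggered events."""

    def __init__(self, trigger_path, sink):
        self.trigger_path = trigger_path
        self.sink = sink            # where enabled events are sent
        self.enabled = set()        # event types currently being logged
        self._signature = None      # (mtime, size) of the last file seen

    def check_trigger(self):
        """Reload the trigger file if it changed; return True if reloaded."""
        try:
            st = os.stat(self.trigger_path)
        except FileNotFoundError:
            return False
        signature = (st.st_mtime_ns, st.st_size)
        if signature == self._signature:
            return False
        self._signature = signature
        self.enabled = read_trigger(self.trigger_path)
        return True

    def write(self, event_type, **fields):
        """Log the event only if its type has been triggered."""
        if event_type not in self.enabled:
            return False
        self.sink({"event": event_type, **fields})
        return True
```

In this sketch the application would call check_trigger() periodically (for example, once every N calls to write()), so that untriggered events cost only a set lookup.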

We were invited to give numerous talks and tutorials on various aspects of our work. These included several NetLogger tutorials, and talks at a Python conference, the Internet2 End-to-End Monitoring Workshop, and the ESnet ESCC meeting.

We continue to be extremely active in the Global Grid Forum. Dan Gunter is co-chair of the GMA Working Group, and Brian Tierney is co-chair of the Network Measurements Working Group. Both of these working groups have new documents that will be discussed at the July GGF. We are also involved with several other groups, including the Event Schema, Remote IO, Information Services, Network Research, Architecture, and Event Notification groups.

We continue to collaborate with several groups, including NLANR, EU DataGrid, Globus, and the IEPM project at SLAC, on the possible use of NetLogger to collect monitoring data for their projects.

We have also been working closely with the Globus project to define instrumentation and monitoring services for Globus and the new "Open Grid Services Architecture" (OGSA), based on NetLogger.

We worked with several groups to help add NetLogger instrumentation to their software. This quarter, these included Globus (ANL) and the Atlas Athena software.