I changed the gram_acct.py parser to extract a value that we can map to the “real” GRAM job identifier.
For the example lines:
JMA 2008/11/13 17:18:10 GATEKEEPER_JM_ID 2008-11-13.17:18:09.0000029844.0000000000 for /DC=org/DC=doegrids/OU=People/CN=James Matykiewicz 334896 on 64.240.154.5
JMA 2008/11/13 17:18:10 GATEKEEPER_JM_ID 2008-11-13.17:18:09.0000029844.0000000000 mapped to jmatykie (44581, 4002)
JMA 2008/11/13 17:18:10 GATEKEEPER_JM_ID 2008-11-13.17:18:09.0000029844.0000000000 has GRAM_SCRIPT_JOB_ID 9963417|/u/jmatykie/.globus/job/pdsfgrid4.nersc.gov/29889.1226625489/stdout|/u/jmatykie/.globus/job/pdsfgrid4.nersc.gov/29889.1226625489/stderr manager type sge
JMA 2008/11/13 17:24:55 GATEKEEPER_JM_ID 2008-11-13.17:18:09.0000029844.0000000000 JM exiting
the parser produces the following NetLogger event:
ts=2008-11-14T01:18:10.000000Z event=globus.acct.job level=Info DN="/DC=org/DC=doegrids/OU=People/CN=James Matykiewicz 334896" jm.id=0000029844.0000000000 group.id=2 host=64.240.154.5 user=jmatykie gram.id=29889.1226625489 user.id=44581 sched.id=9963417 sched.type=sge
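For poking at events like this one outside the database, here is a minimal sketch that splits a name=value line into a dictionary. It is not the NetLogger parser itself; `parse_bp_line` is a hypothetical helper, and it assumes that values containing spaces are always wrapped in double quotes, as the DN is above. The sample line is a shortened copy of the event.

```python
import shlex

def parse_bp_line(line):
    """Split a name=value log line into a dict (hypothetical helper).

    shlex.split keeps a double-quoted value such as the DN together
    and strips the quotes.
    """
    fields = {}
    for token in shlex.split(line):
        # Split on the first "=" only, since the DN value contains "=" too.
        name, _, value = token.partition("=")
        fields[name] = value
    return fields

line = ('ts=2008-11-14T01:18:10.000000Z event=globus.acct.job level=Info '
        'DN="/DC=org/DC=doegrids/OU=People/CN=James Matykiewicz 334896" '
        'jm.id=0000029844.0000000000 user=jmatykie gram.id=29889.1226625489 '
        'sched.id=9963417 sched.type=sge')
event = parse_bp_line(line)
print(event["gram.id"], event["sched.id"])
```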
Note that the “gram.id” was extracted from the stdout file path! It does seem to map, though, to the URL that the TechX broker gets back, e.g.:
https://pdsfgrid4.nersc.gov:56393/29889/1226625489/
where in this example, the “gram.id” would be “29889.1226625489”.
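To make the mapping concrete, here is a rough sketch of deriving the same id from either source. The function names and the regular expression are my own assumptions based only on the single example above (a stdout path of the form .../.globus/job/HOST/NUMBER.NUMBER/stdout and a URL whose path holds the same two numbers); this is not the code in gram_acct.py.

```python
import re
from urllib.parse import urlparse

# Assumed layout, inferred from the one example line:
#   .../.globus/job/<host>/<number>.<number>/stdout
_STDOUT_RE = re.compile(r"/\.globus/job/[^/]+/(\d+\.\d+)/stdout$")

def gram_id_from_stdout(path):
    """Return the id embedded in a stdout path, or None if it does not match."""
    match = _STDOUT_RE.search(path)
    return match.group(1) if match else None

def gram_id_from_url(url):
    """Return the id from a job URL whose path is /<number>/<number>/, or None."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) == 2 and all(p.isdigit() for p in parts):
        return ".".join(parts)
    return None

stdout_path = "/u/jmatykie/.globus/job/pdsfgrid4.nersc.gov/29889.1226625489/stdout"
job_url = "https://pdsfgrid4.nersc.gov:56393/29889/1226625489/"
print(gram_id_from_stdout(stdout_path) == gram_id_from_url(job_url))
```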
To find the SGE id for a given gram id in the database, look up the desired gram id in the “ident” table, then join the table with itself on e_id (the event identifier) to find the associated sched id:
select sched_id.value from ident gram_id join ident sched_id on gram_id.e_id = sched_id.e_id where gram_id.name = 'gram' and sched_id.name = 'sched' and gram_id.value = '29889.1226625489';
(Note: not tested, but should be very close if not perfect)
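One way to try out the self-join without touching the real database is an in-memory SQLite table. The sketch below assumes a simplified ident table with only the three columns the query uses (e_id, name, value); the actual NetLogger schema may well have more columns or different types, and the second event's values are made up. `sched_ids_for_gram_id` is a hypothetical helper.

```python
import sqlite3

# Simplified stand-in for the "ident" table (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ident (e_id INTEGER, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO ident VALUES (?, ?, ?)",
    [
        (1, "gram", "29889.1226625489"),   # values from the example event
        (1, "sched", "9963417"),
        (2, "gram", "11111.2222222222"),   # made-up second event
        (2, "sched", "42"),
    ],
)

QUERY = """
    SELECT sched_id.value
    FROM ident gram_id
    JOIN ident sched_id ON gram_id.e_id = sched_id.e_id
    WHERE gram_id.name = 'gram'
      AND sched_id.name = 'sched'
      AND gram_id.value = ?
"""

def sched_ids_for_gram_id(conn, gram_id):
    """Return the scheduler ids recorded on the same event as gram_id."""
    return [row[0] for row in conn.execute(QUERY, (gram_id,))]

print(sched_ids_for_gram_id(conn, "29889.1226625489"))
```

Passing the gram id as a bound parameter rather than pasting it into the SQL string avoids quoting problems if the lookup is ever driven by user input.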
Here’s how I just parsed and loaded some gram logs on osp:
$ cd ~/netlogger-trunk
$ svn up
$ cd python
$ source ./dev-setup.sh
$ cd /opt/osg/var/2008.11
$ nl_parser -m gram_acct -e "\S+ \S+ \S+ " pdsfgrid4.nersc.gov.vdt.log > /tmp/ksb_gram.bp
$ nl_loader -u mysql://localhost -p user=me -p passwd=mysecret -i /tmp/ksb_gram.bp -p db=mydb -v -v
$ mysql -e "select count(*), name from mydb.event group by name;"
+----------+-----------------+
| count(*) | name            |
+----------+-----------------+
|      734 | globus.acct.job |
|       10 | sge.job         |
|       54 | sge.rpt.jl      |
+----------+-----------------+
734 new events.
ksb
November 11th, 2008 in CEDPS
A while back, Brian and I listed a bunch of queries that we’d like to be able to answer with the collected system log data, by querying the NetLogger database. I’ll repeat them here, then use the comments to follow up on experiences implementing them.
From a GOC admin:
- find log messages for jobs from VO=Atlas running at site=FNAL
- find log messages related to service=condor, user=Joe, site=Indiana
- find log messages for user=Joe
- find log messages with status=error
- find log messages where event=*authn* with status=error
- find log messages where the time between start/end events is more than 3X the baseline
- find log messages with start events with no matching end event
From a user (i.e., all of these relate to logs for the user's DN):
- find log messages for all my jobs
- find log messages with status=error
- find log messages where event=*authn* with status=error
- find log messages where time intervals are more than 3X the baseline, where the baseline is computed from historical data in the log database
From a VO:
- what sites had connection attempts for a given user DN
- what data files were accessed most
- which user moved the largest amount of data in my VO
- find all logs where job manager status=killed (i.e., jobs that were killed for running too long)
- which user submitted the most jobs (Gratia is a better choice for this, but maybe it should be supported?)
From a site admin:
- what was the average GridFTP transfer speed on server=gridftp.lbl.gov
- what are the top 10 fastest/slowest sites receiving GridFTP transfers from my site
- what is the distribution of job run times on CE=myComputeElement
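As a starting point for two of the queries above (start events with no matching end event, and intervals more than 3X the baseline), here is a rough in-memory sketch. Everything about the event layout is an assumption made for illustration: events are plain dicts, names end in ".start" or ".end", a "guid" field ties a pair together, and "ts" is a timestamp in seconds. The baseline here is simply the median of the intervals seen in the same batch, standing in for one computed from historical data in the log database.

```python
from statistics import median

def pair_events(events):
    """Return (durations, unmatched_starts), keyed by (base event name, guid)."""
    starts, durations = {}, {}
    for ev in events:
        base, _, phase = ev["event"].rpartition(".")
        key = (base, ev["guid"])
        if phase == "start":
            starts[key] = ev["ts"]
        elif phase == "end" and key in starts:
            durations[key] = ev["ts"] - starts.pop(key)
    # Whatever is left in starts never saw a matching end event.
    return durations, sorted(starts)

def slow_intervals(durations, factor=3.0):
    """Return keys whose duration exceeds factor times the median duration."""
    if not durations:
        return []
    baseline = median(durations.values())
    return sorted(k for k, d in durations.items() if d > factor * baseline)

# Made-up sample events.
events = [
    {"event": "job.start", "guid": "a", "ts": 0.0},
    {"event": "job.end", "guid": "a", "ts": 10.0},
    {"event": "job.start", "guid": "b", "ts": 5.0},
    {"event": "job.end", "guid": "b", "ts": 17.0},
    {"event": "job.start", "guid": "c", "ts": 6.0},
    {"event": "job.end", "guid": "c", "ts": 66.0},
    {"event": "job.start", "guid": "d", "ts": 7.0},
]
durations, unmatched = pair_events(events)
print(unmatched, slow_intervals(durations))
```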
November 8th, 2008 in CEDPS
This is a blog for NetLogger.
November 7th, 2008 in Uncategorized