This page last changed on Nov 23, 2010 by kskaburs.

WMS Probe

WMS-probe.

Tests the WMS service by submitting jobs to predefined CEs.

WMS Metrics

Name                   Description
org.sam.WMS-JobState   Submits a grid job to the CE(s) via the WMS under test. Accepts passive check updates from org.sam.WMS-JobMonit.
org.sam.WMS-JobMonit   Monitors submitted grid jobs.
org.sam.WMS-JobSubmit  Passive check. Holds the terminal status of the job submission.

org.sam.WMS-JobMonit

Monitors submitted grid jobs. By default, Nagios invokes it every 5 minutes. The implementation is threaded: one thread per monitored resource, up to a maximum of 10 threads.

While a job is not in a terminal state, the metric passively updates org.sam.WMS-JobState with the latest state of the job according to the WMS. When the job enters a terminal state or is canceled, the metric updates both org.sam.WMS-JobState and org.sam.WMS-JobSubmit with the final job status. Both metrics are updated (as passive checks) either via the Nagios command file or via NSCA. org.sam.WMS-JobSubmit is the metric that goes to the Metric Store Database.
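The update rule described above can be sketched in Python as follows. This is a minimal illustration, not the probe's actual code: the function names, the set of terminal states, and the mapping from job state to Nagios return code are assumptions made for the example. Only the PROCESS_SERVICE_CHECK_RESULT line layout is the standard Nagios external command syntax for a passive service check.

```python
import time

# Assumed set of terminal WMS job states (illustrative, not taken from the probe).
TERMINAL_STATES = {"Done", "Aborted", "Cancelled", "Cleared"}

# Nagios return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def metrics_to_update(job_state):
    """Return the metrics that receive a passive update for this job state."""
    if job_state in TERMINAL_STATES:
        # The final status goes to both metrics.
        return ["org.sam.WMS-JobState", "org.sam.WMS-JobSubmit"]
    # Job still in progress: only org.sam.WMS-JobState is refreshed.
    return ["org.sam.WMS-JobState"]


def passive_check_line(host, metric, return_code, output, timestamp=None):
    """Format one passive service check result for the Nagios command file."""
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        timestamp, host, metric, return_code, output)


def passive_updates(host, job_state, timestamp=None):
    """Build all passive check lines for one monitored job."""
    # Hypothetical mapping: an in-progress or successfully finished job is OK,
    # any other terminal state is CRITICAL.
    in_progress = job_state not in TERMINAL_STATES
    code = OK if in_progress or job_state == "Done" else CRITICAL
    return [passive_check_line(host, metric, code, "Job state: " + job_state, timestamp)
            for metric in metrics_to_update(job_state)]
```

For a job reported as Running, passive_updates returns a single line for org.sam.WMS-JobState; for an Aborted job it returns one line for each of the two metrics. The host name passed in would be the WMS under test.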

Job Submission

The JDL template for job submission is /usr/libexec/grid-monitoring/probes/org.sam/wnjob/org.sam.gridJob.WMS.jdl.template:

Type="Job";
JobType="Normal";
Executable = "<jdlExecutable>";
StdError = "gridjob.out";
StdOutput = "gridjob.out";
OutputSandbox = {"gridjob.out"};
RetryCount = <jdlRetryCount>;
ShallowRetryCount = <jdlShallowRetryCount>;
Requirements = <jdlReqCEInfoHostName>;
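
As a rough illustration of how the placeholders in this template might be filled in, the following Python sketch replaces each <jdl...> token with a supplied value. The function name fill_jdl_template and all sample values are hypothetical; the probe's actual substitution code is not shown on this page.

```python
import re

# Shortened copy of the template above, for illustration only.
TEMPLATE = (
    'Executable = "<jdlExecutable>";\n'
    'RetryCount = <jdlRetryCount>;\n'
    'Requirements = <jdlReqCEInfoHostName>;\n'
)


def fill_jdl_template(template, values):
    """Replace every <jdlName> placeholder with values["jdlName"]."""
    def substitute(match):
        name = match.group(1)
        # Placeholders without a supplied value are left untouched.
        return str(values.get(name, match.group(0)))
    return re.sub(r"<(jdl\w+)>", substitute, template)


# Sample values are made up for the example.
jdl = fill_jdl_template(TEMPLATE, {
    "jdlExecutable": "gridjob.sh",
    "jdlRetryCount": 0,
    "jdlReqCEInfoHostName": 'other.GlueCEInfoHostName == "ce106.cern.ch"',
})
print(jdl)
```

Doing the substitution in a single regular-expression pass means a substituted value is never scanned again for further placeholders.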

GoodCEs

  • jdlReqCEInfoHostName - substituted in the JDL with the CEs given to org.sam.WMS-JobState via the option
    --ces-file <file>     File with list of CEs. Two schemes [file:] or http:
    

The default is /var/lib/gridprobes/<VO or FQAN>/GoodCEs. All CEs from the file are OR'ed in the resulting Requirements ClassAd expression. For example:

Requirements = (other.GlueCEInfoHostName == "ce106.cern.ch") || (other.GlueCEInfoHostName == "creamce.gina.sara.nl")
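
A minimal Python sketch of how such an expression can be assembled from a list of CEs is given below. The helper names are hypothetical, and the file layout (one CE host name per line, with blank lines and lines starting with # ignored) is an assumption; this page does not specify the format of the GoodCEs file.

```python
def parse_ce_list(text):
    """Extract CE host names, assuming one name per line."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        # Assumed convention: skip blank lines and '#' comments.
        if line and not line.startswith("#"):
            hosts.append(line)
    return hosts


def build_requirements(ce_hosts):
    """OR together one GlueCEInfoHostName clause per CE."""
    clauses = ['(other.GlueCEInfoHostName == "%s")' % host for host in ce_hosts]
    return " || ".join(clauses)


good_ces = parse_ce_list("ce106.cern.ch\ncreamce.gina.sara.nl\n")
print("Requirements = " + build_requirements(good_ces))
```

With the two CEs used here, the printed line is identical to the example Requirements expression above.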

By default, if the org.sam.WMS-JobState metric is defined in a profile, NCG creates a corresponding hr.srce.GoodCEs metric that populates the /var/lib/gridprobes/<VO or FQAN>/GoodCEs file.

Troubleshooting

Check CE troubleshooting.
