This page last changed on Jul 05, 2011 by wlapka.

Release: Update-11

Summary

Start Date: 11 Apr 2011
End Date: 17 May 2011
Status: Released
Release Date: 22 June 2011
Release Manager: Wojciech Lapka

Validation Steps performed

  • Outlined in SAM-1491
  • Announced for Staged Rollout - 17 May 2011 (Update 11), 14 June 2011 (Update 11.1)
  • Cleared staged rollout - 22 June 2011

List of new metrics

CE:

CREAMCE:

LB:

MyProxy:

Top-BDII & Site-BDII:

WMS:

Nagios:

GRAM5:

globus-GRIDFTP:

globus-GSISSHD:

- internal checks which are not propagated to MyEGI
- check will only appear if Nagios is monitoring WMS services

List of packages updated in this release

Node egee-NAGIOS

egee-NAGIOS-1.0.0-61.el5
atp-1.16.10-2.el5
atp-web-1.16.10-2.el5
glite-yaim-nagios-1.4.3-3.el5
grid-monitoring-config-gen-0.79.1-1.el5
grid-monitoring-probes-cadist-0.3.0-1.el5
grid-monitoring-probes-org.sam-0.1.20-2.el5
grid-monitoring-probes-org.sam.sec-0.3.1-1.el5
myegi-0.5.5-1.el5
nagios2metricstore-1.0.34-2.el5
nagios-gocdb-downtime-0.24.1-1.el5
sam-release-1.11.0-1.el5

Node egee-NAGIOS-WEB

egee-NAGIOS-WEB-0.9.0-5.el5
ace-0.1.6-2.el5
atp-1.16.10-2.el5
atp-web-1.16.10-2.el5
glite-yaim-nagios-1.4.3-3.el5
myegi-0.5.5-1.el5
nagios2metricstore-1.0.34-2.el5
poem-0.3-1.el5
sam-release-1.11.0-1.el5

Release Notes

  • /etc/atp/atp_synchro.conf is created by YAIM.
  • Added tests for Globus5 services:
    • GRAM5
    • globus-GRIDFTP
    • globus-GSISSHD
  • Support for uncertified sites running CREAM-CE
  • Removed metric org.ggus.Tickets
  • MyEGI web services - bug fixes
  • ACE - recomputation of availabilities when data is received with a delay
  • ACE/ATP/MRS - bug fixes
  • Change of help URLs for "org.sam.*" probes

Configuration changes

  • New YAIM configuration variables
    # switch host checks off/on (see SAM-1173) (optional variable)
    NCG_CHECK_HOSTS=1
    
    # change GOCDB root URL (see SAM-1419) (optional variable)
    GOCDB_ROOT_URL=https://goc.egi.eu/gocdbpi/
    
    # if LDAP topology is used, control adding hosts (see SAM-1470) (optional variable)
    NCG_LDAP_ADD_HOSTS=1
    
    # switch off importing admin DNs (see SAM-1434) (optional variable)
    NCG_CONTACTS_USE_GOCDB=false
  • Removed YAIM variable
    ROC_NAME=...
    # use NCG_GOCDB_ROC_NAME instead
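
One possible way to carry an existing ROC_NAME setting over to NCG_GOCDB_ROC_NAME is a one-off rename in the YAIM configuration file. The sketch below assumes the variable sits at the start of a line and that the old value is still the right one for the new variable; rename_roc_name is a hypothetical helper, not something shipped with this release.

```shell
#!/bin/sh
# Sketch: rename the removed ROC_NAME variable to NCG_GOCDB_ROC_NAME.
# rename_roc_name is a hypothetical helper; pass the path of your own
# YAIM configuration file as the first argument.
rename_roc_name() {
    conf="$1"
    # Keep a backup copy next to the original before editing it.
    cp -p "$conf" "$conf.bak" || return 1
    # Only lines that start with ROC_NAME= are rewritten; the value is kept.
    sed 's/^ROC_NAME=/NCG_GOCDB_ROC_NAME=/' "$conf.bak" > "$conf"
}
```

Called as rename_roc_name followed by the path of the site configuration file, this leaves a .bak copy behind so the change can be reviewed before YAIM is run again.
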

Note that after the upgrade and before running YAIM, the glite-info-service-nagios.conf.rpmnew configuration file should replace the existing one, i.e.:

mv /opt/glite/etc/glite-info-service-nagios.conf.rpmnew /opt/glite/etc/glite-info-service-nagios.conf

If the .rpmnew file does not exist, make sure that /opt/glite/etc/glite-info-service-nagios.conf contains the following line:

get_data = echo -e "Role=${NAGIOS_ROLE}\nMsgNagiosDestination="$(. /etc/sysconfig/msg-to-queue && echo $MSG_TO_QUEUE_DESTINATION)"\nVersion="$(cat /etc/sam-release)
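
The two cases described here (an .rpmnew file is present, or it is not) can be sketched as a small shell check. This is only an illustration: adopt_rpmnew is a hypothetical helper, and the grep tests only for the beginning of the get_data line, so the full line should still be compared by hand.

```shell
#!/bin/sh
# Sketch of the post-upgrade check; run after the RPM upgrade and before YAIM.
# adopt_rpmnew is a hypothetical helper name.
adopt_rpmnew() {
    conf="$1"
    if [ -f "$conf.rpmnew" ]; then
        # The package shipped a new default: let it replace the old file.
        mv "$conf.rpmnew" "$conf"
    elif grep -q '^get_data = echo -e "Role=' "$conf"; then
        # No .rpmnew, but a get_data line of the expected shape is present.
        echo "$conf already has a get_data line; compare it with the expected one"
    else
        echo "get_data line missing from $conf; add it by hand" >&2
        return 1
    fi
}
```

For this release the argument would be /opt/glite/etc/glite-info-service-nagios.conf.
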

Globus services currently do not support VOs. To monitor Globus services, the SAM administrator has to contact all sites and ask them to add the certificate DN to the grid-mapfile.
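
For site administrators who receive such a request, a grid-mapfile entry is a quoted certificate DN followed by a local account name. The sketch below appends such an entry only if it is not already present. The helper name add_gridmap_entry is hypothetical, and the DN and account are supplied by the caller; none of these values come from this release.

```shell
#!/bin/sh
# Sketch: add a '"<DN>" <account>' mapping to a grid-mapfile, once.
# add_gridmap_entry is a hypothetical helper; DN and account are placeholders
# chosen by the caller.
add_gridmap_entry() {
    gridmap="$1"; dn="$2"; account="$3"
    entry="\"$dn\" $account"
    # -F: fixed string, -x: whole line; skip if the mapping is already there.
    grep -Fxq -- "$entry" "$gridmap" 2>/dev/null ||
        printf '%s\n' "$entry" >> "$gridmap"
}
```

The function takes the grid-mapfile path, the DN, and the local account as its three arguments, and running it twice with the same values leaves a single entry.
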

Known Issues

For machines running the latest version of glite-UI (3.2.10-1):

Please restart Nagios after running YAIM. Otherwise you may see problems similar to SAM-1693.

service nagios restart

In new installations, please add the following lines to /etc/my.cnf and restart MySQL:
[mysqld]
event-scheduler=1
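
The my.cnf change can also be scripted. The following is a minimal sketch, assuming the file already exists; enable_event_scheduler is a hypothetical helper that adds the option below an existing [mysqld] header, or appends a new section if there is none, and does nothing when the option is already set.

```shell
#!/bin/sh
# Sketch: make sure the [mysqld] section of a my.cnf-style file enables the
# event scheduler. enable_event_scheduler is a hypothetical helper name.
enable_event_scheduler() {
    cnf="$1"
    # Nothing to do if the option is already set (either spelling).
    if grep -Eq '^[[:space:]]*event[-_]scheduler[[:space:]]*=' "$cnf"; then
        return 0
    fi
    if grep -q '^\[mysqld\]' "$cnf"; then
        # Insert the option directly below the existing [mysqld] header.
        awk '{ print } /^\[mysqld\]/ { print "event-scheduler=1" }' "$cnf" > "$cnf.tmp" &&
            cat "$cnf.tmp" > "$cnf" && rm -f "$cnf.tmp"
    else
        # No [mysqld] section yet: append one.
        printf '[mysqld]\nevent-scheduler=1\n' >> "$cnf"
    fi
}
```

After running it against /etc/my.cnf and restarting MySQL, the setting can be checked from the mysql client with SHOW VARIABLES LIKE 'event_scheduler'.
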

If the monitoring infrastructure contains a WMS service and no CE services, the metric hr.srce.GoodCEs associated with the Nagios service will fail with the following error:

HealthyNodes CRITICAL - No healthy hosts found.

To fix this, create the file /etc/ncg/ncg-localdb.d/GoodCEs-fix with the following content:

MODIFY_METRIC_PARAMETER!hr.srce.GoodCEs!--metric!org.sam.CREAMCE-JobSubmit

and restart the ncg service:

service ncg restart
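
The workaround above can be applied in one step. This is a sketch only: write_goodces_fix is a hypothetical helper, and its optional directory argument exists just so the function can be tried outside /etc/ncg/ncg-localdb.d.

```shell
#!/bin/sh
# Sketch: write the NCG override described above. write_goodces_fix is a
# hypothetical helper; the directory argument defaults to the path named in
# the text.
write_goodces_fix() {
    dir="${1:-/etc/ncg/ncg-localdb.d}"
    # Single-quoted so the "!" separators are written literally.
    printf '%s\n' 'MODIFY_METRIC_PARAMETER!hr.srce.GoodCEs!--metric!org.sam.CREAMCE-JobSubmit' \
        > "$dir/GoodCEs-fix"
}
```

Running write_goodces_fix with no argument and then restarting ncg as shown reproduces the manual steps.
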

List of Issues fixed in this release

Document generated by Confluence on Feb 27, 2014 10:19