This page last changed on Feb 21, 2012 by prodrigu.

Release: Update-15

Summary

Start Date 26 September 2011
End Date 28 October 2011
Status Released
Release Date 28 October 2011
Release Manager Wojciech Lapka

Validation Steps performed

Outlined in SAM-2042

  • Announced for staged rollout - 03 November 2011
  • Cleared staged rollout - 18 November 2011

List of packages updated in this release

Node sam-nagios

atp-1.20.9-1.el5
atp-web-1.20.9-1.el5
glite-yaim-nagios-1.5.30-1.el5
grid-monitoring-config-gen-0.86.2-1.el5
grid-monitoring-probes-ch.cern.sam-1.5.0-1.el5
grid-monitoring-probes-eu.egi.sec-1.0.0-1.el5
grid-monitoring-probes-hr.srce-0.34.1-1.el5
grid-monitoring-org.nagiosexchange-probes-0.19-1.el5
mddb-1.0.14-1.el5
mddb-synchronizer-1.0.14-1.el5
mrs-1.3.10-1.el5
mywlcg-0.0.4-8.el5
poem-0.7.7-1.el5
poem-sync-0.7.7-1.el5
sam-nagios-1.15.3-1.el5
sam-release-1.15.0-1.el5
voms2htpasswd-1.11.0-1.el5
unicore-monitoring-probes-2.0.1-1
Django-1.1.4-1.el5

Removed:
grid-monitoring-probes-org.sam.sec
nagios2metricstore

Node sam-gridmon

ace-0.1.15-2.el5
atp-1.20.9-1.el5
atp-web-1.20.9-1.el5
glite-yaim-nagios-1.5.30-1.el5
mddb-1.0.14-1.el5
mrs-1.3.10-1.el5
msg-consume2db-1.0.22-1.el5
mywlcg-0.0.4-8.el5
poem-0.7.7-1.el5
poem-sync-0.7.7-1.el5
sam-gridmon-1.15.9-1.el5
sam-release-1.15.0-1.el5
voms2htpasswd-1.11.0-1.el5
Django-1.1.4-1.el5

Removed:
nagios2metricstore
msg-nagios-bridge

Other packages to add (from rpmforge):
awstats
geoip
perl-Geo-IP
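
Before installing, it can help to check which of these extra packages are still missing on the sam-gridmon node. The following is a minimal sketch, not part of the official procedure: missing_packages is a hypothetical helper name, and it assumes the package names are exactly as listed above.

```shell
# Hypothetical helper (not shipped with SAM): print every package name
# given as an argument that rpm does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

For example, "missing_packages awstats geoip perl-Geo-IP" lists whatever is absent. The result could then be passed to "yum --enablerepo=rpmforge install", where the repository id rpmforge is an assumption to be checked against the local yum configuration.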

Release Notes

  • ACE
    • Decreased delay in availability computations (now between 15 and 75 minutes)
    • Tuning of Oracle queries
    • Improved scheduling and logging mechanism
    • Capture snapshots of topology view
  • ATP
    • Support for EGI operational tools
    • Support for OSG preproduction services
    • Improved validation of synchronizers' input data
    • Automatic retrieval of service flavours from GOCDB
  • glite-yaim-nagios
    • Improved setup for uncertified sites in metric org.sam.CREAMCE-DirectJobState
  • grid-monitoring-probes-eu.egi.sec
    • grid-monitoring-probes-org.sam.sec renamed to grid-monitoring-probes-eu.egi.sec
  • grid-monitoring-probes-ch.cern.sam
    • New probe for monitoring entries in MRS tables metricdata_spool and metricdataforrecalculation (only node sam-gridmon)
  • info-provider-nagios
    • GLUE 2 publication in Nagios GlueService
  • MyWLCG
    • Reduced browser cache time to 10 minutes
    • Filters visible by default
    • Improvement of error messages in web services
    • Service Availability bar graphs in png format for reporting
    • Bug fixes
      • Wrong status displayed in some cases
      • Labels for metric names truncated in metrics status view
  • MRS
    • Package renamed from 'nagios2metricstore' to 'mrs'
    • Delete metrics older than 7 days from the latest view
    • Only sam-gridmon:
      • Oracle tuning: created a procedure that computes statistics, rebuilds indexes and shrinks space on some of the objects
      • Script for dumping latest statuses from the DB
  • NCG
    • UNICORE metrics VO-independent
    • Control send-to-dashboard on backup instance
    • Improved workflow:
      • Check for running instances
      • Better handling of new configuration folder
  • POEM
    • The first test version of the profile management system (Poem) is
      distributed in this release (poem-0.7.4-1.el5, poem-sync-0.7.4-1.el5).
    • A complete user guide and transition guide will be published as soon
      as Poem is fully tested and integrated with all other SAM components.
  • voms2htpasswd
    • Keep existing htpasswd if no entries are found
  • sam-gridmon
    • Added awstats

Configuration changes

  • YAIM variable for defining time zone for web interfaces
    TIME_ZONE

Configuration changes (only node sam-nagios)

  • ARC probes are part of the default NGI/ROC profile and have to be switched on with the ENABLE_ARC_PROBES YAIM variable. For details see https://tomtools.cern.ch/confluence/display/SAM/SAM+setup+for+ARC+services. The new configuration is:
    # switch on ARC probes
    ENABLE_ARC_PROBES=true
    # Configuring ARC profile is not needed
    # comment out the following line:
    # NCG_HASH_CONFIG_PROFILES=NGI,ARC
    
  • New YAIM variable for setting timeout for NCG
    NCG_TIMEOUT=1234
  • When configuring an active/backup instance without YAIM, the send-to-dashboard service must be switched off/on manually.

Configuration changes (only node sam-gridmon)

  • Request an explicit grant to create tables (not one coming from a role) for the main Oracle account.
  • New YAIM configuration variables
    • Setting message destinations (default values as below):
      MSG_DEST_NGI=/topic/grid.probe.metricAggregation.EGEE.ALL-SITES
      MSG_DEST_VO=/topic/grid.probe.metricAggregation.VO.ALL-SITES
      MSG_DEST_OSG=/topic/grid.probe.metricOutput
      MSG_DEST_OSG_CATCHUP=/topic/grid.probe.catchup.metricOutput
  • The msg-nagios-bridge RPM must be uninstalled manually after running yum.
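
The manual removal of msg-nagios-bridge could be scripted along the following lines. This is only a sketch under stated assumptions: remove_if_installed is a hypothetical helper name, not a SAM or YAIM tool, and it relies on "rpm -q" returning a non-zero status for a package that is not installed.

```shell
# Hypothetical helper (not shipped with SAM): remove a package only if
# rpm reports it as installed; otherwise say so and do nothing.
remove_if_installed() {
    pkg="$1"
    if rpm -q "$pkg" >/dev/null 2>&1; then
        # yum asks for confirmation because -y is deliberately not passed
        yum remove "$pkg"
    else
        echo "$pkg is not installed, nothing to do"
    fi
}
```

After the yum update on the sam-gridmon node, it would be called as "remove_if_installed msg-nagios-bridge".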

Known Issues

  • See also the Installing SAM Update-15 guide: python-django must be excluded in the rpmforge.repo file.
  • SRM-probe reports the status information '(Return code of 139 is out of bounds)' on machines running the latest glite-UI release (3.2.11-1). The issue is being followed by the gLite Team: #89293
  • For machines running the latest version of glite-UI (3.2.10-1 or higher): restart Nagios after YAIM execution. Otherwise you may see problems similar to SAM-1693.
    service nagios restart
  • Poem is currently deployed for testing purposes only. It introduces a new Nagios probe, ch.cern.sam.POEMSync, whose status is not critical to the overall functionality of SAM-Nagios.
  • This applies only to machines running at CERN: due to TINF-734, all SAM-Nagios nodes configured to manage OSG services must apply the following 3 patches: patch_SAM-2204, patch_SAM-2215, patch_SAM-2215b.
    patch /usr/bin/atp_synchro patch_SAM-2204
    patch /usr/bin/atp_synchro patch_SAM-2215
    patch /usr/bin/atp_synchro patch_SAM-2215b
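
A more cautious way to apply the three patches is to test each one before modifying /usr/bin/atp_synchro. The sketch below is an illustration, not the documented procedure: apply_patches is a hypothetical helper name, and it assumes GNU patch, whose --dry-run and --forward options are used here.

```shell
# Hypothetical helper (not shipped with SAM): apply patch files to one
# target file in the given order, stopping at the first failure.
apply_patches() {
    target="$1"
    shift
    for p in "$@"; do
        # --dry-run tests the patch without changing the target;
        # --forward skips, without prompting, a patch that looks already applied
        if patch --dry-run --forward "$target" "$p" >/dev/null; then
            patch --forward "$target" "$p"
        else
            echo "patch $p does not apply cleanly to $target" >&2
            return 1
        fi
    done
}
```

With the patch files in the current directory, it would be called as "apply_patches /usr/bin/atp_synchro patch_SAM-2204 patch_SAM-2215 patch_SAM-2215b". Keeping a backup copy of atp_synchro beforehand is a sensible extra precaution.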

List of Issues fixed in this release


patch_SAM-2204 (application/octet-stream)
patch_SAM-2215 (application/octet-stream)
patch_SAM-2215b (application/octet-stream)