This page last changed on Apr 26, 2013 by roveznav.
Summary
| Start Date | 29 October 2012 |
| Release Date | 07 December 2012 |
| Status | Released |
| Validation Steps | SAM-3071 |
| Validated | 06 March 2013 |
Description
This release focused mainly on introducing SAM Operational Tools Monitoring. In addition, we fixed bugs identified during the wide deployment of SAM Update 19.
Technical details:
- 70 tickets resolved
- Topology aggregation:
  - Added sanity check comparing the services declared in the local ATP with those in the central ATP
  - Enabled VOMS CSRF support
  - Fixed invalid JSON output from the ATP PI, which was affecting NCG
- Profile Management:
  - Unit tests moved to Django 1.3
- MyWLCG changes:
  - Improved MyWLCG error handling
  - Improved status view
  - Added monthly reports to the central MyEGI
  - Two new reports for T0/T1 sites
- Nagios configuration:
  - Enabled defining contacts for services
  - ncg.localdb partially migrated to ncg-metric-config (JSON format)
  - Optimized Java truststore generation
  - Added option to give permission to run Nagios commands to anybody with a valid DN
- Probes:
  - grid-monitoring-probes-ch.cern.sam:
    - Improved MrsCheckDBInsertsDetailed probe
    - Improved SamCheckUpdate probe
    - Added SAMCentralWebAPI probe
  - hr.srce.GridProxy-Get generates proxies with configurable lifetime
  - Probe libraries ported to SL6 (perl-TOM, python-GridMon and perl-GridMon)
  - Fixed mta-simple problem in grid-monitoring-probes-org.sam
  - MRS metrics disabled on SAM/Nagios nodes
- SAM configuration changes (glite-yaim-nagios):
  - Consolidated/minimized number of httpd actions
  - Consolidated YAIM variables (names are now uppercase)
  - OPS-MONITOR established as new SAM/Nagios configuration (for monitoring operational tools)
- EGI report: switched Availability and Reliability labels
- Decommission of MDDB
- Improved MySQL database dump
- Source code documentation
- Removed dependencies on DAG repository
Package List
SAM-Nagios
SAM-Gridmon
Configuration Changes
Common
- New Yaim configuration variables:
| Component | Name | Description | Default | Mandatory | Example |
| ATP | MSG_DEST_OSG_DOWNTIME | Messaging queue for OSG downtimes | Yes | Yes | /topic/grid.management.downtime.RSV |
SAM-Gridmon
- New Yaim configuration variables
| Component | Name | Description | Default | Mandatory | Example |
| MyWLCG | MYWLCG_TRENDS | Enable/disable Trends on MyWLCG | No | No | false |
SAM-Nagios
- New Yaim configuration variables:
| Component | Name | Description | Default | Mandatory | Example |
| nagios | NAGIOS_ENABLE_ANY_DN | Enable/disable of any DN to run Nagios commands on VO-nagioses | Yes | No | false |
| NCG | CRO_BROKER_PASS | BROKER_PASSWORD value for msg.cro-ngi.hr broker | No | No | MyPass |
| NCG | EGI1_BROKER_PASS | BROKER_PASSWORD value for egi-1.msg.cern.ch broker | No | No | MyPass |
| NCG | EGI2_BROKER_PASS | BROKER_PASSWORD value for egi-2.msg.cern.ch broker | No | No | MyPass |
| NCG | GR_BROKER_PASS | BROKER_PASSWORD value for broker.afroditi.hellasgrid.gr broker | No | No | MyPass |
| NCG | NCG_SERVICE_NOTIFICATIONS_OPTIONS | Nagios notification options for services | No | No | u,c |
| NCG | NCG_HOST_NOTIFICATIONS_OPTIONS | Nagios notification options for hosts | No | No | 'cn' |
| NCG | NCG_PROXY_LIFETIME | Defines the lifetime option to refresh_proxy probe in grid-monitoring-probes-hr.srce. If unset, probe will use default value 12. (Make sure that MyProxy server supports defined credential lifetime, otherwise the probe will fail) | No | No | 12 |
Localdb changes
- Migration of localdb metric configuration to JSON config
| JSON config files stored in /etc/ncg-metric-config.d/ must use .conf suffix. Otherwise NCG will ignore them. |
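Because NCG ignores files without the .conf suffix, a quick pre-flight check of the directory can catch files that would be skipped. The following is a minimal sketch, not part of SAM: the function name check_metric_config_dir is hypothetical, and the JSON check uses Python's strict parser, which may be stricter or looser than the parser NCG itself uses.

```python
import json
from pathlib import Path


def check_metric_config_dir(directory):
    """Return (ignored, invalid) lists for a metric-config directory.

    ignored: regular files that lack the .conf suffix (NCG skips these).
    invalid: (file name, error message) pairs for .conf files that
             Python's strict JSON parser rejects.
    """
    ignored, invalid = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if path.suffix != ".conf":
            ignored.append(path.name)
            continue
        try:
            json.loads(path.read_text())
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            invalid.append((path.name, str(exc)))
    return ignored, invalid
```

On a SAM/Nagios node this could be called as check_metric_config_dir("/etc/ncg-metric-config.d") before rerunning NCG; both returned lists are empty when every file would be picked up and parses cleanly.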
Modifications and overrides of global values written in the NCG localdb format:
# add/modify configuration parameter
MODIFY_METRIC_CONFIG!metric!config!value
# add/modify dependency
MODIFY_METRIC_DEPENDENCY!metric!dep!value
# add/modify attribute
MODIFY_METRIC_ATTRIBUTE!metric!attr!value
# add/modify parameter
MODIFY_METRIC_PARAMETER!metric!param!value
# add/modify flag
MODIFY_METRIC_FLAG!metric!flag!value
should now be made in JSON configuration files in the directory /etc/ncg-metric-config.d/:
# cat /etc/ncg-metric-config.d/modify.conf
{
  "metric" : {
    "attribute" : {
      "attr" : "value"
    },
    "parameter" : {
      "param" : "value"
    },
    "flags" : {
      "flag" : "value"
    },
    "config" : {
      "config" : "value"
    },
    "dependency" : {
      "dep" : 1
    }
  }
}
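The correspondence between the old localdb directives and the JSON sections can be sketched in a few lines of Python. This is an illustrative helper, not a tool shipped with SAM: the function name localdb_to_metric_config is hypothetical, the section names are taken from the example file on this page, and all values are kept as strings (the example shows a numeric dependency value, so types may need adjusting by hand).

```python
import json

# Mapping from localdb directive to the JSON section name used in the
# example file on this page (note the plural "flags").
SECTION_BY_DIRECTIVE = {
    "MODIFY_METRIC_CONFIG": "config",
    "MODIFY_METRIC_DEPENDENCY": "dependency",
    "MODIFY_METRIC_ATTRIBUTE": "attribute",
    "MODIFY_METRIC_PARAMETER": "parameter",
    "MODIFY_METRIC_FLAG": "flags",
}


def localdb_to_metric_config(lines):
    """Translate MODIFY_METRIC_* localdb lines into a nested dict."""
    config = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("!", 3)
        if len(fields) != 4 or fields[0] not in SECTION_BY_DIRECTIVE:
            continue  # other localdb directives are not handled here
        directive, metric, key, value = fields
        section = SECTION_BY_DIRECTIVE[directive]
        config.setdefault(metric, {}).setdefault(section, {})[key] = value
    return config


example = [
    "# add/modify configuration parameter",
    "MODIFY_METRIC_CONFIG!metric!config!value",
    "# add/modify flag",
    "MODIFY_METRIC_FLAG!metric!flag!value",
]
print(json.dumps(localdb_to_metric_config(example), indent=2))
```

The printed JSON can serve as a starting point for a file under /etc/ncg-metric-config.d/; directives other than MODIFY_METRIC_* are deliberately left in ncg.localdb.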
- Enable contacts for service flavour on a given host
Use NCG localdb to add/enable contacts for all metrics associated with a given service flavour:
ADD_SERVICEFLAVOURCONTACT!host!ServiceFlavour!email@email.com
# enables contact even if NCG_ENABLE_NOTIFICATIONS is 0
ENABLE_SERVICEFLAVOURCONTACT!host!ServiceFlavour!email@email.com
Known Issues
| The packages sam-nagios-1.20.0-1.el5.noarch.rpm and sam-release-1.20.0-1.el5.noarch.rpm are not in the EGI repository. As a result, /etc/sam-release shows a wrong version on nodes installed from the EGI repo. |
| For machines running the latest version of glite-UI (3.2.10-1 or higher):
Please restart Nagios after running YAIM. Otherwise you may see problems similar to SAM-1693.
|
| Upgrading a node with yum requires a package exclusion, e.g.:
- on sam-nagios
yum update --exclude sam-gridmon
- on sam-gridmon
yum update --exclude sam-nagios
|
| Metrics ch.cern.sam.MrsCheckDBInserts and ch.cern.sam.MrsCheckDBInsertsDetailed have to be disabled manually.
Please add the following lines to the file /etc/ncg/ncg.localdb:
REMOVE_METRIC!ch.cern.sam.MrsCheckDBInserts
REMOVE_METRIC!ch.cern.sam.MrsCheckDBInsertsDetailed
|
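When this workaround has to be applied on several nodes, the two lines can be added idempotently so that repeated runs do not duplicate them. The following is a minimal sketch under that assumption; the helper name ensure_lines is hypothetical and is not part of SAM or NCG.

```python
from pathlib import Path

# The two directives listed in the known issue above.
REMOVE_LINES = [
    "REMOVE_METRIC!ch.cern.sam.MrsCheckDBInserts",
    "REMOVE_METRIC!ch.cern.sam.MrsCheckDBInsertsDetailed",
]


def ensure_lines(path, wanted):
    """Append each wanted line to the file unless it is already present.

    Returns the list of lines that were actually added.
    """
    target = Path(path)
    text = target.read_text() if target.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [line for line in wanted if line not in present]
    if missing:
        # Make sure the appended lines start on a fresh line.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with target.open("a") as handle:
            handle.write(prefix + "\n".join(missing) + "\n")
    return missing
```

A call such as ensure_lines("/etc/ncg/ncg.localdb", REMOVE_LINES) returns the lines it appended, or an empty list when both directives were already in place.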