The Prometheus SNMP Exporter is deployed by the cray-sysmgmt-health chart to the sysmgmt-health namespace as part of the Cray System Management (CSM) release.
Both the Prometheus SNMP Exporter and River Endpoint Discovery Service (REDS) hardware discovery use SNMP credentials stored in Vault. These credentials are read from Vault and pushed into customizations.yaml as a sealed secret. For the SNMP Exporter and REDS hardware discovery to work, three things need to have been done by an administrator:
customizations.yaml
(this is covered in the documentation linked in the previous bullet).
Note: While the CSM Automatic Network Utility (CANU) will typically not overwrite SNMP settings that are manually applied to the management switches, there are certain cases where SNMP configuration can be overwritten or lost (such as when resetting and reconfiguring a switch from factory defaults). To persist the SNMP settings, see CANU Custom Configuration. CANU custom configuration files are used to persist site management network configurations that are intended to take precedence over configurations generated by CANU.
If the Prometheus SNMP Exporter or REDS hardware discovery emit errors related to SNMP authentication, then an administrator should:
customizations.yaml match Vault and the credentials stored on the management switches.
To provide data to the Grafana SNMP dashboards, the SNMP Exporter must be configured with a list of management network switches to scrape metrics from.
NOTE: All variables used within this page depend on the /etc/environment setup done in Pre-installation.
(pit#) Update customizations.yaml with the list of switches to be monitored by the SNMP Exporter.
/usr/share/doc/csm/scripts/configure_snmp_monitor.py -c "${PITDATA}/prep/site-init/customizations.yaml" -s "${PITDATA}/prep/${SYSTEM_NAME}/sls_input_file.json"
Expected output looks similar to the following:
Switches to monitor for subnet HMN
[{'name': 'sw-spine-001', 'target': '10.254.0.2'},
{'name': 'sw-spine-002', 'target': '10.254.0.3'},
{'name': 'sw-leaf-bmc-001', 'target': '10.254.0.4'}]
Enabling prometheus-snmp-exporter serviceMonitor
Adding the targets to the SNMP serviceMonitor configuration
The HMN is used by default because ACLs in the switch configuration block SNMP over the NMN. This default can be overridden by passing -n NMN to the configure_snmp_monitor.py script.
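The relationship between the selected network, the switch list, and the resulting configuration can be sketched in Python. This is only an illustration under stated assumptions, not the logic of configure_snmp_monitor.py: the function names and the `switches_by_network` input are hypothetical, and nesting `params` under `serviceMonitor` is an assumption about the layout written to customizations.yaml.

```python
from typing import Dict, List

Switch = Dict[str, str]


def select_switches(switches_by_network: Dict[str, List[Switch]],
                    network: str = "HMN") -> List[Switch]:
    """Return the switches for one network, defaulting to the HMN.

    `switches_by_network` is a hypothetical, pre-parsed mapping; the real
    script reads its data from the SLS input file.
    """
    if network not in switches_by_network:
        raise ValueError(f"no switches known for network {network!r}")
    return switches_by_network[network]


def build_service_monitor(switches: List[Switch]) -> Dict[str, object]:
    """Build a serviceMonitor-style mapping (assumed layout) for the switches."""
    params = [{"name": s["name"], "target": s["target"]} for s in switches]
    return {"serviceMonitor": {"enabled": True, "params": params}}


# Example data taken from the expected output shown above.
switches_by_network = {
    "HMN": [
        {"name": "sw-spine-001", "target": "10.254.0.2"},
        {"name": "sw-spine-002", "target": "10.254.0.3"},
        {"name": "sw-leaf-bmc-001", "target": "10.254.0.4"},
    ],
}

config = build_service_monitor(select_switches(switches_by_network))
```

Passing `network="NMN"` in this sketch plays the role of the `-n NMN` option; with the example data it raises `ValueError` because no NMN switches are defined.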
(pit#) Review the SNMP Exporter configuration.
yq4 eval '.spec.kubernetes.services.cray-sysmgmt-health.prometheus-snmp-exporter' "${PITDATA}/prep/site-init/customizations.yaml"
The expected output looks similar to:
serviceMonitor:
  enabled: true
  params:
    - name: sw-spine-001
      target: 10.254.0.2
    - name: sw-spine-002
      target: 10.254.0.3
    - name: sw-leaf-bmc-001
      target: 10.254.0.4
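When reviewing the output, the main things to confirm are that the serviceMonitor is enabled and that every switch has a usable target address. A minimal sketch of such a check follows. It assumes the section has already been loaded into a Python dictionary and that `params` is nested under `serviceMonitor`; the function name `check_snmp_config` is hypothetical and not part of CSM.

```python
import ipaddress
from typing import Any, Dict, List


def check_snmp_config(section: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    monitor = section.get("serviceMonitor") or {}

    if monitor.get("enabled") is not True:
        problems.append("serviceMonitor.enabled is not true")

    params = monitor.get("params") or []
    if not params:
        problems.append("no switch targets are configured")

    seen = set()
    for entry in params:
        name = entry.get("name", "<unnamed>")
        target = str(entry.get("target", ""))
        try:
            ipaddress.ip_address(target)  # raises ValueError on a bad address
        except ValueError:
            problems.append(f"{name}: target {target!r} is not a valid IP address")
        if target in seen:
            problems.append(f"{name}: duplicate target {target!r}")
        seen.add(target)
    return problems


# Example section mirroring the expected output above.
example = {
    "serviceMonitor": {
        "enabled": True,
        "params": [
            {"name": "sw-spine-001", "target": "10.254.0.2"},
            {"name": "sw-spine-002", "target": "10.254.0.3"},
            {"name": "sw-leaf-bmc-001", "target": "10.254.0.4"},
        ],
    }
}
problems = check_snmp_config(example)
```

This check only covers the shape of the configuration. It does not verify that the switches are reachable or that the SNMP credentials are correct.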
The most common configuration parameters are specified in the following table. They must be set in the customizations.yaml file under the spec.kubernetes.services.cray-sysmgmt-health.prometheus-snmp-exporter service definition.
Customization | Default | Description |
---|---|---|
serviceMonitor.enabled | true | Enables the serviceMonitor for the SNMP Exporter (default chart value is true) |
params.enabled | true | Enables SNMP Exporter params changes (default chart value is false) |
params.conf.module | if_mib | Selects the module used by the SNMP Exporter (default chart value is if_mib) |
params.conf.target | 10.252.0.2 | List of switch targets for the SNMP Exporter to monitor |
For a complete set of available parameters, consult the values.yaml file for the cray-sysmgmt-health chart.
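The dotted names in the table describe nested keys beneath the service definition. The following sketch shows how such a dotted name maps onto a nested structure. It operates on a plain Python dictionary standing in for the parsed customizations.yaml; the helper `set_dotted` is hypothetical and is not a CSM tool.

```python
from typing import Any, Dict

# Path of the service definition, as given in the text above.
SERVICE_PATH = "spec.kubernetes.services.cray-sysmgmt-health.prometheus-snmp-exporter"


def set_dotted(root: Dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set `value` at the nested location named by a dot-separated key.

    Intermediate mappings are created as needed. This sketch assumes that
    any existing intermediate values are themselves dictionaries.
    """
    keys = dotted_key.split(".")
    node = root
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Stand-in for a parsed customizations.yaml (empty here for illustration).
customizations: Dict[str, Any] = {}
set_dotted(customizations, f"{SERVICE_PATH}.serviceMonitor.enabled", True)
set_dotted(customizations, f"{SERVICE_PATH}.params.conf.module", "if_mib")

service = (customizations["spec"]["kubernetes"]["services"]
           ["cray-sysmgmt-health"]["prometheus-snmp-exporter"])
```

After these calls, `service` holds `serviceMonitor.enabled` and `params.conf.module` as nested keys, which is the layout the table's dotted names refer to.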