A device's good health and the adequate performance of the resources it uses ensure that the device works as intended and remains continuously available to perform its task.
Monitoring the available resources and generating traps and alerts when user-configured thresholds are breached is the fastest way to react to changes in the health and performance of the monitored device.
SLX-OS enables monitoring of system resources and of Layer 2 and Layer 3 hardware resources. You can set thresholds for each monitored parameter, and both a high threshold and a low threshold can be configured for each one.
A threshold is breached when the monitored parameter exceeds the configured high threshold value or falls below the configured low threshold value. When this happens, you can take one of several actions.
The following actions are available:
These actions ensure that an anomalous event is recorded in one form or another, so that the record can be used later for troubleshooting or for root cause analysis (RCA) of the issue.
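The breach condition described above can be illustrated with a short Python sketch. This is not SLX-OS code or configuration syntax; the names `Breach` and `check_threshold` are hypothetical and exist only to show the comparison against the high and low thresholds.

```python
from enum import Enum
from typing import Optional


class Breach(Enum):
    """Kind of threshold breach (hypothetical names, for illustration only)."""
    HIGH = "high"
    LOW = "low"


def check_threshold(value: float, low: float, high: float) -> Optional[Breach]:
    """Return the kind of breach for a monitored value, or None if in range.

    Assumption: a breach occurs only when the value is strictly above the
    high threshold or strictly below the low threshold.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if value > high:
        return Breach.HIGH
    if value < low:
        return Breach.LOW
    return None


# Example: a utilization reading of 95 against thresholds of 10 (low) and 90 (high)
result = check_threshold(95, low=10, high=90)
```

In this sketch, a caller would trigger the configured action whenever the function returns something other than `None`.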
Because a single event can generate multiple entries or traps, options are available to rate limit their generation. Before SLX-OS 20.5.3, rate limiting was configured separately for each monitored parameter. From SLX-OS 20.5.3 onward, rate limiting is configured globally and applies to all monitored parameters.
When configured, the constraint can be summarized as: generate a maximum of X messages in Y units of time. The messages can be RASLOG, SNMP, or both, depending on the action configured for the specific monitored parameter.
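The "maximum of X messages in Y units of time" constraint can be sketched in Python as follows. This is an illustration only and makes no claim about how SLX-OS implements rate limiting: the class name `MessageRateLimiter` is hypothetical, the time unit is assumed to be seconds, and a sliding window is assumed where the device may use a different scheme.

```python
from collections import deque


class MessageRateLimiter:
    """Allow at most max_messages within any window_seconds interval.

    Hypothetical sketch: a single global limiter shared by all monitored
    parameters, using a sliding time window (an assumption).
    """

    def __init__(self, max_messages: int, window_seconds: float) -> None:
        self.max_messages = max_messages      # X in the description above
        self.window_seconds = window_seconds  # Y in the description above
        self._sent = deque()                  # timestamps of allowed messages

    def allow(self, now: float) -> bool:
        """Return True if a message may be generated at time `now`."""
        # Forget messages that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_messages:
            self._sent.append(now)
            return True
        return False  # limit reached: suppress this message


# Example: at most 2 messages in any 10-second window
limiter = MessageRateLimiter(max_messages=2, window_seconds=10.0)
decisions = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 10.0, 10.5, 11.0)]
```

A caller would pass the current time (for example, from `time.monotonic()`) and generate the RASLOG or SNMP message only when `allow` returns `True`.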
The following hardware parameters are monitored:
The following parameters are also monitored: