31000 | EFA Certificate Expiry Notice | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Description | Send an alert when an EFA certificate is about to expire. | ||||||||||||||||||||||||
Preconditions |
Certificate Manager Component (Monitor & Auth Service) has system default settings that are NOT user configurable.
The daily polling sends the “CertificateExpiryNoticeAlert” event notification with an expiry date per certificate type which is processed by the fault engine. |
||||||||||||||||||||||||
Requirements |
Alert Data:
Syslog RFC-5424 Example: <116>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Certificate?type=app_server_cert” alertId=”31000” cause=”keyExpired” type=”securityServiceOrMechanismViolation” severity=”warning”] [alertData@1916 certifcateType=”App Server certificate” expiryDate=”Sep 12 10:00:45 2022 GMT”] BOM The application server certificate on EFA will expire soon |
||||||||||||||||||||||||
Health Response | Response{ Resource: /App/System/Security/Certificate?type=app_server_cert HQI { Color: Yellow Value: 1 } StatusText: Application Server Certificate expires on <date>. } |
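
The number in angle brackets at the start of each syslog example is the RFC 5424 PRI value, which encodes `facility * 8 + severity`. The following is a minimal sketch of decoding it; the function name `decode_pri` is illustrative and not part of EFA. The severity names are the RFC 5424 names, which can differ from the wording of the `severity` parameter inside the alert (for example, `informational` versus `info`).

```python
# Severity names follow RFC 5424, section 6.2.1 (numerical codes 0-7).
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]


def decode_pri(pri: int) -> tuple[int, str]:
    """Split a syslog PRI value into (facility code, severity name)."""
    if not 0 <= pri <= 191:
        raise ValueError(f"PRI out of range: {pri}")
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


print(decode_pri(116))  # (14, 'warning')
print(decode_pri(114))  # (14, 'critical')
```

For the example above, `<116>` decodes to facility 14 with severity `warning`, which agrees with `severity=”warning”` in the alert element.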
31001 | Managed Device Certificate Expiry Notice | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Description | Send an alert when a certificate on the SLX device is about to expire. | ||||||||||||
Preconditions |
Inventory Service has default system settings that are NOT user configurable.
The daily polling sends the “DeviceCertificateExpiryNoticeAlert” event notification with an expiry date per certificate type which is processed by the fault engine. |
||||||||||||
Requirements |
Alert Data:
Syslog RFC-5424 Example: <116>1 2022-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Certificate?device_ip=10.10.10.1&type=http_server_cert” alertId=”31001” cause=”keyExpired” type=”securityServiceOrMechanismViolation” severity=”warning”] [alertData@1916 deviceIP=”10.10.10.1” certifcateType=”HTTPS Server certificate” expiryDate=”Sep 12 10:00:45 2022 GMT”] BOMThe certificate on device “10.10.10.1” with subject “CN=slx-10.10.10.1.extremenetworks.com” will be expiring soon at “Sep 12 10:00:45 2022 GMT” |
||||||||||||
Health Response |
Response
{ Resource:/App/System/Security/Certificate?device_ip=10.10.10.1&type=http_server_cert HQI { Color: Yellow Value: 1 } StatusText: Device 10.10.10.1 Http Server Certificate expires on <date>. } |
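
The bracketed groups in the syslog examples are RFC 5424 structured-data elements: an identifier such as `alert@1916` followed by `name="value"` parameters. As a rough sketch of how a receiver might extract them, the Python below uses two regular expressions. It assumes plain ASCII double quotes on the wire (the examples in this document show typographic quotes), ignores escaped characters inside values, and uses an abridged copy of the 31001 example; `parse_structured_data` is a hypothetical helper name.

```python
import re

# One structured-data element: [sd-id name="value" name="value" ...]
SD_ELEMENT = re.compile(r'\[([^\s\]="]+)((?:\s+[^\s\]="]+="[^"]*")*)\]')
# One parameter inside an element: name="value"
PARAM = re.compile(r'([^\s\]="]+)="([^"]*)"')


def parse_structured_data(message: str) -> dict[str, dict[str, str]]:
    """Return {sd_id: {param_name: param_value}} for every element found."""
    elements = {}
    for match in SD_ELEMENT.finditer(message):
        sd_id, params = match.groups()
        elements[sd_id] = dict(PARAM.findall(params))
    return elements


# Abridged from the 31001 example, with ASCII quotes.
SAMPLE = (
    '<116>1 2022-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental '
    '[meta sequenceId="47"] '
    '[alert@1916 resource="/App/System/Security/Certificate?device_ip=10.10.10.1&type=http_server_cert" '
    'alertId="31001" cause="keyExpired" severity="warning"] '
    '[alertData@1916 deviceIP="10.10.10.1" expiryDate="Sep 12 10:00:45 2022 GMT"] '
    'The certificate on device will be expiring soon'
)

parsed = parse_structured_data(SAMPLE)
print(parsed["alert@1916"]["alertId"])         # 31001
print(parsed["alertData@1916"]["expiryDate"])  # Sep 12 10:00:45 2022 GMT
```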
31002 | EFA Certificate Expired | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Description | Send an alert when an EFA certificate has expired. Delivery of this alert is not guaranteed, because an expired certificate can leave parts of the system non-functional. | |||||||||||||||
Preconditions |
K3s must be up and running (K3s goes down if the K3s certificates have expired). Only expiry of non-K3s certificates is supported.
If the App Server certificate expires, you cannot communicate with EFA through the REST API and therefore cannot query the health status. |
|||||||||||||||
Requirements |
Alert Data:
Syslog RFC-5424 Example: <114>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Certificate?type=app_server_cert” alertId=”31002” cause=”keyExpired” type=”securityServiceOrMechanismViolation” severity=”critical”] [alertData@1916 certificateType=”App Server certificate” expiredDate=”Sep 12 10:00:45 2022 GMT”] BOMThe Application server certificate on EFA has expired “Sep 12 10:00:45 2022 GMT”. |
|||||||||||||||
Health Response |
Response
{ Resource: /App/System/Security/Certificate?type=app_server_cert HQI { Color: Red Value: 3 } StatusText: Application server Certificate expired on <date>. } |
31003 | Managed Device Certificate Expired | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Description | Send an alert when the SLX certificate has expired. | ||||||||||||
Preconditions |
The SLX device's syslog server configuration is set to the EFA IP so that the RASLog service receives events from the SLX device. Syslogs from the SLX device may not reach the RASLog service if the syslog CA certificate has expired.
The daily polling sends the “DeviceCertificateExpiredNoticeAlert” event notification with an expiry date per certificate type which is processed by the fault engine. |
||||||||||||
Requirements |
Alert Data:
Syslog RFC-5424 Example: <114>1 2022-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Certificate?device_ip=10.10.10.1&type=https_server_cert” alertId=”31003” cause=”keyExpired” type=”securityServiceOrMechanismViolation” severity=”critical”] [alertData@1916 deviceIP=”10.10.10.1” certifcateType=”HTTPS Server certificate” expiryDate=”Sep 12 10:00:45 2022 GMT”] BOMThe certificate on device “10.10.10.1” with subject “CN=slx-10.10.10.1.extremenetworks.com” has expired at “Sep 12 10:00:45 2022 GMT” |
||||||||||||
Health Response |
Response { Resource:/App/System/Security/Certificate?device_ip=10.10.10.1&type=https_server_cert HQI { Color: Red Value: 3 } StatusText: Https server certificate on device 10.10.10.1 expired on <date>. } |
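
The `expiryDate` and `expiredDate` parameters in the certificate alerts carry timestamps such as `Sep 12 10:00:45 2022 GMT`. A receiver that wants to compute the time remaining could parse that string as sketched below. The format string is inferred from the examples in this document rather than from a published specification, `%b` depends on an English locale, and the helper names are illustrative.

```python
from datetime import datetime, timezone

# Layout inferred from the examples, e.g. "Sep 12 10:00:45 2022 GMT".
EXPIRY_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_expiry(value: str) -> datetime:
    """Parse an expiry timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(value, EXPIRY_FORMAT).replace(tzinfo=timezone.utc)


def days_until_expiry(value: str, now: datetime) -> float:
    """Days from `now` until expiry; negative once the certificate has expired."""
    return (parse_expiry(value) - now).total_seconds() / 86400


now = datetime(2022, 9, 2, 10, 0, 45, tzinfo=timezone.utc)
print(days_until_expiry("Sep 12 10:00:45 2022 GMT", now))  # 10.0
```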
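31004 | EFA Certificate Upload/Renewal | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|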
Description | Send an alert when a certificate is renewed. | ||||||||||||||||||||||||
Preconditions |
|
||||||||||||||||||||||||
Requirements |
Alert Data:
Syslog RFC-5424 Example: <116>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Certificate?type=app_server_cert” alertId=”31004” cause=”keyGenerated” type=”securityServiceOrMechanismViolation” severity=”warning”] [alertData@1916 certifcateType=”App Server certificate”] BOMThe application server certificate on EFA has been renewed |
||||||||||||||||||||||||
Health Response |
Response
{ Resource: /App/System/Security/Certificate?type=app_server_cert HQI { Color: Green Value: 0 } StatusText: Application server certificate renewed by user <user>. } |
31005 | Managed Device Certificate Upload or Renewal | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Description | Send an alert when a certificate is renewed. | |||||||||
Preconditions |
For all the certificates managed by EFA, an alert is sent on renewal of any of the certificates.
|
|||||||||
Requirements |
Alert Data:
Syslog RFC-5424 Example: <118>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Certificate?device_ip=10.10.10.1&type=https_server_cert” alertId=”31005” cause=”keyGenerated” type=”securityServiceOrMechanismViolation” severity=”info”] [alertData@1916 deviceIP=”10.10.10.1” certifcateType=”HTTPS certificate”] BOMThe device 10.10.10.1 HTTPS server certificate has been renewed. |
|||||||||
Health Response |
Response
{ Resource:/App/System/Security/Certificate?device_ip=10.10.10.1&type=https_server_cert HQI { Color: Green Value: 0 } StatusText: Device 10.10.10.1 Https server certificate was renewed by user <user>. } |
31010 | Security Level Thresholds (Login attempts) |
---|---|
Description | Send an alert when a user login attempt to EFA fails. |
Preconditions | None |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<114>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 alertId=”31010” cause=”credentialError” type=”securityServiceOrMechanismViolation” severity=”major”] [alertData@1916 userName=”bob”] BOMFailed login attempt. |
Health Response | N/A |
31011 | Login Successful |
---|---|
Description | Send an alert when a user successfully logs in to EFA. |
Preconditions | None |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<118>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Authentication” alertId=”31011” cause= type= severity=”info”] [alertData@1916 userName=”bob”] BOM Successful login. |
Health Response | N/A |
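
Alerts 31010 and 31011 both carry a `userName` parameter, so a consumer can correlate failed and successful logins per user. The sketch below counts 31010 alerts per user inside a sliding time window and resets the count on 31011. The class name, the threshold of three failures, and the 300-second window are all hypothetical values chosen for illustration; this document does not define a threshold.

```python
from collections import defaultdict, deque


class FailedLoginTracker:
    """Counts failed-login alerts (31010) per user within a sliding window."""

    def __init__(self, max_failures: int = 3, window_seconds: float = 300.0):
        # Both defaults are illustrative assumptions, not EFA settings.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record(self, alert_id: str, user: str, timestamp: float) -> bool:
        """Return True when `user` has reached max_failures within the window."""
        if alert_id == "31011":          # successful login clears the history
            self._failures.pop(user, None)
            return False
        if alert_id != "31010":          # unrelated alerts are ignored
            return False
        events = self._failures[user]
        events.append(timestamp)
        # Drop failures that are older than the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.max_failures


tracker = FailedLoginTracker()
results = [tracker.record("31010", "bob", t) for t in (0.0, 60.0, 120.0)]
print(results)  # [False, False, True]
```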
31030 | LDAP Connectivity |
---|---|
Description | Send an alert when the LDAP server configured in EFA is not reachable. |
Preconditions |
The polling is enabled only if:
|
Requirements | Alert Data:
Syslog RFC-5424 Example:
<115>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Security/Authentication” alertId=”31030” cause=”underlyingResourceUnavailable” type=”communicationsAlarm” severity=”major”] [alertData@1916 ldapServerIP=”10.x.x.x” reason=”Unable to reach the LDAP server”] BOMThe connection to LDAP Server could not be established. |
Health Response |
Response
{ Resource: /App/System/Security/Authentication HQI { Color: Yellow Value: 1 } StatusText: Failed to connect to LDAP server at <time>. } |
31040 | Storage Utilization Threshold |
---|---|
Description | Send an alert per monitored TPVM mount point when capacity has reached 75% utilization or more. |
Preconditions |
System Component (Monitor Service) has system default settings that are NOT user configurable.
The hourly polling sends an “Alert” event notification with the TPVM storage utilization percentage which is processed by the fault engine. |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<116>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Operational [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Storage?node_ip=10.2.3.4&mount_point=%2F” alertId=”31040” cause=”storageCapacityProblem” type=”processingErrorAlarm” severity=”warning”] [alertData@1916 nodeIP=”10.2.3.4” mountPoint=”/” usedMB=”7114” availableMB=”2371” utilizationPercent=”75”] BOMThe Node IP “10.2.3.4” mount point “/” has reached a storage utilization of 75% with 2.371 GB free. |
Health Response |
Response
{ Resource: /App/System/Storage?node_ip=10.2.3.4&mount_point=%2F HQI { Color: Yellow Value: 1 } StatusText: Disk partition <partition name> is <x %> full on node 10.2.3.4. } |
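
The `resource` values in these alerts look like URL paths with a query string, and the mount point is percent-encoded (`%2F` is `/`). As a small sketch, Python's standard `urllib.parse` module can split such a value into its path and decoded parameters. The helper name `split_resource` is illustrative, and the sketch assumes each parameter appears at most once.

```python
from urllib.parse import parse_qs, urlsplit


def split_resource(resource: str) -> tuple[str, dict[str, str]]:
    """Split an alert resource into (path, {parameter: decoded value})."""
    parts = urlsplit(resource)
    # parse_qs returns a list per key; keep the first value of each.
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, query


path, query = split_resource("/App/System/Storage?node_ip=10.2.3.4&mount_point=%2F")
print(path)   # /App/System/Storage
print(query)  # {'node_ip': '10.2.3.4', 'mount_point': '/'}
```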
31041 | Storage Utilization Full |
---|---|
Description | Send an alert per monitored TPVM mount point when available storage is less than or equal to 1000 MB. |
Preconditions |
System Component (Monitor and System Service) has system default settings that are NOT user configurable.
The hourly polling sends an “Alert” event notification with the TPVM storage utilization percentage which is processed by the fault engine. |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<113>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Operational [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Storage?node_ip=10.2.3.4&mount_point=%2F” alertId=”31041” cause=”storageCapacityProblem” type=”processingErrorAlarm” severity=”alert”] [alertData@1916 nodeIP=”10.2.3.4” mountPoint=”/” usedMB=”9485” availableMB=”0” utilizationPercent=”100”] BOMThe Node IP “10.2.3.4” mount point “/” storage is full. |
Health Response |
Response
{ Resource: /App/System/Storage?node_ip=10.2.3.4&mount_point=%2F HQI { Color: Red Value: 3 } StatusText: Disk partition <partition name> is <x %> full on node 10.2.3.4. } |
31042 | Storage Utilization Check |
---|---|
Description | Send an alert per monitored TPVM mount point when capacity has reached safe levels under 75% utilization. |
Preconditions |
System Component (Monitor Service) has system default settings that are NOT user configurable.
The “Under 75%” info level storage threshold alert is sent once on Monitor Service startup and once to clear the unhealthy storage resource path. All other severities, higher than info level, are continually sent at the polling frequency. |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<118>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Operational [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/Storage?node_ip=10.2.3.4&mount_point=%2F” alertId=”31042” cause=”storageCapacityCheck” type=”processingErrorAlarm” severity=”info”] [alertData@1916 nodeIP=”10.2.3.4” mountPoint=”/” usedMB=”6243” availableMB=”4732” utilizationPercent=”62”] BOMThe Node IP “10.2.3.4” mount point “/” is at a safe storage utilization of 62% with 4.732 GB free. |
Health Response |
Response
{ Resource: /App/System/Storage?node_ip=10.2.3.4&mount_point=%2F HQI { Color: Green Value: 0 } StatusText: Disk partition <partition name> is at a safe storage utilization of <x %> on node 10.2.3.4. } |
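
The three storage alerts are distinguished by two thresholds stated in their descriptions: 75% utilization (31040) and 1000 MB or less of available storage (31041), with 31042 covering everything below those levels. The sketch below shows one way to express that decision. Checking the "full" condition before the percentage threshold is an assumption, as is the function name; the severities are taken from the syslog examples.

```python
def classify_storage(utilization_percent: int, available_mb: int) -> tuple[int, str]:
    """Return (alert ID, severity) for one monitored mount point."""
    # Assumption: the "full" condition takes precedence over the 75% threshold.
    if available_mb <= 1000:
        return 31041, "alert"
    if utilization_percent >= 75:
        return 31040, "warning"
    return 31042, "info"


# Values taken from the alertData parameters of the three examples.
print(classify_storage(75, 2371))   # (31040, 'warning')
print(classify_storage(100, 0))     # (31041, 'alert')
print(classify_storage(62, 4732))   # (31042, 'info')
```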
31050 | HA Service (Non-Redundant) |
---|---|
Description | Send an alert when the standby is not up, which indicates that the system isn't fully redundant. |
Preconditions |
EFA 3.1.0 has a timer task that periodically monitors the status of the standby node. The timer task checks the status of the nodes and raises an event to the fault management system, which in turn raises an alert to indicate to the user that the system isn't fully redundant.
|
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<116>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Operational [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/HA” alertId=”31050” cause=”lossOfRedundancy” type=”operationalViolation” severity=”warning”] BOMHA Degraded. |
Health Response |
Response
{ Resource: /App/System/HA HQI { Color: Yellow Value: 1 } StatusText: HA Degraded. } |
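
Each Health Response in this document is written in the same textual shape: a `Resource`, an `HQI` block with `Color` and `Value`, and a `StatusText`. The sketch below parses that notation with a regular expression. It handles only the rendering used in this document; the actual wire format of the EFA health API is not described here and may well differ, and `parse_health_response` is a hypothetical helper.

```python
import re

# Matches the textual notation used for health responses in this document.
HEALTH = re.compile(
    r"Resource:\s*(?P<resource>\S+)\s+HQI\s*\{\s*Color:\s*(?P<color>\w+)\s+"
    r"Value:\s*(?P<value>\d+)\s*\}\s*StatusText:\s*(?P<status>.*?)\s*\}\s*$",
    re.DOTALL,
)


def parse_health_response(text: str) -> dict:
    """Extract resource, color, numeric value, and status text."""
    match = HEALTH.search(text)
    if match is None:
        raise ValueError("not a recognizable health response")
    fields = match.groupdict()
    fields["value"] = int(fields["value"])
    return fields


health = parse_health_response(
    "Response { Resource: /App/System/HA HQI { Color: Yellow Value: 1 } "
    "StatusText: HA Degraded. }"
)
print(health["color"], health["value"], health["status"])  # Yellow 1 HA Degraded.
```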
31051 | HA Service (Fully Redundant) |
---|---|
Description | Send an alert when the standby is up and ready. This indicates to the user that the system is fully redundant. |
Preconditions |
A timer task will periodically check the status of the nodes and raise an event to the fault management system, which in turn will raise an Alert to indicate to the user that the system is fully redundant.
|
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<118>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Operational [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/HA” alertId=”31051” cause=”redundancyRestored” type=”operationalViolation” severity=”info”] BOMHA fully redundant |
Health Response |
Response
{ Resource: /App/System/HA HQI { Color: Green Value: 0 } StatusText: HA fully redundant. } |
31052 | HA Service (Failover Occurred) |
---|---|
Description | Send an alert when an HA failover has occurred. |
Preconditions |
A timer task will periodically check the status of the nodes and raise an event to the fault management system, which in turn will raise an Alert to indicate to the user that an HA failover has occurred.
|
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<116>1 2003-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Operational [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/System/HA/Nodes/Node” alertId=”31052” cause=”localNodeTransmissionError” type=”operationalViolation” severity=”warning”] [alertData@1916 activeIP=”10.1.2.3”] BOM10.1.2.3 is now the HA active node |
Health Response |
Response
{ Resource: /App/System/HA HQI { Color: Yellow Value: 1 } StatusText: <Active IP> is now the HA active node. } |
31501 | Managed Device Connectivity Loss |
---|---|
Description | Send an alert when EFA loses connectivity to an SLX device. |
Preconditions |
The polling is enabled only if
Example (User Configuration):
The polling sends the “DeviceConnectivityFailureAlert” event notification upon loss of contact. |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<114>1 2022-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/Component/Asset/Device?device_ip=10.10.10.1” alertId=”31501” cause=”connectionEstablishmentError” type=”communicationsAlarm” severity=”major”] [alertData@1916 deviceIP=”10.10.10.1” failedAdapters=”ssh rest netconf” failureReason=”Authentication failed”] BOMContact has been lost with device “10.10.10.1” |
Health Response |
Response
{ Resource: /App/Component/Asset/Device?device_ip=10.10.10.1 HQI { Color: Red Value: 3 } StatusText: Contact has been lost with device <Device IP>. } |
31502 | Managed Device Connectivity Reestablished |
---|---|
Description | Send an alert when EFA regains connectivity with the SLX device. |
Preconditions |
The polling will be enabled only if
Example (User Config):
The polling sends the “DeviceConnectivitySuccessAlert” event notification upon regaining contact. |
Requirements |
Alert Data:
Syslog RFC-5424 Example:
<118>1 2022-10-11T22:14:15.003Z efa.machine.com EFAFaultManager - Environmental [meta sequenceId=”47”] [origin ip=”10.20.30.40” enterpriseId=”1916” software=”EFA” swVersion=”3.1.0”] [alert@1916 resource=”/App/Component/Asset/Device?device_ip=10.10.10.1” alertId=”31502” cause=”connectionEstablished” type=”communicationsAlarm” severity=”info”] [alertData@1916 deviceIP=”10.10.10.1”] BOMContact has been regained with device “10.10.10.1”. |
Health Response |
Response
{ Resource: /App/Component/Asset/Device?device_ip=10.10.10.1 HQI { Color: Green Value: 0 } StatusText: Contact has been regained with device <Device IP>. } |
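
Several alerts in this catalog come in pairs that share a `resource`: one reports a problem and a later one reports recovery (31501 and 31502 for device connectivity, 31050 and 31051 for HA redundancy, 31040 or 31041 and 31042 for storage). A consumer could keep a table of currently active problems keyed by resource, as in the sketch below. The grouping into raising and clearing IDs is an interpretation of the descriptions above, not a documented EFA mechanism, and the names are illustrative.

```python
# Assumed grouping, based on the alert descriptions in this document.
RAISING = {"31040", "31041", "31050", "31501"}
CLEARING = {"31042", "31051", "31502"}


def update_active(active: dict[str, str], alert_id: str, resource: str) -> None:
    """Record a problem alert, or remove the entry when its resource recovers."""
    if alert_id in RAISING:
        active[resource] = alert_id
    elif alert_id in CLEARING:
        active.pop(resource, None)


active: dict[str, str] = {}
device = "/App/Component/Asset/Device?device_ip=10.10.10.1"
update_active(active, "31501", device)            # contact lost
update_active(active, "31050", "/App/System/HA")  # HA degraded
update_active(active, "31502", device)            # contact regained
print(active)  # {'/App/System/HA': '31050'}
```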