Overview of RASLog messages

Reliability, Availability, and Serviceability log (RASLog) messages record system events that are related to configuration changes or system error conditions. Messages are reported at various levels of severity, ranging from informational (INFO) to escalating error levels (WARNING, ERROR, and CRITICAL). SLX-OS maintains two separate internal message storage repositories, SYSTEM and DCE. The following table shows the message types that are stored in each repository. A RASLog message can have one or more type attributes. For example, a message can be of type DCE, FFDC, and AUDIT. A message cannot have both LOG and DCE type attributes.
Table 1. Message type matrix
Message type    DCE message repository    SYSTEM message repository
LOG             No                        Yes
DCE             Yes                       No
CFFDC           Yes                       Yes
FFDC            Yes                       Yes
AUDIT           Yes                       Yes
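
As an illustration of how Table 1 can be read, the following Python sketch looks up which repositories are compatible with a given set of type attributes. It is not part of SLX-OS. The function name candidate_repositories and the rule that a message must fit every one of its attributes (a set intersection) are assumptions made for this example. Only the table entries and the rule that LOG and DCE cannot be combined come from the text above.

```python
# Repositories in which each message type can be stored (transcribed from Table 1).
ALLOWED_REPOS = {
    "LOG": {"SYSTEM"},
    "DCE": {"DCE"},
    "CFFDC": {"DCE", "SYSTEM"},
    "FFDC": {"DCE", "SYSTEM"},
    "AUDIT": {"DCE", "SYSTEM"},
}


def candidate_repositories(type_attrs):
    """Return the repositories compatible with every attribute in type_attrs.

    Hypothetical helper: the intersection rule is an assumption of this
    sketch, not documented SLX-OS behavior.
    """
    attrs = set(type_attrs)
    if not attrs:
        raise ValueError("a message needs at least one type attribute")
    unknown = attrs - set(ALLOWED_REPOS)
    if unknown:
        raise ValueError(f"unknown type attribute(s): {sorted(unknown)}")
    if {"LOG", "DCE"} <= attrs:
        raise ValueError("a message cannot have both LOG and DCE attributes")
    repos = {"DCE", "SYSTEM"}
    for attr in attrs:
        repos &= ALLOWED_REPOS[attr]  # keep only repositories every attribute allows
    return repos


print(sorted(candidate_repositories({"DCE", "FFDC", "AUDIT"})))
```

Under these assumptions, a message with only the FFDC attribute is compatible with both repositories, so the sketch returns both rather than choosing one.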

RASLog message types

SLX-OS supports different types of RASLog messages. The following sections describe each message type in detail.

  1. System messages: System or LOG messages report significant system-level events or information and also show the status of high-level, user-initiated actions. System messages are stored in separate nonvolatile storage and are preserved across firmware upgrades and downgrades. The system messages are forwarded to the console, to the configured syslog servers, and through SNMP traps or informs to the SNMP management station.

    The following example shows a system message.

    2017/09/14-23:26:44, [FW-1424], 620, M1 | Active, WARNING, SLX9850-8, Switch status changed from HEALTHY to MARGINAL.
    

    For information on displaying and clearing the system messages, refer to Viewing and clearing the RASLog messages.

  2. DCE RASLog messages: DCE RASLog messages report error-related events and information in the protocol-based modules, such as the network service module (NSM) and the system services manager (SSM). DCE messages are stored in separate nonvolatile storage and are preserved across firmware upgrades. The DCE messages are forwarded to the console, to the configured syslog servers, and through SNMP traps or informs to the SNMP management station.

    The following example shows a DCE message.

    2017/09/14-23:26:25, [NSM-1004], 617, M1 | Active | DCE, INFO, SLX9850-8,  Vlan 1 is created.
    

    For information on displaying and clearing the DCE RASLog messages, refer to Viewing and clearing the RASLog messages.

  3. AUDIT log messages: Event auditing is designed to support post-event audits and problem determination that are based on high-frequency events of certain types, such as security violations, firmware downloads, and configuration changes. AUDIT log messages are saved in persistent storage. The storage has a limit of 1024 entries and wraps around if the number of messages exceeds the limit. The switch can be configured to stream AUDIT messages to the specified syslog servers. The AUDIT log messages are not forwarded to an SNMP management station.

    The following example shows an AUDIT log message.

    891 AUDIT, 2017/09/14-23:30:29 (GMT), [SEC-3024], INFO, SECURITY, NONE/root/NONE/None/CLI,, SLX9850-8, Event: passwd, Status: success, Info: User account [user], password changed.
    

    For any given event, AUDIT messages capture the following information:

    • User Name: The name of the user who triggered the action.

    • User Role: The access level of the user, such as root or admin.

    • Event Name: The name of the event that occurred.

    • Status: The status of the event that occurred: success or failure.

    • Event Info: Information about the event.

    The following table describes the three event classes that can be audited.

    Table 2. Event classes of the AUDIT messages
    Event class    Operand          Description
    DCMCFG         CONFIGURATION    You can audit all the configuration changes in the OS.
    FIRMWARE       FIRMWARE         You can audit the events occurring during the firmware download process.
    SECURITY       SECURITY         You can audit any user-initiated security event for all management interfaces. For events that have an impact on the entire network, an audit is generated only for the switch from which the event was initiated.

    You can enable event auditing by configuring the syslog daemon to send the events to a configured remote host by using the logging syslog-server command. You can set up filters to screen out particular classes of events by using the logging auditlog class command (the classes include SECURITY, CONFIGURATION, and FIRMWARE). All the AUDIT classes are enabled by default. The defined set of AUDIT messages is sent to the configured remote host in the AUDIT message format, so that the messages are easily distinguishable from other syslog events that may occur in the network. For details on how to configure event auditing, refer to Viewing, clearing, and configuring AUDIT log messages.

  4. FFDC messages: First Failure Data Capture (FFDC) captures failure-specific data when a problem or failure is first noted, before the switch reloads or the trace and log buffers wrap. All subsequent occurrences of the same error are ignored. This critical debug information is saved in nonvolatile storage and can be retrieved by entering the copy support command. The data is used for debugging purposes and is intended for use by Extreme technical support. FFDC is enabled by default. Enter the support command to enable or disable FFDC. If FFDC is disabled, the FFDC daemon does not capture any data, even when a message with FFDC attributes is logged.

    The following example shows an FFDC message.

    2017/09/14-23:28:18, [HASM-1200], 666, L1/0 | Active | FFDC, WARNING, SLX9850-8, Detected termination of process hslagtd:2915.
    

    You can display the FFDC messages by using the show logging raslog attribute FFDC command. For more information, refer to Viewing and clearing the RASLog messages.

  5. CFFDC messages: Chassis-wide FFDC (CFFDC) is used to capture FFDC data for every management module (MM) or line card (LC) in the entire chassis for failure analysis. This debug information is saved in nonvolatile storage and can be retrieved by entering the copy support command. If FFDC is disabled, the CFFDC data is not captured even when a message with the CFFDC attribute is logged.

    The following example shows a CFFDC message.

    2017/10/14-10:36:51, [EM-1100], 28749, M2 | Active | CFFDC, CRITICAL, SLX9850-8, Unit in L3 with ID 127 is faulted(119). 1 of 1 total attempt(s) at auto-recovery is being made. Delay is 60 seconds.
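
The system, DCE, FFDC, and CFFDC examples above share a comma-separated layout. The following Python sketch splits such a line into its parts. The field names (timestamp, msg_id, seq, flags, severity, switch_name, and text) are labels chosen for this example and are inferred from the sample messages rather than taken from a format specification. The pattern has been checked only against the samples shown in this section. AUDIT log messages use a different layout and are not handled.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional, Tuple


class RasLogMessage(NamedTuple):
    timestamp: datetime
    msg_id: str             # for example "NSM-1004"
    seq: int                # number that follows the message ID
    flags: Tuple[str, ...]  # for example ("M1", "Active", "DCE")
    severity: str
    switch_name: str
    text: str


# Assumed layout, inferred from the samples:
# <timestamp>, [<id>], <number>, <flags>, <severity>, <switch name>, <text>
_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}),\s*"
    r"\[(?P<msg_id>[A-Z0-9_]+-\d+)\],\s*"
    r"(?P<seq>\d+),\s*"
    r"(?P<flags>[^,]*),\s*"
    r"(?P<severity>INFO|WARNING|ERROR|CRITICAL),\s*"
    r"(?P<switch_name>[^,]+),\s*"
    r"(?P<text>.*)$"
)


def parse_raslog(line: str) -> Optional[RasLogMessage]:
    """Parse one RASLog line; return None if the line does not match."""
    m = _PATTERN.match(line.strip())
    if m is None:
        return None
    return RasLogMessage(
        timestamp=datetime.strptime(m["timestamp"], "%Y/%m/%d-%H:%M:%S"),
        msg_id=m["msg_id"],
        seq=int(m["seq"]),
        flags=tuple(part.strip() for part in m["flags"].split("|")),
        severity=m["severity"],
        switch_name=m["switch_name"].strip(),
        text=m["text"],
    )


sample = ("2017/09/14-23:26:25, [NSM-1004], 617, M1 | Active | DCE, "
          "INFO, SLX9850-8,  Vlan 1 is created.")
msg = parse_raslog(sample)
print(msg.msg_id, msg.flags, msg.severity, msg.text)
```

Returning None for lines that do not match lets a caller skip unrelated console output instead of stopping on the first unexpected line.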

Message severity levels

Messages have four levels of severity, ranging from CRITICAL to INFO. The definitions are broad and are intended as general guidelines for troubleshooting. In all cases, review the description of the specific error message thoroughly before taking action. The following table lists the RASLog message severity levels.

Table 3. Severity levels of the RASLog messages
Severity level    Description
CRITICAL          A CRITICAL message indicates that the software has detected serious problems that cause a partial or complete failure of a subsystem if not corrected immediately; for example, a power supply failure or rise in temperature must receive immediate attention.
ERROR             An ERROR message represents an error condition that does not affect overall system functionality significantly. For example, an ERROR message may indicate a timeout on a certain operation, a failure of a certain operation after a retry, an invalid parameter, or a failure to perform a requested operation.
WARNING           A WARNING message highlights a current operating condition that must be checked or it may lead to a failure in the future. For example, a power supply failure in a redundant system relays a warning that the system is no longer operating in redundant mode unless the failed power supply is replaced or fixed.
INFO              An INFO message reports the current nonerror status of the system components; for example, detecting online and offline status of an interface.
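
When you filter or sort collected messages by severity, it can help to treat the four levels as an ordered scale. The following Python sketch shows one way to do so. The numeric values and the helper name at_least are arbitrary choices for this example. Only the ordering from INFO up to CRITICAL comes from the text above.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Values are arbitrary; only the relative order matters.
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


def at_least(messages, minimum):
    """Keep (severity_name, text) pairs whose severity is >= minimum."""
    return [
        (name, text)
        for name, text in messages
        if Severity[name] >= minimum
    ]


collected = [
    ("INFO", "Vlan 1 is created."),
    ("WARNING", "Switch status changed from HEALTHY to MARGINAL."),
    ("CRITICAL", "Unit in L3 with ID 127 is faulted(119)."),
]
for name, text in at_least(collected, Severity.WARNING):
    print(name, text)
```

Using an IntEnum keeps comparisons such as Severity.ERROR >= Severity.WARNING readable and raises a KeyError for a severity name that is not one of the four levels.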

RASLog message logging

The RASLog service generates and stores messages that are related to abnormal or erroneous system behavior. It includes the following features:

Extreme recommends that you configure the system logging daemon (syslogd) facility as a management tool for error logs. For more information, refer to Configuring the syslog message destinations.