configure sys-recovery-level slot

configure sys-recovery-level slot [all | slot_number] [none | reset | shutdown]

Description

Configures a recovery option for instances where an exception occurs on the specified MSM/MM or I/O module.

Syntax Description

all Specifies all MSM/MM and I/O module slots.
slot_number Specifies the slot of the MSM/MM or I/O module.
  • A and B—Indicate an MSM/MM
  • 1 through 10—Indicate an I/O module
none Configures the MSM/MM or I/O module to maintain its current state regardless of the detected hardware fault. The offending MSM/MM or I/O module is not reset. For more information about the states of an MSM/MM or I/O module, see the show slot command.
reset Configures the offending MSM/MM or I/O module to reset upon a hardware fault detection. For more detailed information, see the Usage Guidelines.
shutdown Configures the switch to shut down all slots/modules configured for shutdown upon fault detection. On the modules configured for shutdown, all ports in the slot are taken offline in response to the reported errors; however, the MSMs/MMs remain operational for debugging purposes only. ExtremeXOS logs fault, error, system reset, system reboot, and system shutdown messages to the syslog.
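
For scripted configuration, the syntax above can be assembled and validated before it is sent to the switch. The following Python sketch is illustrative only: build_recovery_command and its constants are hypothetical names, not part of ExtremeXOS, and the accepted slot values (all, A, B, and 1 through 10) are taken from the syntax description above.

```python
# Hypothetical helper, not part of ExtremeXOS: builds the command string
# from the syntax "[all | slot_number] [none | reset | shutdown]".
VALID_ACTIONS = ("none", "reset", "shutdown")
MSM_SLOTS = ("A", "B")      # A and B indicate an MSM/MM
IO_SLOTS = range(1, 11)     # 1 through 10 indicate an I/O module


def build_recovery_command(slot, action):
    """Return a 'configure sys-recovery-level slot' command line."""
    slot_text = str(slot).strip()
    if slot_text.lower() == "all":
        slot_text = "all"
    elif slot_text.upper() in MSM_SLOTS:
        slot_text = slot_text.upper()
    elif slot_text.isdigit() and int(slot_text) in IO_SLOTS:
        slot_text = str(int(slot_text))
    else:
        raise ValueError(f"unknown slot: {slot!r}")
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown recovery action: {action!r}")
    return f"configure sys-recovery-level slot {slot_text} {action}"


print(build_recovery_command(2, "shutdown"))
```

Rejecting unknown slots and actions in the script keeps a typing mistake from reaching the switch, where none and shutdown also require an interactive confirmation.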

Default

The default setting is reset.

Usage Guidelines

Use this command for system auto-recovery upon detection of hardware problems. If the switch detects a faulty MSM/MM or I/O module, you can configure the MSMs/MMs or I/O modules installed in a modular switch to take no action, automatically reset, shut down, or, if dual MSMs/MMs are installed, fail over to the other MSM/MM. This enhanced level of recovery detects faults in the ASICs as well as in the packet buses.

You must specify one of the following parameters for the system to respond to MSM/MM or I/O module failures:

  • none—Configures the MSM/MM or I/O module to maintain its current state regardless of the detected fault. The offending MSM/MM or I/O module is not reset. ExtremeXOS logs fault and error messages to the syslog and notifies you that the errors are ignored. This does not guarantee that the module remains operational; however, the switch does not reboot the module.
  • reset—Configures the offending MSM/MM or I/O module to reset upon fault detection. ExtremeXOS logs fault, error, system reset, and system reboot messages to the syslog.
  • shutdown—Configures the switch to shut down all slots/modules configured for shutdown upon fault detection. On the modules configured for shutdown, all ports in the slot are taken offline in response to the reported errors; however, the MSMs/MMs remain operational for debugging purposes only. You must save the configuration, using the save configuration command, for it to take effect. ExtremeXOS logs fault, error, system reset, system reboot, and system shutdown messages to the syslog.

Depending on your configuration, the switch resets the offending MSM/MM or I/O module if fault detection occurs. An offending MSM/MM is reset any number of times, and the MSM/MM is not permanently taken offline. On the BlackDiamond 8800 series switches, an offending I/O module is reset a maximum of five times. After the maximum number of resets, the I/O module is permanently taken offline.
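
The reset limits described above can be modeled as a small decision function. This Python sketch is one reading of the text, not ExtremeXOS code: the function name and the labels "msm" and "io" are hypothetical, and the five-reset limit is the one stated above for I/O modules on BlackDiamond 8800 series switches.

```python
# Illustrative model of the 'reset' recovery level; names are hypothetical.
IO_MODULE_RESET_LIMIT = 5  # an offending I/O module is reset at most five times


def action_on_fault(module_kind, resets_so_far):
    """Return what happens on the next fault when the recovery level is reset.

    module_kind: "msm" for an MSM/MM, "io" for an I/O module.
    resets_so_far: number of resets already performed for this module.
    """
    if module_kind == "msm":
        # An MSM/MM is reset any number of times and never taken offline.
        return "reset"
    if module_kind == "io":
        if resets_so_far < IO_MODULE_RESET_LIMIT:
            return "reset"
        # After the maximum number of resets the module stays offline.
        return "offline"
    raise ValueError(f"unknown module kind: {module_kind!r}")


print(action_on_fault("io", 4))
print(action_on_fault("io", 5))
```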

Messages Displayed

If you configure the hardware recovery setting to either none (ignore) or shutdown, the switch prompts you to confirm this action. The following is a sample shutdown message:

 Are you sure you want to shutdown on errors? (y/n) 

Enter y to confirm this action and configure the hardware recovery level. Enter n or press [Enter] to cancel this action.

Taking Ports Offline

Beginning with ExtremeXOS 11.5, you can configure the switch to shut down one or more modules upon fault detection by specifying the shutdown option. If you configure one or more slots to shut down and the switch detects a hardware fault, all ports in every slot configured for shutdown are taken offline in response to the reported errors. (MSMs are available for debugging purposes only.)

The affected module remains in the shutdown state across additional reboots or power cycles until you explicitly clear the shutdown state. If a module enters the shutdown state, the module actually reboots and the show slot command displays the state of the slot as Initialized; however, the ports are shut down and taken offline. For more information about clearing the shutdown state, see the clear sys-recovery-level command.

Module Recovery Actions--BlackDiamond 8800 Series Switches Only

The following table describes the actions module recovery takes based on your module recovery setting. For example, if you configure a module recovery setting of reset for an I/O module, the module is reset a maximum of five times before it is taken permanently offline.

From left to right, the columns display the following information:
  • Module Recovery Setting—This is the parameter used by the configure sys-recovery-level slot command to distinguish the module recovery behavior.
  • Hardware—This indicates the hardware that you may have in your switch.
  • Action Taken—This describes the action the hardware takes based on the module recovery setting.

Module Recovery Actions for the BlackDiamond X8 Series Switches and BlackDiamond 8800 Series Switches

Module Recovery Setting  Hardware    Action Taken
none                     Single MSM  The MSM remains powered on in its current state. This does not guarantee that the module remains operational; however, the switch does not reboot the module.
none                     Dual MSM    The MSM remains powered on in its current state. This does not guarantee that the module remains operational; however, the switch does not reboot the module.
none                     I/O Module  The I/O module remains powered on in its current state. The switch sends error messages to the log and notifies you that the errors are ignored. This does not guarantee that the module remains operational; however, the switch does not reboot the module.
reset                    Single MSM  Resets the MSM.
reset                    Dual MSM    Resets the primary MSM and fails over to the backup MSM.
reset                    I/O Module  Resets the I/O module a maximum of five times. After the fifth time, the I/O module is permanently taken offline.
shutdown                 Single MSM  The MSM is available for debugging purposes only (the I/O ports also go down); however, you must clear the shutdown state using the clear sys-recovery-level command for the MSM to become operational. After you clear the shutdown state, you must reboot the switch. For more information, see the clear sys-recovery-level command.
shutdown                 Dual MSM    The MSM is available for debugging purposes only (the I/O ports also go down); however, you must clear the shutdown state using the clear sys-recovery-level command for the MSM to become operational. After you clear the shutdown state, you must reboot the switch. For more information, see the clear sys-recovery-level command.
shutdown                 I/O Module  Reboots the I/O module. When the module comes up, the ports remain inactive because you must clear the shutdown state using the clear sys-recovery-level command for the I/O module to become operational. After you clear the shutdown state, you must reset each affected I/O module or reboot the switch. For more information, see the clear sys-recovery-level command.

Displaying the Module Recovery Setting

To display the module recovery setting, use the show slot command.

Beginning with ExtremeXOS 11.5, the show slot output includes the shutdown configuration. If you configure the module recovery setting to shutdown, the output displays an "E" flag, which indicates that any errors detected on the slot disable all ports on the slot. The "E" flag appears only if you configure the module recovery setting to shutdown.

Note

If you configure one or more slots for shutdown and the switch detects a hardware fault on one of those slots, all of the configured slots enter the shutdown state and remain in that state until explicitly cleared.

If you configure the module recovery setting to none, the output displays an "e" flag that indicates no corrective actions will occur for the specified MSM/MM or I/O module. The "e" flag appears only if you configure the module recovery setting to none.

The following sample output displays the module recovery action. In this example, notice the flags identified for slot 2:

Slots    Type                 Configured           State       Ports  Flags
-------------------------------------------------------------------------------
Slot-1   8900-G96T-c          8900-G96T-c          Operational   96   MB
Slot-2   8900-10G24X-c        8900-10G24X-c        Operational   24   MB   E
Slot-3   8900-40G6X-xm        8900-40G6X-xm        Operational   24   MB
Slot-4   G48Xc                G48Xc                Operational   48   MB
Slot-5   G8Xc                 G8Xc                 Operational    8   MB
Slot-6                                             Empty          0
Slot-7   G48Te2(PoE)          G48Te2(PoE)          Operational   48   MB
Slot-8   G48Tc                G48Tc                Operational   48   MB
Slot-9   10G4Xc               10G4Xc               Operational    4   MB
Slot-10                                            Empty          0
MSM-A    8900-MSM128                               Operational    0
MSM-B    8900-MSM128                               Operational    0
Flags : M - Backplane link to Master is Active
        B - Backplane link to Backup is also Active
        D - Slot Disabled
        I - Insufficient Power (refer to "show power budget")
        e - Errors on slot will be ignored (no corrective action initiated)
        E - Errors on slot will disable all ports on slot
Note

In ExtremeXOS 11.4 and earlier, if you configure the module recovery setting to none, the output displays an "E" flag that indicates no corrective actions will occur for the specified MSM or I/O module. The "E" flag appears only if you configure the module recovery setting to none.
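
When the show slot output is collected by a script, the "e" and "E" flags can be mapped back to the recovery setting. The Python sketch below is a rough illustration, not a supported parser. It assumes the column layout of the sample output above, the ExtremeXOS 11.5 or later meaning of the flags, row labels that begin with Slot- or MSM-, and that a row with neither flag uses the default setting of reset. The function name recovery_settings is hypothetical.

```python
def recovery_settings(show_slot_output):
    """Map each slot name to its inferred module recovery setting."""
    settings = {}
    for line in show_slot_output.splitlines():
        tokens = line.split()
        # Skip the header, the separator, and the flag legend.
        if not tokens or not tokens[0].startswith(("Slot-", "MSM-")):
            continue
        # The Ports column is the first all-digit token after the slot name;
        # any tokens that follow it are flags.
        port_index = next(
            (i for i, token in enumerate(tokens) if i > 0 and token.isdigit()),
            None,
        )
        if port_index is None:
            continue
        flags = "".join(tokens[port_index + 1:])
        if "E" in flags:
            settings[tokens[0]] = "shutdown"
        elif "e" in flags:
            settings[tokens[0]] = "none"
        else:
            settings[tokens[0]] = "reset"  # assumed: no flag means the default
    return settings
```

Applied to the sample output above, this reports shutdown for Slot-2 and reset for every other row. On ExtremeXOS 11.4 and earlier the "E" flag means none, so the mapping would need to be adjusted.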

Displaying Detailed Module Recovery Information

To display detailed information about a specific slot, including its module recovery setting, use the following command: show slot slot

In addition to the information displayed with show slot, this command displays the module recovery setting configured on the slot. The following truncated output displays the module recovery setting (displayed as Recovery Mode) for the specified slot:

Slot-2 information:
State:               Operational
Download %:          100
Flags:               MB   E
Restart count:       0 (limit 5)
Serial number:       800264-00-01 0907G-00166
Hw Module Type:      8900-10G24X-c
SW Version:          15.2.0.26
SW Build:            v1520b26
Configured Type:     8900-10G24X-c
Ports available:     24
Recovery Mode:       Shutdown
Debug Data:          Peer=Operational
Flags : M - Backplane link to Master is Active
        B - Backplane link to Backup is also Active
        D - Slot Disabled
        I - Insufficient Power (refer to "show power budget")
        e - Errors on slot will be ignored (no corrective action initiated)
        E - Errors on slot will disable all ports on slot
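
The Recovery Mode and Restart count fields in the detailed output can be extracted in a similar way. The following Python sketch assumes the "Name: value" layout and the "0 (limit 5)" restart count format shown in the truncated sample above; other releases may format these fields differently. The function name parse_slot_detail and the result keys are hypothetical.

```python
import re


def parse_slot_detail(text):
    """Extract the recovery mode and restart counters from 'show slot <slot>'."""
    fields = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        # Keep only "Name: value" lines that carry a value.
        if separator and value.strip():
            fields[name.strip()] = value.strip()

    result = {"recovery_mode": fields.get("Recovery Mode")}
    # Assumed format, taken from the sample: "<count> (limit <limit>)".
    match = re.match(r"(\d+) \(limit (\d+)\)", fields.get("Restart count", ""))
    if match:
        result["restart_count"] = int(match.group(1))
        result["restart_limit"] = int(match.group(2))
    return result
```

For the Slot-2 sample above, the result holds a recovery mode of Shutdown, a restart count of 0, and a restart limit of 5.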

Troubleshooting Module Failures

If you experience an I/O module failure, use the following troubleshooting methods when you can bring the switch offline to solve or learn more about the problem:
  • Restarting the I/O module—Use the disable slot slot command followed by the enable slot slot command to restart the offending I/O module. Issuing these commands resets the I/O module and its associated fail counter. If the module does not restart, or you continue to experience I/O module failure, please contact Extreme Networks Technical Support.
  • Running diagnostics—Use the run diagnostics normal slot command to run operational diagnostics on the offending I/O module to ensure that you are not experiencing a hardware issue. If the module continues to enter the failed state, please contact Extreme Networks Technical Support.

If you experience an MSM/MM failure, please contact Extreme Networks Technical Support.

Example

The following example configures a switch to not take an action if a hardware fault occurs:

configure sys-recovery-level slot all none

History

This command was first available in ExtremeXOS 11.3.

The shutdown parameter was added in ExtremeXOS 11.5.

Platform Availability

This command is available on all platforms.