Open Defects

The following defects are open in ExtremeCloud Orchestrator 3.3.1.

Parent Defect ID: XCO-3445 Issue ID: XCO-3445
Product: XCO Reported in Release: EFA 3.0.0
Symptom: DRC does not identify the drift and therefore does not reconcile the drifted configuration.
Condition:

Below are the steps to reproduce the issue:

1. Configure multi rack Non-CLOS fabric.

2. Manually remove the following set of configurations on the device under router-bgp:

no neighbor 172.x.x.x password xxxx

no neighbor 172.x.x.x update-source loopback 1

no neighbor 172.x.x.x peer-group overlay-ebgp-group

address-family l2vpn evpn

no retain route-target all

3. Execute "efa inventory drift-reconcile execute --ip <device-ip>"

Recovery: Manually reconfigure the removed configurations on the device.
Parent Defect ID: XCO-3471 Issue ID: XCO-3471
Product: XCO Reported in Release: EFA 3.1.0
Symptom: A stale BGP peer-group entry is configured under router BGP on SLX border leaf and spine devices, with none of the BGP neighbors linked to the peer group.
Condition:

1. Create a 3-stage CLOS fabric, add devices with MCT leaf, spine, and border-leaf and configure the fabric

2. Convert the 3-stage CLOS fabric to a 5-stage CLOS fabric using the fabric migrate command

"efa fabric migrate --type "3-to-5-stage" --source-fabric <source-fabric> --destination-3-stage-leaf-spine-pod <pod-name> --destination-3-stage-border-leaf-pod <pod-name>"

3. Add super-spine POD devices to the migrated 5-stage CLOS fabric

4. Disconnect the BorderLeaf to Spine links and reconnect the BorderLeaf to Super-Spine links

5. Configure the migrated 5-stage CLOS fabric

Recovery: Manually delete the stale BGP peer-groups from both the Border Leaf and Spine devices
Parent Defect ID: XCO-4127 Issue ID: XCO-4127
Product: XCO Reported in Release: EFA 3.0.0
Symptom: Ports are not listed in the port-channel creation for SLX NPB devices
Condition: The ports are not listed in the port-channel creation even though they are not used in any other configuration. For these ports, speed is set to auto-negotiation and no cable is connected.
Workaround: For breakout ports, make sure that cables are connected so that the port speed is updated.
Recovery: N/A
Parent Defect ID: XCO-4128 Issue ID: XCO-4128
Product: XCO Reported in Release: EFA 3.0.0
Symptom: Partial port-channel configuration is present on the device for SLX NPB devices.
Condition: When port-channel configuration from the UI fails, the partial configuration remains on the device.
Workaround: Make sure that all the configuration information is correctly populated in the UI so that the configuration does not fail on the device.
Recovery: Log in to the SLX CLI, delete the given port channel, and select refresh configuration from the device action list in the XCO UI.
Parent Defect ID: XCO-4129 Issue ID: XCO-4129
Product: XCO Reported in Release: EFA 3.0.0
Symptom: Disabling the vn-tag header strip and enabling the 802.1BR header strip fails from the XCO GUI for SLX NPB.
Condition: When the vn-tag header strip is enabled on an interface, disabling the vn-tag header strip and enabling the 802.1BR header strip in a single operation fails from the XCO GUI.
Workaround: Disable the vn-tag header strip in the first operation (save the port update), and then edit the port again to enable the 802.1BR header strip option.
Recovery: N/A
Parent Defect ID: XCO-4146 Issue ID: XCO-4146
Product: XCO Reported in Release: EFA 2.7.2
Symptom: The fabric devices remain in the cfg-refresh-err state after the TPVM failover.
Condition:

1. Fabric devices are already in the cfg-refresh-err state due to an LLDP Link Down (LD) event.

2. Bring up the LLDP links responsible for the fabric devices being in the cfg-refresh-err state.

3. Execute the TPVM failover using the 'tpvm stop' and 'tpvm start' commands during the LLDP Link Up (LA) event handling caused by step 2.

Recovery:

1. Trigger LD/LA events by flapping the interface links of the devices that remain in the cfg-refreshed state, where DRC does not recover the device to the cfg-sync state and the pending reason is "LA/LD".

1.1. Run "shutdown" on the physical interface link on the devices, followed by "efa inventory device update --ip <device-ip>", which generates LD events.

1.2. Run "no shutdown" on the physical interface link on the devices, followed by "efa inventory device update --ip <device-ip>", which generates LA events.

1.3. If the pending config contains "LA": Execute "efa inventory drift-reconcile execute --ip <device-ip> --reconcile" on the devices that are in the cfg-refresh-err/cfg-refreshed state. [OR] If the pending config contains "LD,LA": Execute "efa fabric configure --name <fabric-name>" to clean up the configuration on the devices that are in the cfg-refresh-err/cfg-refreshed state.

[OR]

2. Reboot, without maintenance mode, the devices that remain in the cfg-refreshed state where DRC does not recover the device to the cfg-sync state.

2.1. Run "reload" on the switches without maintenance mode enabled.

2.2. Execute "efa inventory drift-reconcile execute --ip <device-ip> --reconcile" on the devices which are in cfg-refresh-err /cfg-refreshed state.
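
The choice between the two commands in step 1.3 can be sketched as follows. This is a minimal Python sketch and not part of XCO: the helper name `recovery_command` is hypothetical, and it assumes that the pending reasons arrive as a comma- or slash-separated string such as "LA" or "LD,LA". The command strings are the ones listed above; the IP address and fabric name are placeholders.

```python
import re


def recovery_command(pending_config: str, device_ip: str, fabric_name: str) -> str:
    """Return the recovery command for the given pending reasons (hypothetical helper)."""
    # Assumption: reasons are separated by "," or "/" (for example "LD,LA" or "LA/LD").
    reasons = {r.strip().upper() for r in re.split(r"[,/]", pending_config) if r.strip()}
    if reasons == {"LA"}:
        # Only link-up events are pending: reconcile the device with DRC.
        return f"efa inventory drift-reconcile execute --ip {device_ip} --reconcile"
    if reasons == {"LD", "LA"}:
        # Both link-down and link-up events are pending: reconfigure the fabric.
        return f"efa fabric configure --name {fabric_name}"
    raise ValueError(f"no recovery step listed for pending reasons: {pending_config!r}")


# Placeholder values for illustration only.
print(recovery_command("LD,LA", "10.0.0.1", "fabric1"))
```

Any other combination of reasons raises an error in this sketch, because the recovery steps above only cover the "LA" and "LD,LA" cases.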

Parent Defect ID: XCO-5263 Issue ID: XCO-5263
Product: XCO Reported in Release: EFA 3.1.0
Symptom: XCO fails to report that telemetry is not being streamed from the device.
Condition: An SLX device is discovered from XCO with statistics streaming through the telemetry service, but the device is not sending statistics through the telemetry service.
Workaround: Select the individual device and verify that statistics are streaming from the device.
Parent Defect ID: XCO-6172 Issue ID: XCO-6172
Product: XCO Reported in Release: XCO 3.2.0
Symptom: SLX currently doesn't support configuring both IPv4 and IPv6 DNS together. When both IPv4 and IPv6 DNS are configured during TPVM deployment, only one trusted peer config takes effect.
Workaround: It is recommended to use IPv4 DNS for XCO deployment.
Parent Defect ID: XCO-6189 Issue ID: XCO-6189
Product: XCO Reported in Release: XCO 3.2.0
Symptom: SLX currently doesn't support configuring both IPv4 and IPv6 trusted peers together. When both IPv4 and IPv6 trusted peers are configured after TPVM deployment, only one trusted peer config takes effect.
Workaround: It is recommended to use an IPv4 trusted peer for XCO deployment.
Parent Defect ID: XCO-6360 Issue ID: XCO-6360
Product: XCO Reported in Release: XCO 3.1.1
Symptom: A few important system logs are not seen in the XCO UI.
Condition: A device is discovered from XCO and some of the cards are removed from or inserted in the device.
Parent Defect ID: XCO-7100 Issue ID: XCO-7100
Product: XCO Reported in Release: XCO 3.2.0
Symptom: "Target TPVM Version" does not display until the new TPVM is installed.
Condition: During the TPVM upgrade, "Target TPVM Version" gets updated late in the workflow.
Recovery: The correct Target TPVM version gets updated after the new version is installed.
Parent Defect ID: XCO-7183 Issue ID: XCO-7183
Product: XCO Reported in Release: EFA 3.0.0
Symptom: After changing DNS nameservers in /etc/netplan and running the update-dns.sh --dns-action allow, the following error is seen:

(efa:ubuntu)ubuntu@efa:/opt/efa$ sudo /opt/efa/update-dns.sh

/opt/efa/update-dns.sh Usage:

--help - Show this message

--dns-action <'allow'|'disallow'> - Allow host DNS entries to be forwarded to the pods

(efa:ubuntu)ubuntu@efa:/opt/efa$ sudo /opt/efa/update-dns.sh --dns-action allow

Unexpected nameserver entry of 127.0.0.53 found in /etc/resolve.conf

(efa:ubuntu)ubuntu@efa:/opt/efa$

Condition: In Ubuntu 18.04.6 and 20.04, a stub resolver file is located at /run/systemd/resolve/stub-resolv.conf, and /etc/resolv.conf is a symlink to this file.

There is another file, /run/systemd/resolve/resolv.conf, which contains the DNS information from netplan.

Additionally, systemd-resolved provides a local DNS stub listener on IP address 127.0.0.53 on the local loopback interface. Programs that issue DNS requests directly, bypassing any local API, may be directed to this stub in order to connect them to systemd-resolved.
Note: The best practice is for local programs to use the glibc NSS or bus APIs instead, as various network resolution concepts (such as link-local addressing, or LLMNR Unicode domains) cannot be mapped to the unicast DNS protocol.

The update-dns.sh script does not recognize the 127.0.0.53 address as valid.

Workaround: If updating DNS to allow host entries to be forwarded to the pods using the update-dns.sh script in XCO-3.3.0 on Ubuntu 20.04 or 18.04.6 or above, follow these steps.
After netplan is applied and before running update-dns.sh:
  1. Check whether the symlink exists. If it does not, directly edit /etc/resolv.conf to use the netplan DNS IP:

    $ ls -l /etc/resolv.conf

    lrwxrwxrwx 1 root root 39 Feb 20 2021 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf <<<symlink exists

  2. Check whether the following files contain the 127.0.0.53 IP address:

    $ cat /etc/resolv.conf | grep nameserver

    nameserver 127.0.0.53

    $ cat /run/systemd/resolve/stub-resolv.conf | grep nameserver

    nameserver 127.0.0.53

  3. Edit the following file to add the netplan DNS IP as the nameserver and remove 127.0.0.53:

    sudo vi /run/systemd/resolve/stub-resolv.conf

  4. Check that both files are updated:

    $ cat /run/systemd/resolve/stub-resolv.conf | grep nameserver

    nameserver 10.10.10.0

    $ cat /etc/resolv.conf | grep nameserver

    nameserver 10.10.10.0

  5. Run as root ./update-dns.sh --dns-action allow
  6. Run sudo netplan apply to restore /etc/resolv.conf and /run/systemd/resolve/stub-resolv.conf to their default value of 127.0.0.53
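
The checks in steps 2 and 4 can also be scripted. The following is a minimal Python sketch, not part of update-dns.sh: the function names `nameservers` and `has_stub_resolver` are hypothetical, and the sketch only inspects text in the resolv.conf format shown above.

```python
from typing import List

STUB_RESOLVER = "127.0.0.53"  # systemd-resolved local stub listener address


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def has_stub_resolver(resolv_conf_text: str) -> bool:
    """True if the stub listener address is still listed as a nameserver."""
    return STUB_RESOLVER in nameservers(resolv_conf_text)


# Sample contents for illustration only.
before = "# managed by systemd-resolved\nnameserver 127.0.0.53\noptions edns0\n"
after = "nameserver 10.10.10.0\n"
print(has_stub_resolver(before), has_stub_resolver(after))
```

To apply this to a real system, pass the contents of /etc/resolv.conf and /run/systemd/resolve/stub-resolv.conf to `has_stub_resolver` before running step 5.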
Parent Defect ID: XCO-7426 Issue ID: XCO-7426
Product: XCO Reported in Release: EFA 2.7.0
Symptom: While performing `efa show-running-config`, the application was not able to process inventory data from the Device table.
Condition: An old entry with an invalid IP_address is present in the DB, which causes the issue.
Workaround: The only workaround is to remove this entry from the dcapp_asset.device table.
Recovery:

Ensure that old devices are properly removed from inventory.

No old device entries should exist in inventory.

No invalid device entries should exist in the device DB.
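
Finding entries with an invalid IP address can be sketched as follows. This is a minimal Python sketch that assumes the device rows have already been exported into dictionaries; the key names `id` and `ip_address` and the function name `invalid_ip_entries` are hypothetical and do not reflect the actual dcapp_asset.device schema.

```python
import ipaddress
from typing import Dict, List


def invalid_ip_entries(devices: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the device rows whose ip_address is not a valid IPv4/IPv6 address."""
    bad = []
    for dev in devices:
        try:
            # ip_address() raises ValueError for anything that is not a valid address.
            ipaddress.ip_address(dev["ip_address"])
        except ValueError:
            bad.append(dev)
    return bad


# Sample rows for illustration only (hypothetical column names).
rows = [
    {"id": "1", "ip_address": "10.20.30.40"},
    {"id": "2", "ip_address": "10.20.30.400"},
    {"id": "3", "ip_address": ""},
]
print([row["id"] for row in invalid_ip_entries(rows)])
```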

Parent Defect ID: XCO-7899 Issue ID: XCO-7899
Product: XCO Reported in Release: XCO 3.3.0
Symptom: BGP peer delete with MP-BGP support enabled for additional path advertise fails with netconf error - '%Error: 'additional-paths advertise' is configured, cannot remove 'additional-paths select' command'.
Condition: If the MP-BGP neighbor is associated with additional path select, then the deletion of the BGP neighbor fails with the following netconf error - '%Error: 'additional-paths advertise' is configured, cannot remove 'additional-paths select' command'.
Workaround: There is no workaround for this issue.
Recovery: Execute the peer delete command again and it gets deleted on the second attempt.
Parent Defect ID: XCO-7955 Issue ID: XCO-7955
Product: XCO Reported in Release: XCO 3.3.0
Symptom: When triggering the "Firmware Activate" process, it can lead to either parallel or serial execution, irrespective of the behavior of grouping devices for traffic loss. In cases where auto-commit is enabled, the activation can result in a "Firmware Commit Failed" status on the EFA end, even though the firmware commit has been successfully completed on the device end.
Condition:

The "Firmware Activate" process is initiated from the user interface, either through the Inventory Page or the Fabric-wide Page, even in the midst of an incomplete operation on a subset of devices.

For instance:

Device 1 and Device 2 trigger a download with auto-commit enabled from either the Inventory or Fabric-wide Page.

Device 3 triggers a download from the Fabric or Inventory Page.

Subsequently, Device 1 and Device 2 attempt to continue with the "Activate Download" operation from the inventory or fabric page, resulting in a "Firmware Commit Failed" failure.

Workaround: Do not initiate firmware upgrades on other devices until the device completes both the Activate operation and the commit operation.
Recovery: Based on the error in the flow sequence, use the following commands to continue with the download operation: "efa inventory debug unblock-from-fwdl", "efa inventory device firmware-download".
Parent Defect ID: XCO-8072 Issue ID: XCO-8072
Product: XCO Reported in Release: XCO 3.3.0
Symptom: After a device update, an OOB QoS map of traffic-class-cos type that was configured on SLX without a DP value shows up on EFA with a DP value of 4.
Condition: An OOB QoS map of traffic-class-cos type is configured on SLX without a DP value, followed by a device update.
Workaround: No workaround for this OOB entry.
Recovery: Delete this OOB entry from SLX device side.
Parent Defect ID: XCO-8170 Issue ID: XCO-8170
Product: XCO Reported in Release: XCO 3.2.1
Symptom: The user is unable to log in to XCO using LDAP authentication.
Condition: The XCO login fails after configuring LDAP on TPVM and XCO.
Workaround:

To authenticate using LDAP, set auth preference for LDAP to a higher value. For example: Set the preference to 1.

The following commands can be used:

efa auth authentication preference show

efa auth authentication preference add --authType=LDAP --identifier ldap1 --preference 1

Parent Defect ID: XCO-8191 Issue ID: XCO-8191
Product: XCO Reported in Release: XCO 3.3.0
Symptom: If you run concurrent EPG update commands with the port-group-add or vrf-add operation on bridge-domain EPGs that are associated with more than one ctag, one or more of the commands can fail with the error "Save for device failed".
Condition: This is observed more often when more than 3 concurrent EPG port-group-add commands with non-conflicting ports and non-overlapping ctag-range are executed. Occasionally, configuration information that is pushed by one command is not used properly to prepare command recipe for another, causing the failure of one command.
Workaround: Rerunning the failing command will succeed. The error is intermittent and does not cause permanent changes. XCO state information is not affected at any point.
Recovery: No recovery is required as no state change is done as part of this failure.
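
Rerunning a command that fails with this intermittent error can be sketched as follows. This is a minimal Python sketch, not an XCO feature: the helper name `run_with_retry` and the convention that the caller supplies a function returning a (success, output) pair are assumptions made for illustration. Only the error text "Save for device failed" comes from the symptom above.

```python
from typing import Callable, Tuple

INTERMITTENT_ERROR = "Save for device failed"


def run_with_retry(run: Callable[[], Tuple[bool, str]], attempts: int = 3) -> Tuple[bool, str, int]:
    """Call run() until it succeeds, fails with a different error, or attempts run out.

    Returns (success, last_output, attempts_used).
    """
    ok, output = False, ""
    for attempt in range(1, attempts + 1):
        ok, output = run()
        # Stop on success, or on any error other than the known intermittent one.
        if ok or INTERMITTENT_ERROR not in output:
            return ok, output, attempt
    return ok, output, attempts


# Simulated command for illustration: fails once, then succeeds.
outcomes = iter([(False, "Error: Save for device failed"), (True, "ok")])
print(run_with_retry(lambda: next(outcomes)))
```

The sketch retries only on the known intermittent error so that genuine configuration errors are reported immediately.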
Parent Defect ID: XCO-8200 Issue ID: XCO-8200
Product: XCO Reported in Release: XCO 3.3.0
Symptom: SLX devices whose simultaneous upgrade could result in traffic loss are not allowed in the same firmware download execution flow. For example, two leaf devices from the same MCT pair cannot be chosen together.
Condition:

From the user interface, go to the fabric page and select a few devices.

Go to the table action and select the Firmware Upgrade option.

Workaround: The user selects the left-side leaf of the MCT pair and triggers firmware download and activation. Similarly, the user selects the right-side leaf of the MCT pair and triggers firmware download and activation.
Recovery: Choose another set of devices that will not result in traffic loss and proceed with the firmware download operation.
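
Splitting a device selection so that MCT peers are never upgraded together can be sketched as follows. This is a minimal Python sketch that covers only the MCT-pair rule described above; the function name `upgrade_batches`, the device names, and the peer mapping are hypothetical, and XCO may apply additional grouping rules that this sketch does not model.

```python
from typing import Dict, List, Tuple


def upgrade_batches(devices: List[str], mct_peer: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split devices into two batches so that MCT peers never share a batch."""
    first: List[str] = []
    second: List[str] = []
    for dev in devices:
        peer = mct_peer.get(dev)  # None for devices that are not in an MCT pair
        if peer in first:
            # The peer is already in the first batch, so defer this device.
            second.append(dev)
        else:
            first.append(dev)
    return first, second


# Hypothetical selection: two MCT pairs and one spine.
peers = {"leaf1": "leaf2", "leaf2": "leaf1", "leaf3": "leaf4", "leaf4": "leaf3"}
print(upgrade_batches(["leaf1", "leaf2", "spine1", "leaf3", "leaf4"], peers))
```

Each batch can then be submitted as a separate firmware download and activation, with the second batch started only after the first completes.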
Parent Defect ID: XCO-8230 Issue ID: XCO-8230
Product: XCO Reported in Release: EFA 3.0.1
Symptom: When the user tries to import docker images after disk cleanup, the image import fails.
Condition: The k3s image import fails after disk cleanup.
Recovery:

Run the image import on Active TPVM.

Follow the steps below to recover from the above state:

1. Clean up the disk space and restart all the services to run only with new instances.

Free up the disk space

# efactl clean

Reimport the images using:

# k3s ctr image import /opt/efa/docker_images/docker_k3s_images.tar

Restart EFA/k3s

# efactl restart

# systemctl restart k3s

Parent Defect ID: XCO-8232 Issue ID: XCO-8232
Product: XCO Reported in Release: XCO 3.3.0
Symptom:

Error is observed while updating EFA system CLI setting

Error : error creating directory on remote: Could not chdir to home directory /users/home21/<username>: No such file or directory

Condition: While using CLI "efa system settings update --remote-server-ip <ip> --remote-transfer-protocol scp --remote-server-username <username> --remote-server-password <password> --remote-server-directory <remote-server-directory>"
Workaround: Use a remote server that has bash support installed.
Recovery: Add bash support and retry the CLI command.
Parent Defect ID: XCO-8234 Issue ID: XCO-8234
Product: XCO Reported in Release: XCO 3.3.0
Symptom: The fabric alarm and the alarm status update notifications can briefly show the fabric alarm as cleared while the fabric is actually unhealthy.
Condition: This can occur during fabric formation or during any operation where fabric health is degraded for multiple reasons (for example, a spine-to-leaf link going down or BGP neighborship going down between spine and leaf). When a specific device and its links are repaired and deemed healthy, the overall fabric alarm can temporarily be cleared although other devices remain unhealthy. The fabric alarm is subsequently corrected and put into an unhealthy state due to the remaining unhealthy devices.
Workaround: N/A
Recovery: The fabric alarm automatically recovers to the proper state after being temporarily cleared.
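
The expected steady-state behavior, where the fabric alarm stays raised while any device is unhealthy, can be illustrated as follows. This is a minimal Python sketch of the aggregation rule only; the function name `fabric_alarm_raised` and the per-device health map are hypothetical and do not represent the XCO implementation.

```python
from typing import Dict


def fabric_alarm_raised(device_healthy: Dict[str, bool]) -> bool:
    """The fabric alarm should stay raised while any device is unhealthy."""
    return not all(device_healthy.values())


# Hypothetical fabric with two unhealthy devices.
health = {"spine1": False, "leaf1": False, "leaf2": True}
health["leaf1"] = True   # leaf1 is repaired, spine1 is still unhealthy
print(fabric_alarm_raised(health))
health["spine1"] = True  # the last unhealthy device is repaired
print(fabric_alarm_raised(health))
```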
Parent Defect ID: XCO-8366 Issue ID: XCO-8366
Product: XCO Reported in Release: XCO 3.3.1
Symptom: IPv6-Prefix over IPv4-Peer device setting under Inventory service gets refreshed and removed from the device when device is removed from fabric or when an entire fabric is deleted. This setting does not get applied automatically to the device when it is added back to the fabric or when the fabric is reconfigured.
Condition:
  1. Configure fabric.
  2. Enable IPv6-Prefix over IPv4-Peer device setting from inventory CLI.
  3. Remove the device from the fabric or delete an entire fabric.
  4. Add the device back in the fabric or re-configure the fabric.

Step 4 does not configure the IPv6-Prefix over IPv4-Peer setting on the device, and the Inventory service keeps identifying drift for it.

Recovery: Run DRC from the Inventory service before or after adding the device to the fabric and reconfiguring the fabric.