The following defects are open in Extreme Fabric Automation 2.6.0.
Parent Defect ID: | EFA-8904 | Issue ID: | EFA-8904 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.4.2 |
Symptom: | Single node deployment fails with 'DNS resolution failed.' | ||
Condition: | If a multi-node deployment is performed and then un-deployed on a server, a subsequent single-node deployment on the same server causes the installer to exit with the error 'DNS resolution failed.' | ||
Workaround: | After un-deployment of the multi-node installation, perform a reboot of the server/TPVM. |
Parent Defect ID: | EFA-9065 | Issue ID: | EFA-9065 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.4.3 |
Symptom: | An EFA port-channel remains in the cfg-refreshed state when the port-channel creation is immediately followed by an EPG create that uses the port-channel. | ||
Condition: |
Below are the steps to reproduce the issue:
1. Create port-channel po1 under the ownership of tenant1.
2. Create an endpoint group with po1 under the ownership of tenant1.
3. After step 2 begins and before it completes, the RASLog event for step 1 (the port-channel creation) is received. This RASLog event is processed only after step 2 completes. |
||
Recovery: |
1. Introduce a switchport or switchport-mode drift on the SLX device for the port-channel that is in the cfg-refreshed state.
2. Perform a manual DRC to bring the cfg-refreshed port-channel back to cfg-in-sync. |
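A minimal sketch of this recovery, assuming the stuck port-channel is Port-channel 1 on an SLX device at 10.20.48.161 (both hypothetical); verify the interface commands against your SLX-OS release:

```
# On the SLX device: introduce a switchport drift on the stuck port-channel
configure terminal
 interface Port-channel 1
  no switchport
end

# From EFA: run a manual DRC so the port-channel returns to cfg-in-sync
efa inventory drift-reconcile execute --ip 10.20.48.161 --reconcile
```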
Parent Defect ID: | EFA-9439 | Issue ID: | EFA-9439 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | The Dev-State and App-State of EPG networks remain 'not-provisioned' and 'cfg-ready' respectively. | ||
Condition: |
Below are the steps to reproduce the issue:
1) Create a VRF with a local-asn.
2) Create an EPG using the VRF created in step 1.
3) Take one of the SLX devices to the administratively down state.
4) Perform a VRF update "local-asn-add" with a different local-asn than the one configured in step 1.
5) Perform a VRF update "local-asn-add" with the same local-asn that was configured in step 1.
6) Admin up the SLX device that was made administratively down in step 3 and wait for DRC to complete. |
||
Workaround: | None. | ||
Recovery: |
Following are the steps to recover:
1) Log in to the SLX device that was made admin down and then up.
2) Introduce a local-asn configuration drift under "router bgp address-family ipv4 unicast" for the VRF.
3) Execute DRC for the device. |
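A sketch of these recovery steps, assuming a hypothetical VRF vrf1, local ASN 65001, and device IP 10.20.48.161; the 'local-as' syntax under the VRF address-family should be confirmed for your SLX-OS release:

```
# On the SLX device: introduce a local-asn drift for the VRF
configure terminal
 router bgp
  address-family ipv4 unicast vrf vrf1
   local-as 65001
end

# From EFA: execute DRC for the device
efa inventory drift-reconcile execute --ip 10.20.48.161 --reconcile
```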
Parent Defect ID: | EFA-9456 | Issue ID: | EFA-9456 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.4.3 |
Symptom: | EFA fabric configuration fails on a large fabric topology of 30 switches. | ||
Condition: | The issue is observed if devices being added to the fabric have IP addresses assigned on interfaces and those IP addresses are already reserved by EFA for other devices. | ||
Workaround: |
Delete the IP addresses on interfaces of devices that have conflicting configuration so that new IP addresses can be reserved for these devices. One way to clear the device configuration is with the following commands:
1. Register the device with inventory: efa inventory device register --ip <ip1, ip2> --username admin --password password
2. Issue the debug clear command: efa fabric debug clear-config --device <ip1, ip2> |
||
Recovery: |
Delete the IP addresses on interfaces of devices that have conflicting configuration so that new IP addresses can be reserved for these devices. One way to clear the device configuration is with the following commands:
1. Register the device with inventory: efa inventory device register --ip <ip1, ip2> --username admin --password password
2. Issue the debug clear command: efa fabric debug clear-config --device <ip1, ip2>
3. Add the devices to the fabric. |
Parent Defect ID: | EFA-9570 | Issue ID: | EFA-9570 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | Adding a device fails because the ASN used on a border leaf shows a conflict. | ||
Condition: | When there is more than one pair of leaf/border-leaf devices, the devices added first are assigned the first available ASNs in ascending order. If, during a subsequent addition, a device tries to allocate one of those same ASNs (a brownfield scenario), EFA throws a conflicting-ASN error. | ||
Workaround: |
Add the devices to the fabric in the following sequence:
1) First add the devices that have preconfigured configs.
2) Then add the remaining devices that have no stored configs. |
||
Recovery: |
Remove the devices and add them to the fabric again in the following sequence:
1) First add the devices that have preconfigured configs.
2) Then add the remaining unconfigured devices. |
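A sketch of the two-phase add sequence, assuming a hypothetical fabric fabric1 and hypothetical device IP ranges; check the exact 'efa fabric device add' flags against your EFA release:

```
# 1) First add the devices that have preconfigured (brownfield) configs
efa fabric device add --name fabric1 --ip 10.20.48.161-162 --username admin --password password

# 2) Then add the remaining devices that have no stored configs
efa fabric device add --name fabric1 --ip 10.20.48.128-129 --username admin --password password
```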
Parent Defect ID: | EFA-9576 | Issue ID: | EFA-9576 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | Deletion of the tenant by force followed by the recreation of the tenant and POs can result in the error "Po number <id> not available on the devices". | ||
Condition: |
Below are the steps to reproduce the issue:
1. Create a tenant and PO.
2. Delete the tenant using the "force" option.
3. Recreate the tenant and the PO within a short time window. |
||
Workaround: | Avoid a tenant/PO create, followed by a tenant delete, followed by a tenant and PO recreate, all within a short time window. | ||
Recovery: | Execute an inventory device update ('efa inventory device update') prior to the PO creation. |
Parent Defect ID: | EFA-9591 | Issue ID: | EFA-9591 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | "efa fabric configure" fails with error after previously changing the fabric password in the configured fabric | ||
Condition: | This condition was seen when "efa fabric configure --name <fabric name>" was issued after modifying the MD5 password. The issue is observed when certain BGP sessions are not in the ESTABLISHED state after the BGP sessions are cleared as part of fabric configure. | ||
Workaround: | Wait for BGP sessions to be ready by checking the status of BGP sessions using "efa fabric topology show underlay --name <fabric name>" | ||
Recovery: | Wait for BGP sessions to be ready. Check the status of BGP sessions using "efa fabric topology show underlay --name <fabric name>" |
Parent Defect ID: | EFA-9758 | Issue ID: | EFA-9758 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | EFA does not reconcile the remote ASN of a BGP peer configuration after the user modifies the remote ASN of the BGP peer out of band. | ||
Workaround: | None | ||
Recovery: | Revert the remote ASN of the BGP peer on the device through the SLX CLI to what EFA previously configured. |
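A sketch of the recovery on the SLX device, assuming a hypothetical peer 10.10.10.2 in VRF vrf1 whose EFA-configured remote ASN was 65002; verify the neighbor syntax for your SLX-OS release:

```
# Revert the remote ASN to the value EFA originally configured
configure terminal
 router bgp
  address-family ipv4 unicast vrf vrf1
   neighbor 10.10.10.2 remote-as 65002
end
```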
Parent Defect ID: | EFA-9799 | Issue ID: | EFA-9799 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | The 'efa status' response shows the standby node status as 'UP' when the node is still booting up. | ||
Condition: | If the SLX device where the EFA standby node resides is reloaded, the 'efa status' command still shows the standby status as UP. | ||
Workaround: | Retry the same command after some time. |
Parent Defect ID: | EFA-9874 | Issue ID: | EFA-9874 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | When an EPG is in the "anycast-ip-delete-pending" state and the user performs "epg configure", the command succeeds without actually removing the anycast IP from the SLX device. | ||
Condition: |
Below are the steps to reproduce the issue:
1) Configure an EPG with VRF, VLAN, and anycast-ip (IPv4/IPv6) on a single-rack non-CLOS fabric.
2) Bring one of the devices to admin-down.
3) Perform the EPG update "anycast-ip-delete" for the IPv4 or IPv6 anycast-ip. This puts the EPG in the "anycast-ip-delete-pending" state.
4) Bring the admin-down device to admin-up.
5) In this state, the only allowed operations on the EPG are "epg configure" and the EPG update "anycast-ip-delete".
6) Perform "epg configure --name <epg-name> --tenant <tenant-name>". |
||
Workaround: | No workaround. | ||
Recovery: | Perform the same anycast-ip-delete operation when both devices are admin-up. |
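A sketch of the recovery, assuming a hypothetical EPG epg1 under tenant tenant1 with anycast IP 10.10.10.1/24; the '--operation anycast-ip-delete' form and the anycast-ip argument format of 'efa tenant epg update' are assumptions to verify against your EFA release:

```
# With both devices admin-up, repeat the anycast-ip delete through EFA
efa tenant epg update --name epg1 --tenant tenant1 --operation anycast-ip-delete --anycast-ip 10.10.10.1/24
```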
Parent Defect ID: | EFA-9907 | Issue ID: | EFA-9907 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | When concurrent EFA tenant EPG update port-add or port-delete operations are requested and the commands involve a large number of VLANs and/or ports, one of them can fail with the error "vni in use error". | ||
Condition: | The failure is reported when the Tenant service gets stale information about a network that existed previously but no longer does. This happens only when the port-add and port-delete are done in quick succession. | ||
Workaround: | Avoid executing port-add and port-delete of the same ports in quick succession or concurrently. | ||
Recovery: | None |
Parent Defect ID: | EFA-10048 | Issue ID: | EFA-10048 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: |
When concurrent EFA tenant EPG create or update operations are requested and the commands involve a large number of VLANs and/or ports, one of them can fail with the error "EPG: <epg-name> Save for devices Failed" (for example, "EPG: epgev10 Save for devices failed"). |
||
Condition: | The failure is reported when concurrent DB write operations are performed by the EFA Tenant service as part of the command execution. | ||
Workaround: | This is a transient error and there is no workaround. | ||
Recovery: | The failing command can be rerun separately and it will succeed. |
Parent Defect ID: | EFA-10062 | Issue ID: | EFA-10062 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | Removing a device from Inventory does not clean up breakout configuration on interfaces that are part of port channels. | ||
Condition: | This condition occurs when breakout configuration is present on the device being deleted from Inventory, and that breakout configuration is on interfaces that are part of port-channels. | ||
Workaround: | Manually remove the breakout configuration, if required. | ||
Recovery: | Manually remove the breakout configuration, if required. |
Parent Defect ID: | EFA-10063 | Issue ID: | EFA-10063 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | Deleting a device from EFA Inventory does not bring the interface back to admin state 'up' after unconfiguring the breakout configuration. | ||
Condition: | This condition occurs when there is a breakout configuration present on the device that is being deleted from EFA Inventory | ||
Workaround: | Manually bring the admin-state up on the interface, if required | ||
Recovery: | Manually bring the admin-state up on the interface, if required |
Parent Defect ID: | EFA-10093 | Issue ID: | EFA-10093 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | Deletion of a VLAN/BD-based L3 EPG in the epg-delete-pending state results in creation and then deletion of the VLAN/BD on the admin-up device from which the VLAN/BD was already removed. | ||
Condition: |
The issue occurs with the below steps:
1. Create an L3 EPG with VLAN/BD X on an MCT pair.
2. Admin down one of the devices of the MCT pair.
3. Delete the L3 EPG. The L3 configuration (corresponding to the deleted EPG) is removed from the admin-up device, no config changes happen on the admin-down device, and the EPG transitions to the epg-delete-pending state.
4. Admin up the device that was made admin down in step 2.
5. Delete the L3 EPG that transitioned to the epg-delete-pending state in step 3. |
||
Recovery: | Not needed |
Parent Defect ID: | EFA-10252 | Issue ID: | EFA-10252 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.1 |
Symptom: | When concurrent EFA tenant EPG update port-group-add operations are requested where the tenant is bridge-domain enabled, one of them may fail with the error "EPG network-property delete failed" | ||
Condition: | The failure is reported when concurrent resource allocations are performed by the EFA Tenant service as part of the command execution. | ||
Workaround: | This is a transient error and there is no workaround. | ||
Recovery: | The failing command can be rerun separately and it will succeed |
Parent Defect ID: | EFA-10268 | Issue ID: | EFA-10268 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.1 |
Symptom: | When concurrent EPG deletes on a bd-enabled tenant are requested and the EPGs involve a large number of VLANs, local-ip, and anycast-ip addresses, one of them may fail with the error "EPG: <epg-name> Save for Vlan Records save Failed". | ||
Condition: | The failure is reported when concurrent DB write operations are performed by the EFA Tenant service as part of the command execution. | ||
Workaround: | This is a transient error and there is no workaround. The failing command can be executed once again and it will succeed. | ||
Recovery: | The failing command can be rerun separately and it will succeed. |
Parent Defect ID: | EFA-10288 | Issue ID: | EFA-10288 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.1 |
Symptom: | When a BGP peer is created and update operations are performed while one of the devices is in the admin-down state, the configuration for the admin-up device is deleted from the SLX switch but remains in EFA after "efa tenant service bgp peer configure --name <name> --tenant <tenant>" is performed. | ||
Condition: |
The BGP peer gets deleted from the SLX device but not from EFA. This issue is seen when the following sequence is performed:
1. Create a static BGP peer.
2. Admin down one of the devices.
3. Update the existing static BGP peer by adding a new peer.
4. Update the existing static BGP peer by deleting the peers that were created in step 1. Delete from both devices.
5. Admin up the device.
6. Run "efa tenant service bgp peer configure --name "bgp-name" --tenant "tenant-name"".
Once the BGP peer is configured, the config is deleted from the switch for the device in the admin-up state, whereas EFA still has this information and displays it during "bgp peer show". |
||
Workaround: | Delete the peer for the admin-up device first, and then delete the peer from the admin-down device with a separate CLI command. | ||
Recovery: | Perform a drift reconcile operation for the admin up device so that the configuration gets reconciled on the switch. |
Parent Defect ID: | EFA-10445 | Issue ID: | EFA-10445 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.0 |
Symptom: | Tenant service may occasionally reject subsequent local-ip-add command incorrectly. | ||
Condition: | When continuous EPG updates with repeated local-ip-add and local-ip-delete operations are done on the same EPG repeatedly without much gap in-between, Tenant service may occasionally retain stale information about the previously created IP configuration and may reject subsequent local-ip-add command incorrectly. | ||
Workaround: | There is no workaround to avoid this. Once the issue is hit, the user can use a new local IP address from another subnet. | ||
Recovery: |
Follow the steps below to remove the stale IP address from the Tenant service's knowledge base:
1. Find the management IP for the impacted devices; it is displayed in the EFA error message.
2. Find the VE interface number. It is the same as the CTAG number that the user was trying to associate the local-ip with.
3. Telnet/SSH to the device management IP and log in with admin privilege.
4. Set the local IP address on the device:
configure t
interface ve <number>
ip address <local-ip>
5. Do an EFA device update by executing 'efa inventory device update --ip <IP>' and wait for a minute for the information to be synchronized with the Tenant service database.
6. Reset the local IP address on the device:
configure t
interface ve <number>
no ip address
7. Do an EFA device update again and wait for a minute for the information to be synchronized with the Tenant service database.
These steps remove the stale entries and allow future local-ip-add operations to succeed. |
Parent Defect ID: | EFA-10455 | Issue ID: | EFA-10455 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.1 |
Symptom: | "efa status" takes several minutes longer than expected to report a healthy EFA status. | ||
Condition: | This problem happens when Kubernetes is slow to update the standby node's Ready status. This is a potential issue in the shipped version of Kubernetes. | ||
Recovery: | EFA will recover after a period of several minutes. |
Parent Defect ID: | EFA-10548 | Issue ID: | EFA-10548 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.2 |
Symptom: | When EPG delete operations are performed concurrently for EPGs on a bridge-domain-based tenant where the EPGs were created with a large number of bridge domains, one of the commands may fail with the error "EPG: <epg name> Update for pw-profile Record save Failed". | ||
Condition: | The failure is reported when concurrent DB write operations are performed by the EFA Tenant service as part of the command execution, causing the underlying database to report an error for one of the operations. | ||
Workaround: | This is a rare, transient error and there is no workaround. | ||
Recovery: | The failing command can be rerun separately and it will succeed. |
Parent Defect ID: | EFA-10754 | Issue ID: | EFA-10754 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.2 |
Symptom: | EFA backup creation fails with a timeout. | ||
Condition: |
The device is stuck with the service lock taken, as shown in the example inventory log message below. This happens when an EFA backup is performed near the expiration time of the authentication token.
{"@time":"2021-10-13T16:19:53.132404 CEST","App":"inventory","level":"info","msg":"executeCBCR: device '21.150.150.201' is already Locked with reason : configbackup ","rqId":"4f144a0c-7be6-4056-8371-f1dc39eb28b3"} |
||
Recovery: | Running "efa inventory debug devices-unlock --ip 21.150.150.201" resolves the issue; the backup can then be performed after efa login. |
Parent Defect ID: | EFA-11063 | Issue ID: | EFA-11063 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.4 |
Symptom: | The standby status of the EFA node shows as down when the node is actually ready for failover. | ||
Condition: | The issue happens because one of the pods (rabbitmq) is in CrashLoopBackOff instead of init mode. This is a status-reporting issue only, not a functional issue. | ||
Workaround: | Reboot the standby node, which does not cause any downtime. Alternatively, restart k3s using the 'systemctl restart k3s' command. | ||
Recovery: | Rebooting the node or restarting k3s fixes the issue. |
Parent Defect ID: | EFA-11105 | Issue ID: | EFA-11105 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.4 |
Symptom: | EFA tenant VRF and EPG show "App State: cfg-refresh-err" after a VRF change is made directly on SLX. | ||
Condition: |
Following are the steps to reproduce:
Step 1) Introduce a VRF drift on the SLX device by removing "vrf-forwarding" from the VE interfaces associated with the given VRF.
Step 2) Perform "efa inventory device update" for the SLX device where the VRF is instantiated.
Step 3) Perform any VRF update operation.
Step 4) Perform DRC for the same SLX device where the VRF is instantiated. |
||
Workaround: | No workaround | ||
Recovery: |
Step 1) Remove the VRF from the EndpointGroups to which it belongs by using the EPG update "vrf-delete".
Step 2) Add the VRF to all the EndpointGroups again by using the EPG update "vrf-add". |
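A sketch of the recovery, assuming a hypothetical EPG epg1 under tenant tenant1 that uses VRF vrf1; the '--operation vrf-delete' and '--operation vrf-add' forms of 'efa tenant epg update' are assumptions to confirm against your EFA release:

```
# Remove the VRF from the EPG, then add it back
efa tenant epg update --name epg1 --tenant tenant1 --operation vrf-delete
efa tenant epg update --name epg1 --tenant tenant1 --operation vrf-add --vrf vrf1
```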
Parent Defect ID: | EFA-11177 | Issue ID: | EFA-11177 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.4 |
Symptom: | When a tenant with EPGs having 4000+ VLANs across 10+ devices is deleted with the 'force' option, the delete operation may fail. | ||
Condition: | This failure happens because the Tenant service executes a large database query, which EFA's database backend may fail to execute. | ||
Workaround: | Delete the EPGs belonging to the tenant first, and then delete the tenant. This ensures that the database queries are split across multiple requests. | ||
Recovery: | There is no recovery required. This failure does not lead to inconsistency of EFA's database or the SLX device's configurations. |
Parent Defect ID: | EFA-11768 | Issue ID: | EFA-11768 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | This issue is seen when the user deletes devices from the fabric: the BGP peer groups associated with the devices are not removed from the switch. | ||
Condition: |
Removing devices with the following command does not clean up the associated BGP peer groups from the device:
efa fabric device remove --ip 10.20.48.161-162,10.20.48.128-129,10.20.54.83,10.20.61.92-93,10.20.48.135-136 --name fabric2 --no-device-cleanup |
||
Workaround: | Delete the BGP peer group before issuing a device cleanup for the fabric. | ||
Recovery: | Manually delete the peer groups from the switch. |
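A sketch of the workaround, assuming a hypothetical peer group pg1 under tenant tenant1; the 'efa tenant service bgp peer-group delete' syntax is an assumption to verify against your EFA release:

```
# Delete the BGP peer group through EFA before removing the devices from the fabric
efa tenant service bgp peer-group delete --name pg1 --tenant tenant1
```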
Parent Defect ID: | EFA-11779 | Issue ID: | EFA-11779 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.4 |
Symptom: | EFA installation or upgrade procedure abruptly exits without any error. | ||
Condition: | The security hardening script '/opt/security/extr-granite.py' has been run on the system prior to upgrade. | ||
Workaround: |
Before running the EFA upgrade:
1. Edit /etc/ssh/sshd_config and remove the following entries:
ClientAliveInterval 300
ClientAliveCountMax 0
2. Restart the ssh service. |
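A minimal shell sketch of the workaround; the sed patterns and the 'ssh' service unit name are assumptions to check on the TPVM before use:

```
# Remove the hardening entries from sshd_config
sudo sed -i '/^ClientAliveInterval 300$/d; /^ClientAliveCountMax 0$/d' /etc/ssh/sshd_config

# Restart the SSH daemon so the change takes effect
sudo systemctl restart ssh
```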
Parent Defect ID: | EFA-11813 | Issue ID: | EFA-11813 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: |
This issue can be seen for a BGP peer or peer group when update-peer-delete or delete operations are performed with one device of the MCT pair in the admin-down state. The BGP peer gets deleted from the SLX device but not from EFA. |
||
Condition: |
Steps to reproduce:
1. Create a static BGP peer.
2. Admin down one of the devices.
3. Update the existing static BGP peer by deleting the peers that were created in step 1. Delete from both devices.
4. Admin up the device.
Once the device is brought up, auto DRC kicks in, and the config that was deleted from the switch during the admin-down state has an incorrect provisioning-state and app-state. |
||
Workaround: | Bring the admin down device up and then delete the required bgp peers. | ||
Recovery: | No recovery |
Parent Defect ID: | EFA-11980 | Issue ID: | EFA-11980 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.4 |
Symptom: | An EFA TPVM upgrade workflow may fail for a given device, along with the automatic recovery that restores the TPVM to the original version and rejoins the EFA node into the HA cluster. | ||
Condition: |
During the "EFA Deploy Peer and Rejoin" step, the EFA image import into the k3s container runtime fails. During the "TPVM Revert" step, the k3s on the active EFA node would not allow the standby EFA node to join the cluster due to a stale node-password in k3s. |
||
Workaround: | None | ||
Recovery: |
Manually recover the TPVM and EFA deployment by following the procedure described in the article "EFA 2.5.2 Re-deploy post a TPVM Rollback failed on first attempt": https://extremeportal.force.com/ExtrArticleDetail?an=000099582 |
Parent Defect ID: | EFA-11992 | Issue ID: | EFA-11992 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | When a device is deleted from inventory, the corresponding route-maps are not removed from the specified device for any route-maps that have active BGP peer bindings. | ||
Condition: | The issue is seen when the user removes the device from inventory while the device has route-map configurations with active bindings. | ||
Workaround: | The user must remove the route-maps from the device manually prior to device deletion. | ||
Recovery: | After the device is removed from inventory user can remove the route-map configuration on that device manually. |
Parent Defect ID: | EFA-12033 | Issue ID: | EFA-12033 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | Using EFA CLI, the user is able to delete non-EFA managed/OOB (out of band) route-map entries and add rules to non-EFA managed/OOB (out of band) prefix-list. | ||
Condition: | The user configures an OOB route-map or prefix-list entry directly on the device using the SLX CLI or other management means, and then tries to delete this route-map entry or add rules under this prefix-list entry using EFA. This should not be allowed from EFA because they are not EFA-managed entities. | ||
Workaround: | No workaround | ||
Recovery: | If the user deletes the OOB entry or adds rules under the OOB prefix-list by mistake, it can be added back or removed manually on the device through the SLX CLI or other management means. |
Parent Defect ID: | EFA-12058 | Issue ID: | EFA-12058 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | The error 'Error updating traefik with efasecret' is seen during node replacement. | ||
Condition: | The error is seen during EFA node replacement; the node replacement itself completes successfully. | ||
Workaround: | Re-add subinterfaces using 'efa mgmt subinterfaces' CLI. |
Parent Defect ID: | EFA-12105 | Issue ID: | EFA-12105 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | A "Drift Reconcile Completion Status Failure" may occur during an EFA firmware download of an SLX device in a fabric. | ||
Condition: |
A DRC status failure can occur if the SLX device also fails during the firmware download. The DRC failure is observed during the drift-reconcile completion step, either on the spine node hosting the active EFA node's TPVM or on any device in the same firmware download group that is concurrently running the firmware download workflow at the time of the HA failover. This is likely due to the SLX device rebooting and activating the new firmware. During the EFA HA failover, the REST endpoint for the go-inventory service is not established properly, which causes the drift-reconcile process to fail. |
||
Workaround: | None | ||
Recovery: | Run "efa inventory drift-reconcile execute --ip <SLX device IP address> --reconcile" to retry the drift-reconcile process on the failed device. |
Parent Defect ID: | EFA-12114 | Issue ID: | EFA-12114 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.5.4 |
Symptom: | In rare circumstances, Kubernetes' EndpointSliceController can fall out of sync, leading to incorrect iptables rules being instantiated. This can cause EFA API failures because requests are redirected to non-existent services. | ||
Recovery: |
EFA's monitor process will detect and attempt to remediate this situation automatically. If it fails to do so, the following can help. On both TPVMs, as the super-user:
$ systemctl restart k3s
If the problem recurs, these further steps, run as super-user, may help:
$ sed -i -E 's/EndpointSlice=true/EndpointSlice=false/' /lib/systemd/system/k3s.service
$ systemctl daemon-reload
$ systemctl restart k3s |
Parent Defect ID: | EFA-12141 | Issue ID: | EFA-12141 |
---|---|---|---|
Severity: | S3 - Moderate | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | After an EFA backup and restore, drifted route maps can be shown in the cfg-in-sync state. | ||
Condition: | The issue can be seen after an EFA backup and restore, if prefix lists and route maps were removed by EFA after the backup was taken. | ||
Workaround: | There is no workaround. It is a display issue. | ||
Recovery: | If a drift is present on the device, running the 'efa inventory drift-reconcile' command reconciles the entities on the device. |
Parent Defect ID: | EFA-12154 | Issue ID: | EFA-12154 |
---|---|---|---|
Severity: | S2 - Major | ||
Product: | Extreme Fabric Automation | Reported in Release: | EFA 2.6.0 |
Symptom: | A firmware download can fail with "Firmware Download Failed" status. | ||
Condition: |
1) The current SLX firmware version on the devices being upgraded is 20.2.3c, 20.2.3d, 20.2.3e, or 20.3.4.
2) The --noAutoCommit flag is specified for the firmware download execution.
3) Any device in the same firmware download group as the device hosting the active EFA node can encounter the firmware download failure.
The firmware download failure occurs when the active EFA node is reloaded to activate the new firmware while the other device is in the middle of an SLX firmware download. The HA failover causes the firmware download workflows to be restarted at the last completed step. Because the SLX firmware download did not complete, the SLX firmware download command is issued to the device again. SLX firmware 20.2.3c through 20.3.4 returns an error stating that it "Cannot start download before the new image is committed." |
||
Workaround: | Prepare a group list that contains only the active EFA nodes and execute it at the end of the SLX upgrade cycle. |