How Resilient Hashing Works

This section describes the working principles of Resilient Hashing.

The core of Resilient Hashing is a set of two tables that map ECMP paths to flows. These tables determine which path to use for packet transmission.

Path Table

The Path Table is a dynamic list of the paths in the ECMP group that are available for use at any point in time. When a link becomes unavailable, its path entry is removed from this table. Similarly, when a path comes back up, it is added back to this table.

For example, a typical Path Table with eight (8) paths looks like this:

1 2 3 4 5 6 7 8

The same table after paths 5 and 2 are lost:

1 3 4 6 7 8
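
The Path Table behavior described above can be sketched in a few lines of Python. This is only an illustration: the `PathTable` class and its `path_down` and `path_up` methods are hypothetical names chosen for this example, and they do not represent any hardware or software implementation.

```python
class PathTable:
    """Hypothetical model of a Path Table: a sorted list of usable path IDs."""

    def __init__(self, paths):
        self.paths = sorted(paths)

    def path_down(self, path):
        """Remove a path when its link becomes unavailable."""
        if path in self.paths:
            self.paths.remove(path)

    def path_up(self, path):
        """Add a path back when it recovers, keeping the list sorted."""
        if path not in self.paths:
            self.paths.append(path)
            self.paths.sort()


table = PathTable(range(1, 9))  # paths 1 through 8
table.path_down(5)
table.path_down(2)
print(table.paths)  # [1, 3, 4, 6, 7, 8]
```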

Flowset Table

Note

The examples used in this section are for illustrative purposes only. They do not represent the implementation in any hardware or software.

The Flowset Table holds a configurable number of flow entries that are available for use to transmit a packet. Once configured, the table is fixed in size at one of the following values: 64, 128, or 256 entries. The size of the Flowset Table depends on the configured max-path value.

The following is an example of a Flowset Table of size 16 (4 rows x 4 columns), populated with path values. Each path appears <size of the Flowset Table> / N times, where N is the number of paths available in the ECMP group. Here, with four paths (0 through 3), each path is populated 16/4 = 4 times.

Table 1. Flowset Table With Paths
Index Path
0-3 0 0 0 0
4-7 1 1 1 1
8-11 2 2 2 2
12-15 3 3 3 3
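
The population rule above can be sketched as follows. The function name `build_flowset_table` is hypothetical, and the sketch assumes that the table size is an exact multiple of the number of paths, as in the 16-entry example; it is for illustration only.

```python
def build_flowset_table(size, paths):
    """Fill a flowset table so each path occupies size / N consecutive slots."""
    if size % len(paths) != 0:
        # Assumption of this sketch: the size divides evenly among the paths.
        raise ValueError("size must be a multiple of the number of paths")
    per_path = size // len(paths)
    table = []
    for path in paths:
        table.extend([path] * per_path)
    return table


flowset = build_flowset_table(16, [0, 1, 2, 3])
for start in range(0, 16, 4):
    # Print one row per group of four indexes, like the example table.
    print(f"{start}-{start + 3}", *flowset[start:start + 4])
```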

The same Flowset Table after path one (1) is lost:

Table 2. Flowset Table With Paths No Longer Available For Use
Index Path
0-3 0 0 0 0
4-7 X X X X
8-11 2 2 2 2
12-15 3 3 3 3

Because path 1 was deleted, the index values that held this path are repopulated with the remaining paths.

Table 3. Flowset Table With Paths Repopulated
Index Path
0-3 0 0 0 0
4-7 0* 2* 3* 0*
8-11 2 2 2 2
12-15 3 3 3 3

Note

A re-populated path is indicated with a * symbol next to it. The repopulation of this path is determined by an internal algorithm that tends to assign paths so that the traffic is largely load balanced.
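
The actual repopulation algorithm is internal and is not described here. As a rough sketch only, the example below assumes a simple round-robin assignment over the remaining paths, which happens to reproduce the row 4-7 values shown in Table 3. The function name `repopulate` is hypothetical.

```python
from itertools import cycle


def repopulate(flowset, lost_path, remaining_paths):
    """Return a copy of the table with only the lost path's slots reassigned.

    Assumption: replacements are handed out round-robin. A real device may
    use a different algorithm.
    """
    replacements = cycle(remaining_paths)
    return [next(replacements) if entry == lost_path else entry
            for entry in flowset]


before = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
after = repopulate(before, lost_path=1, remaining_paths=[0, 2, 3])
print(after[4:8])  # [0, 2, 3, 0]
```

Entries that did not point to the lost path are copied unchanged, which is the property that keeps unaffected flows on their original paths.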

Calculating which path to use

For the purpose of explanation, we will use the following packet hash values.

Table 4. Packet Hash Table
Flow (Hash Value) Flowset Table Index (<hash> modulo 16, the Flowset Table size)
6401 1
6282 10
6579 3
6756 4
6973 13
7006 14
7015 7
7024 0
7045 5

Note

The number of ECMP paths is not taken into consideration when calculating the above index value. This value remains the same even if the number of available ECMP paths changes.

Note

In this example, 16 is used because it is the size of the example Flowset Table.
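
The index calculation can be checked with a short Python sketch that uses the hash values from the Packet Hash Table. The names `FLOWSET_SIZE` and `flowset_index` are hypothetical and exist only for this illustration.

```python
FLOWSET_SIZE = 16  # size of the example Flowset Table

hashes = [6401, 6282, 6579, 6756, 6973, 7006, 7015, 7024, 7045]


def flowset_index(packet_hash, size=FLOWSET_SIZE):
    """Index into the flowset table; the number of ECMP paths plays no part."""
    return packet_hash % size


indexes = [flowset_index(h) for h in hashes]
print(indexes)  # [1, 10, 3, 4, 13, 14, 7, 0, 5]
```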

When a link is lost

This section describes what happens when ECMP path 1 goes down.

In our example, all packets whose Flowset Table index falls in the range 4-7 must be re-routed. From the sample Packet Hash Table, we can see that packets with hash values 6756, 7015, and 7045 are affected because path 1 is no longer available.

After Resilient Hashing updates the Flowset Table with the new paths, the packets are routed as shown below:

Packet Hash Value Flowset Table Index Old Path New Path
6756 4 1 0
7015 7 1 0
7045 5 1 2

When a link comes back online

In this example, when path one (1), which went down, is restored, the Flowset Table becomes:

Table 5. Flowset Table With Restored Path
Index Path
0-3 0 0 0 0
4-7 1 1 1 1
8-11 2 2 2 2
12-15 3 3 3 3

The packets will be re-routed as follows:

Packet Hash Value Flowset Table Index Old Path New Path
6756 4 0 1
7015 7 0 1
7045 5 2 1

As seen, when a link is lost, only a subset of the packets is affected, and there is no effect on the rest of the traffic through this device. Similarly, when a link is restored, only the packets whose Flowset Table indexes map to the restored path are affected.
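
To illustrate this property, the following sketch compares the path chosen for each sample hash value before and after a table change, using the entries from Table 1 and Table 3. The function name `affected` is hypothetical, and the sketch is for illustration only.

```python
hashes = [6401, 6282, 6579, 6756, 6973, 7006, 7015, 7024, 7045]

# Table 1: all four paths available.
all_paths_up = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
# Table 3: path 1 lost and its slots repopulated.
path_1_lost = [0, 0, 0, 0, 0, 2, 3, 0, 2, 2, 2, 2, 3, 3, 3, 3]


def affected(hashes, old_table, new_table):
    """Return the hash values whose selected path differs between two tables."""
    size = len(old_table)
    return [h for h in hashes if old_table[h % size] != new_table[h % size]]


print(affected(hashes, all_paths_up, path_1_lost))  # [6756, 7015, 7045]
print(affected(hashes, path_1_lost, all_paths_up))  # [6756, 7015, 7045]
```

The same three hash values move in both directions, while the other six keep their original path.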

Warning

Resilient Hashing is not supported in the scenario where a new ECMP peer is added, for example, when the number of ECMP peers is increased from six (6) to seven (7). In this case, traffic is disrupted because all prefixes are reprogrammed in the hardware. This is true for both BGP and Static Route prefixes.