TRILL Data Center Solution

Data center networks have a number of unique requirements, such as:
  • High total network bandwidth capacity.
  • Hyper-fast point-to-point link speeds with low latency.
  • High numbers of network connections for high-speed access devices (e.g., servers).
  • Multiple paths to reach every access device.
  • Flexibility to connect any device with any other set of devices.
  • Broadcast domain control to minimize network storms.

TRILL running on the BlackDiamond X8 core switch and the Summit X670 or X770 top-of-rack switches can meet these requirements. The following reference network diagram has been simplified. Typically, the top-of-rack Summit X670 switches have four or eight uplinks into multiple core switches (highlighted in the magnified view at the bottom-right of the diagram). This reduces the number of hops and end-to-end latency, and also offers increased resiliency.

Figure: Quad-core Data Center Reference Network

The first three requirements are met by deploying the BlackDiamond X8 and Summit X670 with 10G, 40G, and 100G Ethernet links in the data center. Multiple 40G links can be trunked together to form 160G or 320G core links. Each Summit X670 supports 48 front-panel 10G Ethernet links. Given the typical dual Ethernet connected server configuration, each Summit X670 provides core network access for 24 servers. Each BlackDiamond X8 supports 192 40G and 768 10G Ethernet ports. Scaling a fully meshed network core is limited by the (Node)² link requirement. This introduces topology challenges that TRILL addresses.
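To make the scaling limit concrete, a full mesh of N core nodes requires N(N-1)/2 point-to-point links, which grows quadratically with the node count. A minimal sketch of this arithmetic (the node counts shown are illustrative):

```python
def full_mesh_links(nodes: int) -> int:
    """Point-to-point links needed so every node connects directly to every other."""
    return nodes * (nodes - 1) // 2

for n in (4, 8, 16, 32):
    print(f"{n:>2} core nodes -> {full_mesh_links(n):>3} links")
# 4 -> 6, 8 -> 28, 16 -> 120, 32 -> 496: link count grows quadratically
```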

A large, flat Layer 2 network that allows any-to-any connectivity among many devices at high interconnect speeds can be implemented with a single VLAN domain. However, to prevent loops, Layer 2 protocols must be introduced that limit network link usability. TRILL retains the benefits of Layer 2 networks while adding the capabilities of IP routing, including building and maintaining a complete link-state network topology. TRILL also supports ECMP next-hop route lookup and packet forwarding. Similar to IS-IS and OSPF, TRILL uses a modified Hello protocol to discover neighbors and exchange capability information.
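The two-way Hello handshake idea can be sketched as below. This is a conceptual sketch only: the class and field names are invented for illustration and do not follow the actual TRILL Hello PDU format.

```python
from dataclasses import dataclass, field

@dataclass
class Hello:
    sender: str                              # system ID of the advertising RBridge
    seen: set = field(default_factory=set)   # neighbors the sender has heard from

class Node:
    def __init__(self, name: str):
        self.name = name
        self.heard = set()      # senders we have received Hellos from
        self.adjacent = set()   # confirmed two-way adjacencies

    def receive(self, hello: Hello):
        self.heard.add(hello.sender)
        # Two-way check: adjacency forms only once the neighbor
        # reports that it has heard us as well.
        if self.name in hello.seen:
            self.adjacent.add(hello.sender)

    def hello(self) -> Hello:
        return Hello(self.name, set(self.heard))

a, b = Node("A"), Node("B")
b.receive(a.hello())   # B hears A (one-way so far)
a.receive(b.hello())   # A hears B; B's Hello lists A, so A confirms two-way
b.receive(a.hello())   # A's next Hello lists B, so B confirms two-way
print(a.adjacent, b.adjacent)   # {'B'} {'A'}
```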

By combining the useful attributes of Layer 3 with the simplicity of Layer 2, TRILL addresses the data center core requirements better than either Layer 2-only or Layer 3-only network designs.

Figure: Shortest Path Forwarding Example

TRILL uses the link state path computation, known as the Dijkstra Algorithm, to calculate the best path route based on link cost to every node in the network. Each node makes an independent decision on where to send a packet based on the packet's destination egress node. In the quad-core network layout shown above, interconnect links have been added, and the associated link costs are shown in the figure.

If a packet enters the network at node F and egresses the network at node H, the best path is F > G > H with a cost of 16. If the packet enters the network at node F and egresses at node N, the best path is F > I > K > N with a cost of 28. This means that multiple paths through the network are utilized.
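The following sketch shows the standard Dijkstra computation each node performs. The topology and link costs are hypothetical stand-ins for the figure, chosen so that H is reached from F at cost 16 and N at cost 28, matching the examples above.

```python
import heapq

def dijkstra(graph, source):
    """Best-path cost from source to every node; graph maps node -> {neighbor: link_cost}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for nbr, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                heapq.heappush(heap, (new_cost, nbr))
    return dist

# Hypothetical costs reproducing the text's examples:
# F -> G -> H costs 16, and F -> I -> K -> N costs 28.
graph = {
    "F": {"G": 8, "I": 10},
    "G": {"F": 8, "H": 8},
    "H": {"G": 8},
    "I": {"F": 10, "K": 9},
    "K": {"I": 9, "N": 9},
    "N": {"K": 9},
}
print(dijkstra(graph, "F"))  # {'F': 0, 'G': 8, 'I': 10, 'H': 16, 'K': 19, 'N': 28}
```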

Another advantage of using a link state algorithm to forward traffic is that multipath forwarding is also possible. Multipath forwarding allows the ingress node to forward packets along multiple paths to reach the destination as long as they are all considered to be the best path. Using the following diagram as an example, traffic that ingresses node I and egresses node L can follow I > A > B > J > L or I > K > C > D > L, since both have a link path cost of 42. The ingress node has two next-hop peers that can reach the egress node and may choose either path to send the packet. Packet reordering must be prevented, so the ingress node uses a hashing algorithm to select the next-hop peer. The hashing algorithm operates on the encapsulated packet header so that individual flows always follow the same path.
Figure: Edge ECMP Unicast Forwarding
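A minimal sketch of the flow-hashing idea, assuming a simple tuple of header fields as the flow key. Real switches hash the encapsulated packet header in hardware; the fields and hash function here are illustrative stand-ins for that behavior.

```python
import hashlib

def pick_next_hop(flow_key: tuple, next_hops: list) -> str:
    """Deterministically map a flow to one equal-cost next hop, so every
    packet of the same flow follows the same path (no reordering)."""
    digest = hashlib.sha256(repr(flow_key).encode()).digest()
    return next_hops[digest[0] % len(next_hops)]

equal_cost_peers = ["A", "K"]  # from ingress node I, both reach egress L at cost 42
flow = ("10.0.0.5", "10.0.1.9", 6, 49152, 80)  # src IP, dst IP, proto, ports
print(pick_next_hop(flow, equal_cost_peers))   # same peer for every packet of this flow
```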
As with IP routing, each hop along the path performs its own next-hop lookup independent of the previous hops. This means that at each hop along the path, there may be multiple equal-cost paths that were not available to the previous hops. This provides yet another level of load sharing that is not available in Layer 2 networks and, as an aside, is not supported in Shortest Path Bridging (SPB). An example of this is shown in the following diagram. The ingress node is M and the egress node is B. There is only one shortest best path from M's perspective to reach B, and that's through the next-hop node of C. Once the TRILL packet reaches C, C performs its own lookup to reach B and finds that there are two equal-cost best paths: one through node A and the other through node D. C then performs a hash on the encapsulated packet header to choose either next-hop node A or node D. Thus, some flows from M to B take the path M->C->A->B and some take the path M->C->D->B.
Figure: Intermediate Hop ECMP Unicast Forwarding
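The hop-by-hop behavior can be sketched as follows. The per-node next-hop tables are hypothetical, standing in for what each RBridge would compute from its own shortest-path calculation, and the hash is the same illustrative stand-in as above.

```python
import hashlib

# Hypothetical per-node equal-cost next-hop sets toward egress B.
next_hops_to_B = {
    "M": ["C"],        # only one shortest path from M's perspective
    "C": ["A", "D"],   # two equal-cost paths open up at C
    "A": ["B"],
    "D": ["B"],
}

def forward_to_B(ingress, flow_key):
    """Walk hop by hop; each node does its own independent lookup."""
    path, node = [ingress], ingress
    while node != "B":
        candidates = next_hops_to_B[node]
        digest = hashlib.sha256(repr(flow_key).encode()).digest()
        node = candidates[digest[0] % len(candidates)]
        path.append(node)
    return path

for port in (49152, 49153, 49154, 49155):
    flow = ("10.0.0.1", "10.0.2.2", 6, port, 80)
    # Each flow follows M->C->A->B or M->C->D->B, chosen by the hash at C.
    print(forward_to_B("M", flow))
```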
Note

With respect to ECMP TRILL forwarding, the two directions of a bi-directional packet flow may not take the same path. This is an artifact of the hash algorithm operating on encapsulated packet headers, which are formatted differently in each direction, and of the specific hash algorithm implemented.
TRILL addresses the network scaling and data forwarding aspects of network access flexibility through a few key concepts. When TRILL is deployed in conjunction with data center virtualization and VLAN registration protocols, the network benefits of deploying VLANs can be realized while retaining the plug-and-play network access flexibility of using a single VLAN. Within the TRILL core, TRILL network VLANs are used to carry encapsulated access Ethernet data traffic. The encapsulated packet's IEEE 802.1Q tag is carried across the TRILL network, extending a VLAN across the TRILL network. The TRILL packet's outer tag identifies the network VLAN, and the encapsulated inner tag identifies the Access VLAN.
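The tag relationship can be pictured schematically as below. This is not a wire-accurate TRILL header layout; the class and field names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class InnerFrame:
    dst_mac: str
    src_mac: str
    access_vlan: int      # original 802.1Q tag from the access port (inner tag)

@dataclass
class TrillPacket:
    network_vlan: int     # outer tag: identifies the TRILL network VLAN
    ingress_rbridge: str  # encapsulating RBridge
    egress_rbridge: str   # decapsulating RBridge
    payload: InnerFrame   # inner tag rides across the core untouched

frame = InnerFrame("00:00:5e:00:53:01", "00:00:5e:00:53:02", access_vlan=100)
pkt = TrillPacket(network_vlan=4000, ingress_rbridge="F",
                  egress_rbridge="H", payload=frame)
print(pkt.payload.access_vlan)  # 100: the Access VLAN is preserved end to end
```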

Logically, the data center network can be considered to have two independent sets of 4K VLANs: one set for the access devices and one set for the TRILL core network. Each TRILL node, or Routing Bridge (RBridge), has a configured set of Access VLAN IDs for which it provides traffic forwarding. To maintain full plug-and-play capability, the VLAN access list encompasses the entire 4K VLAN ID space. Native Ethernet tagged traffic received on a VLAN whose VLAN ID matches an ID in the access tag space is encapsulated and forwarded across the TRILL network, as shown in the following figure:

Figure: VLAN Interconnect Across TRILL Network
Extending Access VLANs across the TRILL core network means that there are potentially multiple access points into the core. This multipoint topology requires multicast forwarding rules to deliver flood packets to each access point. Layer 2 networks use MSTP to block ports such that one copy of each flood packet reaches every node for every VLAN. This solution has a number of deficiencies, including maintaining multiple spanning trees and requiring every flood packet on a VLAN to take the same path. TRILL uses multipath distribution trees, but only one tree is required to support all 4K Access VLANs. Additional TRILL multipath distribution trees can be deployed to improve flood packet link utilization in the core.
Note

Although the TRILL protocol supports multiple distribution trees, they are not supported in the initial TRILL release.
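Since every RBridge must derive the identical tree from the same topology, the computation can be sketched as a shortest-path tree rooted at the highest-priority RBridge. The topology below is hypothetical, and all links are treated as equal cost for simplicity.

```python
from collections import deque

# Hypothetical core topology; every RBridge holds the same view.
adjacency = {
    "F": ["E", "G", "K"],
    "E": ["F"],
    "G": ["F", "H", "L"],
    "H": ["G"],
    "K": ["F", "M"],
    "L": ["G"],
    "M": ["K"],
}

def distribution_tree(root):
    """Parent pointers of a shortest-path tree rooted at `root`
    (BFS, i.e. all links treated as equal cost)."""
    parent, queue = {root: None}, deque([root])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent

# Every RBridge computes the same tree rooted at the highest-priority node F.
print(distribution_tree("F"))
# {'F': None, 'E': 'F', 'G': 'F', 'K': 'F', 'H': 'G', 'L': 'G', 'M': 'K'}
```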

Optionally, each RBridge can restrict forwarding of packets with VLAN tags to only those tree adjacencies that have downstream matching Access VLANs. This type of packet filtering, referred to as VLAN pruning, eliminates unnecessary packet forwarding within the TRILL core. Distribution trees are bi-directional and can be rooted at any node. The previous figure shows a TRILL network with VLAN X attached at RBridge nodes E, F, H, L, and M.

One potential distribution tree is shown in the following figure. Distribution trees may be rooted at multiple RBridges; in this example, RBridge F is configured with the highest distribution tree priority and is therefore used by all RBridges in the TRILL network to forward flood and multicast traffic. All RBridges in the network must maintain the same topological view and be able to calculate the same distribution trees. VLAN X access RBridges are colored green. For VLAN X, RBridges F, K, G, and L are not required to forward traffic to some or all of their distribution tree adjacencies. This effectively prunes the distribution tree, reducing packet replication and unnecessary traffic forwarding. Pruned RBridge nodes that will not receive VLAN X traffic are colored orange. If distribution tree pruning is not employed, leaf RBridges with no locally configured Access VLAN X must still discard any VLAN X traffic they receive.

Figure: Logical Forwarding Tree Diagram
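A sketch of the pruning decision, using a hypothetical tree matching the narrative (VLAN X attached at E, F, H, L, and M): a flood packet is replicated only toward tree adjacencies that lead to at least one RBridge with a local VLAN X attachment.

```python
# Bi-directional distribution tree links (hypothetical topology).
tree_adjacencies = {
    "F": ["E", "G", "K"],
    "G": ["F", "H", "L"],
    "K": ["F", "M"],
    "E": ["F"], "H": ["G"], "L": ["G"], "M": ["K"],
}
vlan_x_access = {"E", "F", "H", "L", "M"}   # RBridges with Access VLAN X

def has_downstream_interest(node, came_from):
    """True if `node`, or any tree branch behind it, attaches VLAN X."""
    if node in vlan_x_access:
        return True
    return any(has_downstream_interest(nbr, node)
               for nbr in tree_adjacencies[node] if nbr != came_from)

def flood_ports(node, came_from=None):
    """Tree adjacencies a VLAN X flood packet is replicated toward."""
    return [nbr for nbr in tree_adjacencies[node]
            if nbr != came_from and has_downstream_interest(nbr, node)]

print(flood_ports("F"))        # ['E', 'G', 'K']: every branch leads to VLAN X
print(flood_ports("G", "F"))   # ['H', 'L']: both attach VLAN X locally
```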

TRILL also improves load sharing on the access interfaces. VLANs may optionally be connected to multiple RBridges, as shown in the previous figure. The Designated RBridge determines which node provides forwarding access for each attached VLAN; the RBridges providing packet forwarding are referred to as appointed forwarders. Various VLAN distribution algorithms can be employed, with the result that multiple RBridges can each provide designated forwarding for a mutually exclusive set of the shared Access VLANs. If one of the RBridges fails, one of the remaining active RBridges assumes the forwarding role, as directed by the Designated RBridge and shown below:

Figure: RBridge Appointed Forwarder for Access VLAN
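A sketch of the appointment idea, assuming a simple round-robin distribution of VLANs across the attached RBridges (one possible choice among the various VLAN distribution algorithms the text mentions).

```python
def appoint_forwarders(rbridges, vlans):
    """Map each Access VLAN to exactly one appointed forwarder,
    spreading the VLANs round-robin across the attached RBridges."""
    return {vlan: rbridges[i % len(rbridges)]
            for i, vlan in enumerate(sorted(vlans))}

attached = ["RB1", "RB2"]          # hypothetical RBridges sharing the access VLANs
vlans = [100, 200, 300, 400]
print(appoint_forwarders(attached, vlans))
# {100: 'RB1', 200: 'RB2', 300: 'RB1', 400: 'RB2'}

# On failure, the Designated RBridge re-appoints among the survivors:
print(appoint_forwarders(["RB2"], vlans))   # RB2 takes over all four VLANs
```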