BGP Fast Convergence with Delayed Route Calculation

BGP delayed route calculation postpones the BGP BEST-path selection until BGP has received the route update information from its RIB-IN peers. This reduces the number of times that the BGP BEST-path decision process runs and makes the computation of outgoing route updates (RIB-OUT) more efficient.

Delayed route calculation overview

By default, the BGP process accepts incoming routing updates, computes the BEST-path selection, and advertises this selection to its peers immediately. When a BGP router comes up after a reload, BGP starts its BEST-path selection before it has all of the route update information from its RIB-IN peers. As a result, BGP processes incoming updates and computes the BEST-path several times. With a large volume of routing updates from multiple RIB-IN peers, BGP becomes inefficient at calculating the BEST-path and generating RIB-OUT. Complex route policies for RIB-OUT peers, such as prefix lists or route-map filters, add significant further delay in generating RIB-OUT.

BGP delayed route calculation improves convergence by taking advantage of the following BGP behavior. A BGP peer starts propagating its route updates after it sends an initial explicit keepalive message to indicate that the session is up (ESTABLISHED). Most implementations do not send a second explicit keepalive message until the peer has finished propagating its route updates. The interval between the initial keepalive message and the second keepalive message is therefore the period in which the peer propagates its bulk route updates. Based on certain heuristics and events, the BGP process can delay its BEST-path selection for this period. In this delay (or learning) phase, BGP does not make BEST-path decisions, does not install routes in the RIB or hardware, and does not generate RIB-OUT.

BGP peer learning phase

When a BGP router comes up after a reload, each BGP peer is placed in the learning phase when the first keepalive is received from that peer. The peer sends this keepalive when the peer session transitions from OPEN_CONFIRM to ESTABLISHED.

A peer that is in the learning phase is shown with a “$” appended to the “ESTAB” state (for example, the show ip/ipv6 bgp summary command displays ESTAB$).

A peer that comes up as ESTABLISHED is placed in the learning phase only if the VRF and Subsequent Address Family Identifier (SAFI) instance is in read-only mode. The read-only mode timer for a VRF and SAFI instance starts when the first peer comes up in that instance and is governed by the delay settings in Table 1.
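
The placement rule above can be sketched as a small state model. This is a minimal illustration, not the router's implementation: the class name `InstanceState`, its fields, and the method `on_first_keepalive` are hypothetical, and the sketch assumes that an instance starts in read-only mode after a reload.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class InstanceState:
    """Hypothetical state for one VRF and SAFI instance."""
    read_only: bool = True                   # assumption: True after a reload
    read_only_start: Optional[float] = None  # set when the first peer comes up
    learning_peers: Set[str] = field(default_factory=set)

    def on_first_keepalive(self, peer: str, now: float) -> str:
        """Handle the keepalive that brings a session to ESTABLISHED.

        Returns the state label that a summary display would show.
        """
        if not self.read_only:
            # Instance already left read-only mode: no learning phase.
            return "ESTAB"
        if self.read_only_start is None:
            # The first peer in the instance starts the read-only timer.
            self.read_only_start = now
        self.learning_peers.add(peer)
        return "ESTAB$"
```

In this sketch, only the first peer sets `read_only_start`; peers that come up later join the learning phase without restarting the timer, and a peer that comes up after the instance has left read-only mode is never placed in the learning phase.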

Table 1. BEST-path selection delay

Setting: min-delay
Description: The minimum time that a peer spends in read-only mode and by which the BGP BEST-path selection is delayed. The default is 180 seconds. This time allows all BGP peers in the VRF and SAFI instance to come up and participate in BGP read-only mode.

Setting: max-delay
Description: The maximum time that a peer spends in read-only mode and by which the BGP BEST-path selection is delayed. The default is 300 seconds.

Setting: msg-idle-time
Description: The number of seconds to wait for an update from a peer before moving the peer out of the learning phase. The default is 2 seconds.

The key to fast convergence is detecting the end of the learning phase for each peer as early as possible. A peer is moved out of the learning phase when one of the following events is detected. When all of the peers in the VRF and SAFI instance have completed the learning phase, the route calculation is scheduled immediately for this instance.

Table 2. End-of-learning-phase events

Event: Second keepalive is received
Description: Probed when a keepalive is received from the peer.

Event: One of the following:
  • No update message is received from the peer
  • The time between successive update messages is greater than the message idle time
Description:
  • Probed when an update message is received from the peer.
  • Probed in the periodic timer, every 5 seconds.

Event: Minimum time in read-only mode
Description: The BGP BEST-path selection is delayed by at least the minimum time. This time allows all BGP peers in the instance to participate in BGP read-only mode.

Event: Maximum time in read-only mode
Description:
  • A peer can send route updates continuously. Reaching the maximum time forcefully ends the learning phase for such a peer.
  • A timer starts when the first peer in the VRF and SAFI instance is placed in the learning phase. When the timer starts, 300 seconds are allowed (by default) for all peers to complete their learning phase.
  • Probed in a periodic timer every 5 seconds. Peers that are still in the learning phase 300 seconds after the learning phase start time are forcefully moved out of the learning phase.

Event: A peer flaps, is removed through configuration, or is shut down (administratively disabled)
Description: The learning phase for the peer is forcefully ended.
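
The per-peer checks in Table 2 can be sketched as a single decision function. This is an illustrative sketch under stated assumptions, not the actual implementation: the names `DelayConfig`, `Peer`, and `end_of_learning_reason` are hypothetical, the idle check measures time since the last update (or since the first keepalive if no update has arrived), and the maximum-time check is measured from the start of read-only mode for the instance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DelayConfig:
    """Delay settings in seconds; defaults are the values quoted in the text."""
    min_delay: float = 180.0
    max_delay: float = 300.0
    msg_idle_time: float = 2.0


@dataclass
class Peer:
    """Hypothetical per-peer bookkeeping for the learning phase."""
    learning_start: float                # time the first keepalive arrived
    last_update: Optional[float] = None  # time of the most recent update
    keepalives_seen: int = 1             # the first keepalive starts learning
    session_up: bool = True              # False after flap, removal, shutdown


def end_of_learning_reason(peer: Peer, now: float, read_only_start: float,
                           cfg: DelayConfig) -> Optional[str]:
    """Return why the peer leaves the learning phase, or None if it stays."""
    if not peer.session_up:
        return "session-down"
    if peer.keepalives_seen >= 2:
        return "second-keepalive"
    # Assumption: with no update yet, idle time counts from the first keepalive.
    if peer.last_update is not None:
        last_activity = peer.last_update
    else:
        last_activity = peer.learning_start
    if now - last_activity > cfg.msg_idle_time:
        return "idle"
    # A peer that keeps sending updates is cut off at the maximum time.
    if now - read_only_start >= cfg.max_delay:
        return "max-delay"
    return None
```

A caller would invoke this function at each probing point (keepalive received, update received, periodic timer, session state change) and treat any non-None result as the end of the learning phase for that peer.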

At each probing point (keepalive processing, update processing, the periodic timer, and a peer session state change), BGP checks whether all BGP peers in the VRF and SAFI instance have completed their learning phase. If they have, BGP BEST-path selection is scheduled immediately for the instance.
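
The instance-level check at each probing point can be sketched as follows. The class `Instance` and its methods are hypothetical names. The sketch also assumes one reading of the text: that the minimum delay gates scheduling even after every peer has finished learning.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Instance:
    """Hypothetical state for one VRF and SAFI instance in read-only mode."""
    read_only_start: float
    min_delay: float = 180.0  # seconds, default quoted in the text
    learning_peers: Set[str] = field(default_factory=set)
    best_path_scheduled: bool = False

    def peer_done(self, peer: str, now: float) -> bool:
        """Mark a peer as out of the learning phase, then probe the instance."""
        self.learning_peers.discard(peer)
        return self.probe(now)

    def probe(self, now: float) -> bool:
        """Schedule BEST-path selection once no peer is still learning.

        Assumption: scheduling also waits until min_delay has elapsed.
        Returns True once BEST-path selection has been scheduled.
        """
        if self.best_path_scheduled:
            return True
        all_done = not self.learning_peers
        min_elapsed = now - self.read_only_start >= self.min_delay
        if all_done and min_elapsed:
            self.best_path_scheduled = True
        return self.best_path_scheduled
```

For example, with two peers that finish learning at 10 and 20 seconds, `peer_done` returns False both times under this reading, and a later periodic `probe` at 180 seconds schedules the selection.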