BGP Test Lab – Part 13 – BGP States

Any router that uses BGP must establish a peering with each of its neighbour routers before it can exchange any routing information. To establish this peering, the BGP router on each side of the connection must go through a number of stages, called BGP states, before it can exchange routes. A key point is that during the connection process both routers listen for, and attempt to make, a connection to their neighbour; for the TCP connection that ends up carrying the session, one router is the Active side (it opened the connection) and the other is the Passive side (it accepted it). Once a router determines there is something listening at the other end, it flows through these states until it reaches “Established”, assuming all is well and there are no connectivity or misconfiguration issues along the way. Only when the router is in the “Established” state is any routing information exchanged.

So how do these states actually flow? BGP uses what is known as an FSM (Finite State Machine). An FSM is a model of a machine that can be in exactly one of a finite number of states at any given time; i.e. it’s always in one of the states, and only one at a time.
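To make the “exactly one state at a time” idea concrete, here is a minimal sketch in Python. The six state names come from this article; the PeerSession class and its methods are hypothetical names used only for illustration, not any router’s real API.

```python
from enum import Enum


class BgpState(Enum):
    """The six BGP session states named in this article."""
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


class PeerSession:
    """Hypothetical holder for one neighbour's state (illustration only)."""

    def __init__(self) -> None:
        # Every peering starts in Idle.
        self.state = BgpState.IDLE

    def move_to(self, new_state: BgpState) -> None:
        # Assignment replaces the old state, so the session is never in two states.
        self.state = new_state

    def can_exchange_routes(self) -> bool:
        # Routes are only exchanged once the session is Established.
        return self.state is BgpState.ESTABLISHED
```

Because the session stores a single value, it cannot be in two states at once, which is the defining property of an FSM.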

State Flow Diagram

The state flow diagram shows the states and the flows of the BGP FSM. It’s a bit complicated, but we’ll dig into each state in the explanations below.

BGP States

These are the BGP states that every BGP router transitions through when forming a neighbour peering.

Idle

The initial state of a BGP neighbour relationship. In this state BGP is waiting for a “start event”, which occurs when a new BGP Neighbour is configured or when an established BGP Neighbour peering is reset (either manually or, for example, by a loss of the link).

When BGP gets a start event it resets the ConnectRetry timer, then initiates a TCP connection to the remote BGP Neighbour. While doing this it also listens for an incoming BGP connection, in case the BGP Neighbour tries to establish a connection to this BGP peer.

Having started the connection attempt, BGP moves to the Connect state. If something then goes wrong, such as the link failing during the attempt or a misconfiguration causing the session setup to fail, it returns to the Idle state and waits to try again.

Connect

In the “Connect” state BGP is waiting for the TCP three-way handshake to complete. The connection attempt has already been started, which is why it moved from “Idle” to “Connect”, but the handshake has not yet finished.

If the handshake succeeds, BGP sends an OPEN message to the neighbour and moves to the “OpenSent” state; if it fails, BGP goes to the “Active” state.

If the ConnectRetry timer expires, BGP stays in this state, resets the ConnectRetry timer and initiates a new TCP three-way handshake. But if anything else happens, such as the link dropping, it starts over at the “Idle” state.

Active

You’ll recall that BGP enters the “Active” state when the three-way handshake fails. Here it tries again to complete a TCP three-way handshake with the BGP neighbour.

As in the “Connect” state, if the TCP three-way handshake succeeds this time, BGP moves to the “OpenSent” state; if the ConnectRetry timer expires first, it moves back to the “Connect” state.

During this state, BGP will also be listening for incoming connections from the remote BGP neighbour just in case it tries to make a connection.

But if anything else happens such as the link dropping, then it starts over at the “Idle” state again.
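The transitions described so far for “Idle”, “Connect” and “Active” can be sketched as a small lookup table. This is a simplified illustration rather than a full BGP implementation: the event names (start, tcp_success, tcp_failure, connect_retry_expired) are hypothetical labels for the situations described above.

```python
# Event names are hypothetical labels for the situations described above.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_success"): "OpenSent",
    ("Connect", "tcp_failure"): "Active",
    ("Connect", "connect_retry_expired"): "Connect",
    ("Active", "tcp_success"): "OpenSent",
    ("Active", "connect_retry_expired"): "Connect",
}


def next_state(state: str, event: str) -> str:
    """Return the next state; anything not listed falls back to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


# Walk one possible path: the first handshake fails, the retry succeeds.
state = "Idle"
for event in ["start", "tcp_failure", "connect_retry_expired", "tcp_success"]:
    state = next_state(state, event)
    print(event, "->", state)
# prints:
# start -> Connect
# tcp_failure -> Active
# connect_retry_expired -> Connect
# tcp_success -> OpenSent
```

The default of “Idle” in next_state mirrors the “if anything else happens” rule: any event the table does not list sends the session back to the start.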

OpenSent

The “OpenSent” state is where the magic really starts to happen. BGP is waiting for the OPEN message from the remote BGP Neighbour. The OPEN message contains more detailed BGP session information, which is checked here. If something is amiss, for example the remote router announces a different AS number from the one the local router has configured for it, BGP sends a notification message and goes back to the “Idle” state.

During the OpenSent state, BGP also decides if this is an eBGP or iBGP connection, i.e. is it talking to a router in a different AS, or a router in the same AS.

Assuming the connection (and session information) is okay, BGP starts sending periodic keepalive messages (restarting its keepalive timer each time) and moves to the “OpenConfirm” state. Some negotiation also takes place between the two BGP routers in the “OpenSent” state, for example of the Hold Time, where the lower of the two routers’ values is used.

If the TCP session were to fail, perhaps because of an intermittent/transient connection fault in between, BGP returns to the “Active” state.

If anything else happens, then BGP sends a notification, then returns to the “Idle” state once again.
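The checks made in “OpenSent” can be sketched roughly as follows. This is an illustration under assumptions, not a real BGP implementation: the OpenMessage class and process_open function are hypothetical names, a real OPEN message carries more fields than the two shown, and the AS numbers in the usage lines are arbitrary private values.

```python
from dataclasses import dataclass


@dataclass
class OpenMessage:
    """Only the OPEN fields this sketch needs (a real OPEN carries more)."""
    my_as: int       # AS number the sender says it belongs to
    hold_time: int   # hold time the sender proposes, in seconds


def process_open(local_as: int, expected_peer_as: int,
                 local_hold_time: int, received: OpenMessage):
    """Return (next_state, details) for an OPEN received in OpenSent."""
    # A peer announcing an AS other than the one configured is rejected.
    if received.my_as != expected_peer_as:
        return "Idle", {"notification": "bad peer AS"}

    # Same AS on both sides means iBGP, a different AS means eBGP.
    session_type = "iBGP" if received.my_as == local_as else "eBGP"

    # Both sides settle on the smaller of the two proposed hold times.
    hold_time = min(local_hold_time, received.hold_time)

    return "OpenConfirm", {"type": session_type, "hold_time": hold_time}


# Local router in AS 65001, neighbour configured as AS 65002.
print(process_open(65001, 65002, 180, OpenMessage(my_as=65002, hold_time=90)))
# prints: ('OpenConfirm', {'type': 'eBGP', 'hold_time': 90})
print(process_open(65001, 65002, 180, OpenMessage(my_as=65003, hold_time=180)))
# prints: ('Idle', {'notification': 'bad peer AS'})
```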

OpenConfirm

During “OpenConfirm” BGP is waiting for a keepalive message from the remote BGP router. Only once it receives a keepalive does it move to the “Established” state, and when that happens the Hold Timer is reset. It’s worth noting that the BGP Hold Timer is an important timer: it is how a BGP router determines whether the remote router has failed. Its default here is 180 seconds, three times the default 60 second keepalive interval (defaults vary between vendors). If the router receives neither a keepalive nor an update within those 180 seconds, it declares that the neighbour has failed. Note that it’s possible for an interface to be up but not passing traffic; in that case this timer kicks in, and when it expires it brings down the BGP neighbour peering. If the interface itself goes down, this can happen more quickly.

If a notification message is received, BGP moves immediately to the “Idle” state once more.

Established

We’re there! The BGP neighbour adjacency is complete, and the BGP routers now exchange their routing information in update messages, with keepalives sent periodically in between. Each time a keepalive message or an update message (with route information) is received, the Hold Timer is reset back to 0, then starts counting up again. If a notification message is received, BGP moves to the “Idle” state. Otherwise, the two routers stay in this state indefinitely until something disrupts the connection, at which point the state falls back to “Idle” and the process starts again.
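The Hold Timer behaviour can be sketched as a simple counter. This is a minimal illustration, assuming the 180 second value quoted in this article and a timer that counts up in whole seconds; the HoldTimer class and its method names are hypothetical, and a real router would drive this from a clock rather than manual tick calls.

```python
class HoldTimer:
    """Counts seconds since the last keepalive or update (illustrative)."""

    def __init__(self, hold_time: int = 180) -> None:
        self.hold_time = hold_time  # 180 s is the value quoted in this article
        self.elapsed = 0

    def tick(self, seconds: int = 1) -> None:
        # Time passes with nothing heard from the neighbour.
        self.elapsed += seconds

    def message_received(self) -> None:
        # A keepalive or update resets the count back to 0.
        self.elapsed = 0

    def expired(self) -> bool:
        # Once the count reaches the hold time, the neighbour is declared dead.
        return self.elapsed >= self.hold_time


timer = HoldTimer()
timer.tick(60)
timer.message_received()   # a keepalive arrives after 60 seconds
timer.tick(179)
print(timer.expired())     # prints: False
timer.tick(1)
print(timer.expired())     # prints: True
```

With keepalives arriving every 60 seconds the counter never gets near 180, so it takes three missed keepalives in a row before the peering is torn down.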

One problem that can occur in this state, which was alluded to in the previous state description, is the identification of a dead router. Perhaps the remote router has failed but the interface/link is still up; it’s just not doing anything any more. The Hold Timer is counting up, but until it expires no routes change, even though the link can’t be used because the remote router is not responding. There is a reason for this tolerance: within a large network like the Internet, brief gaps in communication do occur, and if route updates were triggered every time the littlest thing happened, routers across the Internet would be swamped with constant updates.

However, this absence of traffic flow without a routing update can cause service unavailability in certain circumstances. Because of this you can use things like BFD (Bidirectional Forwarding Detection), which we discuss in a later article. BFD can very rapidly (and safely) detect a problem like this and trigger BGP to remove the faulty route more quickly.

Example

Showing each of the states can be a bit tricky, because routers transition through them quickly. The screenshot below shows Router A’s four BGP Neighbour peerings. As you can see, they are all in the ESTABLISHED state, meaning everything is up, routes are being exchanged and the network is stable.

If we examine one of the BGP neighbours in more detail we can see some of the settings negotiated, such as the Hold Timer, which is 180 seconds.

We can also see the last error: the peer’s interface was shut down, so updates were not received, and the router tore down the neighbour peering and returned to the IDLE state.

I’ve disabled Port 2 on Router A, so I should see the BGP Neighbour peering to Router C move to the IDLE state. It can take some time for this to happen (which is why BFD can be helpful).

If we examine this neighbour in more detail, we can see that the state has moved to ACTIVE, where it was previously CONNECT (this happened between taking the screenshot above and the one below). You can also see that the reason for the state change was “Hold Timer Expired (TX)”. The TX is interesting: it means that, from this router’s point of view, it was transmitting but the neighbour was not responding. If it instead said RX, it would mean that the neighbour perhaps received the keepalive message but didn’t reply fast enough (i.e. within the timer), or that something like one-way traffic or packet loss was stopping the replies coming back.

Conclusion

We’ve reviewed all the BGP states to see how the flow through them works. Knowing these states, how they relate, and what kinds of things can go wrong at each one can help you troubleshoot issues.

Essentially, if your peering is working fine and then suddenly moves to the IDLE state, it points to something like a link failure, a router failure or an administrator changing configuration at the remote end.
