Skip to content

Bgp: add Adj-RIB-In support and route withdrawal failover#1110

Open
giovanninardini wants to merge 12 commits into
inet-framework:masterfrom
giovanninardini:topic/bgp-adjribin
Open

Bgp: add Adj-RIB-In support and route withdrawal failover#1110
giovanninardini wants to merge 12 commits into
inet-framework:masterfrom
giovanninardini:topic/bgp-adjribin

Conversation

@giovanninardini

@giovanninardini giovanninardini commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR adds Adj-RIB-In based route tracking to the BGP implementation so routes learned from peers are retained separately from the selected Loc-RIB route. When a peer withdraws a prefix or a BGP session fails, the router now removes only the affected learned paths, reruns best-path selection for those prefixes, and either installs an alternate route or advertises a withdrawal.

Details

  • Add an Adj-RIB-In table to BgpRouter for all peer-learned routes.
  • Rework UPDATE processing to:
    • handle multiple IPv4 NLRI entries,
    • handle IPv6 MP_REACH_NLRI and MP_UNREACH_NLRI,
    • track affected prefixes,
    • rerun best-path selection only for changed prefixes.
  • Add route source metadata to BGP route entries, including source session and peer address.
  • Add explicit BGP withdrawal sending for IPv4 and IPv6.
  • Withdraw routes learned from a session when established sessions fail or timers expire.
  • Guard UPDATE, KEEPALIVE, and WITHDRAW sends when the TCP socket is no longer connected.
  • Stop BGP session timers when the FSM transitions back to Idle.
  • Restart BGP after lifecycle startup through the timer path so interfaces can come back up first.
  • Add a lifecycle stage for network interface configuration and move IPv6 interface reconfiguration before IPv6 address assignment.
  • Bind outgoing BGP TCP sockets to the configured local BGP address.
  • Ignore stale TCP messages for sockets that have already been replaced after restart/failure.

Examples

Adds two new diamond-topology examples that demonstrate failover, failback, and a second failure:

  • examples/bgpv4/BgpDiamondFailover
  • examples/bgpv4/BgpDiamondFailover6

Both examples prefer the A-B-C path initially, fail over to A-D-C when router B goes down, fail back after B restarts, and then exercise failover again after a crash.

Testing

  • Updated existing BGP fingerprints affected by the new route-selection behavior.
  • Added fingerprint coverage for the new IPv4 and IPv6 diamond failover examples.

Open in Devin Review

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 potential issues.

Open in Devin Review

Comment on lines +37 to +38
for (auto& elem : adjRibInTable)
delete elem;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Loc-RIB route entries leak on router destruction because they are never freed

The Loc-RIB entries are never freed in the destructor (BgpRouter::~BgpRouter at src/inet/routing/bgpv4/BgpRouter.cc:32-44) despite being separately-allocated clones, so every BGP route installed during the simulation leaks on shutdown or crash.

Impact: Memory leaks accumulate on each router lifecycle (shutdown/crash/restart), worsening in long-running simulations.

Loc-RIB ownership changed from shared to exclusive without matching destructor update

In the old code, bgpRoutingTable entries and rt (IP routing table) entries were the SAME object — rt->deleteRoute() freed them, so the destructor didn't need to. The PR changes this: installBestRouteForPrefix at src/inet/routing/bgpv4/BgpRouter.cc:973 creates a clone for the Loc-RIB (selected = cloneRoute(best)) and a SEPARATE clone for rt (routeForRt = cloneRoute(selected) at line 992). The Loc-RIB clone is pushed into bgpRoutingTable at line 983 and is now exclusively owned by BgpRouter.

Similarly, addToAdvertiseList at line 243-248 pushes locally-originated entries into bgpRoutingTable.

The destructor frees adjRibInTable entries (line 37-38) and _prefixListINOUT entries (line 40-41) but never frees bgpRoutingTable entries. Bgp::removeBgpRoutes() only removes routes from the IP routing table (the other clones), not from bgpRoutingTable.

Suggested change
for (auto& elem : adjRibInTable)
delete elem;
for (auto& elem : adjRibInTable)
delete elem;
for (auto& elem : bgpRoutingTable)
delete elem;
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

ospfModule->insertExternalRoute(ie->getInterfaceId(), ospfNetAddr);
}

updateSendProcess(oldEntry == nullptr ? NEW_ROUTE_ADDED : ROUTE_DESTINATION_CHANGED, selected->getSourceSessionId(), selected);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Dangling pointer used as a boolean flag after the route it points to is freed

A previously-valid pointer is compared to null (oldEntry == nullptr at src/inet/routing/bgpv4/BgpRouter.cc:1009) after the object it pointed to was freed by deleteLocRibEntry at line 964, so the comparison reads an indeterminate pointer value.

Impact: Undefined behavior per the C++ standard; in practice the check works correctly on all real compilers but may confuse sanitizers or future optimizers.

Pointer value used after delete without resetting to nullptr

At line 961, oldEntry is set to current (a valid pointer into bgpRoutingTable). At line 964, deleteLocRibEntry(oldEntry) is called, which at src/inet/routing/bgpv4/BgpRouter.cc:875 executes delete entry — freeing the object that oldEntry points to. After this, oldEntry is a dangling pointer.

At line 1009, oldEntry == nullptr is evaluated to decide between NEW_ROUTE_ADDED and ROUTE_DESTINATION_CHANGED. Since oldEntry was non-null before deletion and was never reassigned, the comparison yields the intended result on all practical implementations. However, reading the value of a pointer after the object it refers to has been deleted is undefined behavior (C++11 §3.7.4.2).

The fix is to capture the boolean before the delete:

bool hadPreviousRoute = (oldEntry != nullptr);

and use hadPreviousRoute at line 1009.

Prompt for agents
In BgpRouter::installBestRouteForPrefix (src/inet/routing/bgpv4/BgpRouter.cc), the variable oldEntry is used after its pointee is freed by deleteLocRibEntry(oldEntry) at line 964. At line 1009, the comparison oldEntry == nullptr reads a dangling pointer value (undefined behavior). The fix is to capture the boolean before the delete: add bool hadPreviousRoute = (current != nullptr); near line 961 (before deleteLocRibEntry is called), and then use hadPreviousRoute instead of oldEntry == nullptr at line 1009.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +129 to +130
if (startupTime == SIMTIME_ZERO)
startupTime = SimTime(1, SIMTIME_S); // TODO what value should go here?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Forced 1-second startup delay on restart overrides user configuration

In Bgp::startBgp() at src/inet/routing/bgpv4/Bgp.cc:129-130, when the router restarts (simTime() > 0) and the user-configured startupTime parameter is 0s, the code forces it to 1 second with a TODO comment. This means the startupTime parameter's documented default of 0s (from src/inet/routing/bgpv4/Bgp.ned:58) is silently overridden on every restart. While the comment acknowledges this is a workaround to avoid race conditions with interface bring-up, it creates non-obvious behavior: a router that started instantly at t=0 will take 1s to restart after a crash/shutdown, regardless of user intent. The value of 1s is arbitrary and may be too long or too short depending on the scenario.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

STAGE_LOCAL, // for changes that don't depend on other modules
STAGE_PHYSICAL_LAYER,
STAGE_LINK_LAYER,
STAGE_NETWORK_INTERFACE_CONFIGURATION,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 New STAGE_NETWORK_INTERFACE_CONFIGURATION may affect other modules' lifecycle ordering

The new STAGE_NETWORK_INTERFACE_CONFIGURATION stage inserted in ModuleOperations.h between STAGE_LINK_LAYER and STAGE_NETWORK_LAYER (for start) and between STAGE_NETWORK_LAYER and STAGE_LINK_LAYER (for stop) changes the numeric values of all subsequent stage enums. Any module that uses a hardcoded integer (rather than the enum name) for its lifecycle stage will now execute at the wrong time. The Ipv6RoutingTable is updated to use this new stage (src/inet/networklayer/ipv6/Ipv6RoutingTable.cc:1046), but other modules should be audited to ensure they still use the correct symbolic stage names and that the new ordering doesn't create unexpected dependencies.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant