bgp_ls_ted: fix issue of withdrawing edges by a114j0y · Pull Request #21490 · FRRouting/frr

a114j0y · 2026-04-09T23:00:09Z

withdraw stale Edges before updating attr
withdraw related Links before deleting the Vertex

1. withdraw stale Edges before updating attr 2. withdraw related Links before deleting the Vertex Signed-off-by: Yijiao Qin <y.qin@alibaba-inc.com>

greptile-apps · 2026-04-09T23:05:52Z

Greptile Summary

This PR fixes two BGP-LS TED withdrawal gaps: it pre-withdraws outgoing/incoming edges and attached prefix NLRIs before a vertex is torn down (so source/destination pointers are still valid), and it pre-withdraws the old edge NLRI before an attribute UPDATE is applied in-place.

Spurious flog_err on vertex-deletion sequences: after vertex A is deleted, ls_vertex_del sets B→A->destination = NULL but leaves the edge in B's outgoing_edges. When B is later deleted the new loop calls bgp_ls_process_edge on that half-disconnected edge and logs a false EC_BGP_LS_PACKET error; adding if (e->source && e->destination) guards resolves this.
UPDATE with key change may leave stale NLRI: ls_find_edge_by_source is keyed on the new attributes; if the local-address component of the key changed, the old edge is not found and its BGP route is not withdrawn before the new one is advertised.

Confidence Score: 3/5

Safe to merge after the spurious-error and potential stale-NLRI concerns are addressed; the core withdrawal-before-delete logic is sound.

Two issues remain: a P1 spurious flog_err that fires on every half-disconnected edge during sequential vertex deletions (correct BGP behavior but noisy and misleading in production), and a P2 edge-key-change gap in the UPDATE withdraw path that can leave stale NLRIs on peers. The fundamental approach is correct and well-reasoned.

bgpd/bgp_ls_ted.c — specifically the outgoing/incoming edge loops inside the LS_MSG_EVENT_DELETE block and the ls_find_edge_by_source lookup in the LS_MSG_EVENT_UPDATE block.

Important Files Changed

Filename	Overview
bgpd/bgp_ls_ted.c	Adds pre-deletion BGP withdrawal of related edges/prefixes before vertex teardown, and a pre-update withdrawal of the old edge before attribute replacement; the vertex-deletion loops lack NULL guards that can trigger a spurious flog_err per half-disconnected edge, and the UPDATE-path key-change scenario may leave stale NLRIs in peers.

Sequence Diagram

sequenceDiagram
    participant Z as Zebra
    participant P as bgp_ls_process_message
    participant E as bgp_ls_process_edge
    participant V as bgp_ls_process_vertex
    participant T as ls_vertex_del_all

    Z->>P: NODE DELETE vertex A
    P->>P: ls_msg2vertex vertex A
    loop outgoing edges of A
        P->>E: bgp_ls_process_edge DELETE
        E-->>P: BGP withdraw sent
    end
    loop incoming edges of A
        P->>E: bgp_ls_process_edge DELETE
        E-->>P: BGP withdraw sent
    end
    P->>V: bgp_ls_process_vertex DELETE
    V-->>P: Node NLRI withdrawn
    P->>T: ls_vertex_del_all A
    T-->>P: A freed, peer edge dest set to NULL

    Z->>P: NODE DELETE vertex B
    loop outgoing edges of B
        P->>E: bgp_ls_process_edge DELETE
        E-->>P: flog_err dest is NULL spurious error
    end
    P->>T: ls_vertex_del_all B

    Z->>P: ATTR UPDATE
    P->>P: ls_find_edge_by_source with new attrs
    alt old edge found same key
        P->>E: bgp_ls_process_edge DELETE old edge
        E-->>P: stale NLRI withdrawn
    else key changed old edge not found
        P-->>P: stale NLRI remains on peer
    end
    P->>P: ls_msg2edge update attrs in place
    P->>E: bgp_ls_process_edge UPDATE
    E-->>P: new NLRI advertised

Prompt To Fix All With AI

This is a comment left during a code review.
Path: bgpd/bgp_ls_ted.c
Line: 1167-1170

Comment:
**Spurious `flog_err` on partial-deletion of connected edges**

After vertex A is deleted, `ls_vertex_del` calls `ls_disconnect(A, B→A, false)`, setting `B→A->destination = NULL` while leaving the edge in B's `outgoing_edges` list. When vertex B is subsequently deleted, this new loop encounters `B→A` with `destination == NULL` and `bgp_ls_process_edge` fires `flog_err(EC_BGP_LS_PACKET, "BGP-LS: Edge missing source or destination vertex")` — even though the withdrawal was already correctly sent during A's `incoming_edges` loop. In a large topology this generates one false error per affected edge per vertex-deletion sequence.

Add a guard before each `bgp_ls_process_edge` call to skip edges that have already been partially disconnected:

```suggestion
			for (ALL_LIST_ELEMENTS_RO(vertex->outgoing_edges, lnode, e)) {
				if (e->source && e->destination)
					bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);
			}
			for (ALL_LIST_ELEMENTS_RO(vertex->incoming_edges, lnode, e)) {
				if (e->source && e->destination)
					bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);
			}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: bgpd/bgp_ls_ted.c
Line: 1185-1191

Comment:
**Stale withdrawal misses when edge key changes on UPDATE**

`ls_find_edge_by_source` derives the lookup key from `msg->data.attr` (the *new* attributes). If the UPDATE changes the attribute that forms the edge key (e.g. the local interface address used by `get_edge_key`), the old edge in TED will have a different key and won't be found here — leaving the old BGP route unwithdrawn. `ls_msg2edge` will then insert a new edge under the new key, and the peer ends up holding both the stale old NLRI and the fresh new one.

A more robust approach is to look up the old edge using the existing key stored in TED *before* the attribute is updated, or to check whether `ls_msg2edge` returns a newly-inserted edge vs. an in-place update, and only send the pre-update withdraw when a key change is detected.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "bgp_ls_ted: fix issue of withdrawing edg..." | Re-trigger Greptile}

greptile-apps · 2026-04-09T23:05:56Z

bgpd/bgp_ls_ted.c

+			for (ALL_LIST_ELEMENTS_RO(vertex->outgoing_edges, lnode, e))
+				bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);
+			for (ALL_LIST_ELEMENTS_RO(vertex->incoming_edges, lnode, e))
+				bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);


Spurious flog_err on partial-deletion of connected edges

After vertex A is deleted, ls_vertex_del calls ls_disconnect(A, B→A, false), setting B→A->destination = NULL while leaving the edge in B's outgoing_edges list. When vertex B is subsequently deleted, this new loop encounters B→A with destination == NULL and bgp_ls_process_edge fires flog_err(EC_BGP_LS_PACKET, "BGP-LS: Edge missing source or destination vertex") — even though the withdrawal was already correctly sent during A's incoming_edges loop. In a large topology this generates one false error per affected edge per vertex-deletion sequence.

Add a guard before each bgp_ls_process_edge call to skip edges that have already been partially disconnected:

Suggested change

for (ALL_LIST_ELEMENTS_RO(vertex->outgoing_edges, lnode, e))

bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);

for (ALL_LIST_ELEMENTS_RO(vertex->incoming_edges, lnode, e))

bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);

for (ALL_LIST_ELEMENTS_RO(vertex->outgoing_edges, lnode, e)) {

if (e->source && e->destination)

bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);

}

for (ALL_LIST_ELEMENTS_RO(vertex->incoming_edges, lnode, e)) {

if (e->source && e->destination)

bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE);

}

Prompt To Fix With AI

This is a comment left during a code review. Path: bgpd/bgp_ls_ted.c Line: 1167-1170 Comment: **Spurious `flog_err` on partial-deletion of connected edges** After vertex A is deleted, `ls_vertex_del` calls `ls_disconnect(A, B→A, false)`, setting `B→A->destination = NULL` while leaving the edge in B's `outgoing_edges` list. When vertex B is subsequently deleted, this new loop encounters `B→A` with `destination == NULL` and `bgp_ls_process_edge` fires `flog_err(EC_BGP_LS_PACKET, "BGP-LS: Edge missing source or destination vertex")` — even though the withdrawal was already correctly sent during A's `incoming_edges` loop. In a large topology this generates one false error per affected edge per vertex-deletion sequence. Add a guard before each `bgp_ls_process_edge` call to skip edges that have already been partially disconnected: ```suggestion for (ALL_LIST_ELEMENTS_RO(vertex->outgoing_edges, lnode, e)) { if (e->source && e->destination) bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE); } for (ALL_LIST_ELEMENTS_RO(vertex->incoming_edges, lnode, e)) { if (e->source && e->destination) bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE); } ``` How can I resolve this? If you propose a fix, please make it concise.

this looks valid

greptile-apps · 2026-04-09T23:05:57Z

bgpd/bgp_ls_ted.c

+		if (msg->event == LS_MSG_EVENT_UPDATE) {
+			struct ls_edge *old_edge;
+			old_edge = ls_find_edge_by_source(bgp->ls_info->ted,
+							  msg->data.attr);
+			if (old_edge)
+				bgp_ls_process_edge(bgp, old_edge, LS_MSG_EVENT_DELETE);
+		}


Stale withdrawal misses when edge key changes on UPDATE

ls_find_edge_by_source derives the lookup key from msg->data.attr (the new attributes). If the UPDATE changes the attribute that forms the edge key (e.g. the local interface address used by get_edge_key), the old edge in TED will have a different key and won't be found here — leaving the old BGP route unwithdrawn. ls_msg2edge will then insert a new edge under the new key, and the peer ends up holding both the stale old NLRI and the fresh new one.

A more robust approach is to look up the old edge using the existing key stored in TED before the attribute is updated, or to check whether ls_msg2edge returns a newly-inserted edge vs. an in-place update, and only send the pre-update withdraw when a key change is detected.

Prompt To Fix With AI

This is a comment left during a code review. Path: bgpd/bgp_ls_ted.c Line: 1185-1191 Comment: **Stale withdrawal misses when edge key changes on UPDATE** `ls_find_edge_by_source` derives the lookup key from `msg->data.attr` (the *new* attributes). If the UPDATE changes the attribute that forms the edge key (e.g. the local interface address used by `get_edge_key`), the old edge in TED will have a different key and won't be found here — leaving the old BGP route unwithdrawn. `ls_msg2edge` will then insert a new edge under the new key, and the peer ends up holding both the stale old NLRI and the fresh new one. A more robust approach is to look up the old edge using the existing key stored in TED *before* the attribute is updated, or to check whether `ls_msg2edge` returns a newly-inserted edge vs. an in-place update, and only send the pre-update withdraw when a key change is detected. How can I resolve this? If you propose a fix, please make it concise.

this looks valid

donaldsharp · 2026-04-14T15:27:28Z

@cscarpitta can you look at this one.

bgp_ls_ted: fix issue of withdrawing edges

3dd9ab4

1. withdraw stale Edges before updating attr 2. withdraw related Links before deleting the Vertex Signed-off-by: Yijiao Qin <y.qin@alibaba-inc.com>

frrbot bot added the bugfix label Apr 9, 2026

github-actions bot added size/S master labels Apr 9, 2026

greptile-apps bot reviewed Apr 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bgp_ls_ted: fix issue of withdrawing edges#21490

bgp_ls_ted: fix issue of withdrawing edges#21490
a114j0y wants to merge 1 commit intoFRRouting:masterfrom
a114j0y:master

a114j0y commented Apr 9, 2026

Uh oh!

greptile-apps bot commented Apr 9, 2026

Uh oh!

greptile-apps bot Apr 9, 2026

Uh oh!

riw777 Apr 14, 2026

Uh oh!

greptile-apps bot Apr 9, 2026

Uh oh!

riw777 Apr 14, 2026

Uh oh!

donaldsharp commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

a114j0y commented Apr 9, 2026

Uh oh!

greptile-apps bot commented Apr 9, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

riw777 Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

riw777 Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

donaldsharp commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants