-
Notifications
You must be signed in to change notification settings - Fork 1.5k
bgp_ls_ted: fix issue of withdrawing edges #21490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1155,6 +1155,25 @@ int bgp_ls_process_message(struct bgp *bgp, struct ls_message *msg) | |
| if (BGP_DEBUG(zebra, ZEBRA) || BGP_DEBUG(linkstate, LINKSTATE)) | ||
| zlog_debug("%s: Node vertex key=%" PRIu64, __func__, vertex->key); | ||
|
|
||
| if (msg->event == LS_MSG_EVENT_DELETE) { | ||
| /* | ||
| * Before destroying the vertex, withdraw all Link NLRIs | ||
| * that reference it (both outgoing and incoming edges). | ||
| * Once the vertex is gone, edge->source or edge->destination | ||
| * becomes NULL and we can no longer build the NLRI to withdraw. | ||
| */ | ||
| struct ls_edge *e; | ||
| struct listnode *lnode; | ||
| for (ALL_LIST_ELEMENTS_RO(vertex->outgoing_edges, lnode, e)) | ||
| bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE); | ||
| for (ALL_LIST_ELEMENTS_RO(vertex->incoming_edges, lnode, e)) | ||
| bgp_ls_process_edge(bgp, e, LS_MSG_EVENT_DELETE); | ||
| /* Also withdraw Prefix NLRIs attached to this vertex */ | ||
| struct ls_subnet *s; | ||
| for (ALL_LIST_ELEMENTS_RO(vertex->prefixes, lnode, s)) | ||
| bgp_ls_process_subnet(bgp, s, LS_MSG_EVENT_DELETE); | ||
| } | ||
|
|
||
| bgp_ls_process_vertex(bgp, vertex, msg->event); | ||
|
|
||
| if (msg->event == LS_MSG_EVENT_DELETE) | ||
|
|
@@ -1163,6 +1182,13 @@ int bgp_ls_process_message(struct bgp *bgp, struct ls_message *msg) | |
| break; | ||
|
|
||
| case LS_MSG_TYPE_ATTRIBUTES: | ||
| if (msg->event == LS_MSG_EVENT_UPDATE) { | ||
| struct ls_edge *old_edge; | ||
| old_edge = ls_find_edge_by_source(bgp->ls_info->ted, | ||
| msg->data.attr); | ||
| if (old_edge) | ||
| bgp_ls_process_edge(bgp, old_edge, LS_MSG_EVENT_DELETE); | ||
| } | ||
|
Comment on lines
+1185
to
+1191
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
A more robust approach is to look up the old edge using the existing key stored in TED before the attribute is updated, or to check whether Prompt To Fix With AIThis is a comment left during a code review.
Path: bgpd/bgp_ls_ted.c
Line: 1185-1191
Comment:
**Stale withdrawal misses when edge key changes on UPDATE**
`ls_find_edge_by_source` derives the lookup key from `msg->data.attr` (the *new* attributes). If the UPDATE changes the attribute that forms the edge key (e.g. the local interface address used by `get_edge_key`), the old edge in TED will have a different key and won't be found here — leaving the old BGP route unwithdrawn. `ls_msg2edge` will then insert a new edge under the new key, and the peer ends up holding both the stale old NLRI and the fresh new one.
A more robust approach is to look up the old edge using the existing key stored in TED *before* the attribute is updated, or to check whether `ls_msg2edge` returns a newly-inserted edge vs. an in-place update, and only send the pre-update withdraw when a key change is detected.
How can I resolve this? If you propose a fix, please make it concise.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this looks valid |
||
| edge = ls_msg2edge(bgp->ls_info->ted, msg, false); | ||
| if (!edge) { | ||
| zlog_err("%s: Failed to convert message to edge", __func__); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flog_erron partial-deletion of connected edgesAfter vertex A is deleted,
ls_vertex_delcallsls_disconnect(A, B→A, false), settingB→A->destination = NULLwhile leaving the edge in B'soutgoing_edgeslist. When vertex B is subsequently deleted, this new loop encountersB→Awithdestination == NULLandbgp_ls_process_edgefiresflog_err(EC_BGP_LS_PACKET, "BGP-LS: Edge missing source or destination vertex")— even though the withdrawal was already correctly sent during A'sincoming_edgesloop. In a large topology this generates one false error per affected edge per vertex-deletion sequence.Add a guard before each
bgp_ls_process_edgecall to skip edges that have already been partially disconnected:Prompt To Fix With AI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks valid