Pmsi tunnel v6 length#21488
Conversation
Greptile SummaryThis PR fixes the encoding and decoding of the PMSI Tunnel BGP attribute (RFC 6514) for IPv6-only EVPN underlay networks. Previously, the parser rejected PMSI attributes with a 21-byte length (the correct size for an IPv6 tunnel identifier) and the encoder always wrote a 4-byte IPv4 tunnel endpoint with length 9. The fix adds named length constants, updates the parser to accept both length 9 and 21 for Confidence Score: 5/5Safe to merge — fix is minimal and correct, with a dedicated topotest validating the end-to-end behaviour. Both the parser and encoder changes are mathematically sound (V4=9, V6=21), the encoding condition correctly detects IPv6-only nexthops, and the new test exercises the full EVPN type-3 exchange over IPv6. No regressions are expected for the existing IPv4 path. All remaining concerns are P2 or lower. No files require special attention.
|
| Filename | Overview |
|---|---|
| bgpd/bgp_attr.c | Parser now accepts PMSI length 21 (IPv6) in addition to 9 (IPv4); encoder correctly branches on IPv6 nexthop to write 16-byte tunnel ID with length 21. |
| bgpd/bgp_attr.h | Adds named constants BGP_ATTR_PMSI_TUNNEL_V4_LENGTH (9) and BGP_ATTR_PMSI_TUNNEL_V6_LENGTH (21), replacing previous magic number 9. |
| tests/topotests/bgp_evpn_v6_pmsi_tunnel/test_bgp_evpn_v6_pmsi_tunnel.py | New topotest covering BGP session establishment and EVPN type-3 IMET route exchange with IPv6-only VTEP peering; validates that PMSI attributes with 21-byte length are correctly parsed. |
| tests/topotests/bgp_evpn_v6_pmsi_tunnel/r1/frr.conf | R1 router config: IPv6-only peering to r2 (fd00:100::2) with l2vpn evpn and advertise-all-vni. |
| tests/topotests/bgp_evpn_v6_pmsi_tunnel/r1/evpn_type3.json | Expected JSON for r1: VNI 101 with one remote VTEP at fd00:100::2. |
| tests/topotests/bgp_evpn_v6_pmsi_tunnel/r2/frr.conf | R2 router config: IPv6-only peering to r1 (fd00:100::1) with l2vpn evpn and advertise-all-vni. |
| tests/topotests/bgp_evpn_v6_pmsi_tunnel/r2/evpn_type3.json | Expected JSON for r2: VNI 101 with one remote VTEP at fd00:100::1. |
Sequence Diagram
sequenceDiagram
participant R1 as R1 (fd00:100::1)
participant R2 as R2 (fd00:100::2)
R1->>R2: BGP UPDATE (EVPN Type-3 IMET) PMSI len=21 tunnel-id=fd00:100::1
R2->>R1: BGP UPDATE (EVPN Type-3 IMET) PMSI len=21 tunnel-id=fd00:100::2
Note over R1: bgp_attr_pmsi_tunnel() length==21 now accepted skips 16-byte IPv6 ID
Note over R2: bgp_attr_pmsi_tunnel() length==21 now accepted skips 16-byte IPv6 ID
R1->>R1: Installs remote VTEP fd00:100::2
R2->>R2: Installs remote VTEP fd00:100::1
Reviews (2): Last reviewed commit: "tests: Add a v6 pmsi tunnel only test." | Re-trigger Greptile
| if (attr->nexthop.s_addr == INADDR_ANY && | ||
| attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL) { |
There was a problem hiding this comment.
Missing
BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL case in encoding condition
When attr->nexthop.s_addr == INADDR_ANY and attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL (32), the condition evaluates to false and the code falls through to the IPv4 branch, writing 0.0.0.0 as the tunnel endpoint with length 9. The same file at line 5912–5913 uses an || to handle both nexthop-length variants for IPv6. The fix should mirror that pattern:
| if (attr->nexthop.s_addr == INADDR_ANY && | |
| attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL) { | |
| if (attr->nexthop.s_addr == INADDR_ANY && | |
| (attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL || | |
| attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL)) { |
Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_attr.c
Line: 5658-5659
Comment:
**Missing `BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL` case in encoding condition**
When `attr->nexthop.s_addr == INADDR_ANY` and `attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL` (32), the condition evaluates to false and the code falls through to the IPv4 branch, writing `0.0.0.0` as the tunnel endpoint with length 9. The same file at line 5912–5913 uses an `||` to handle both nexthop-length variants for IPv6. The fix should mirror that pattern:
```suggestion
if (attr->nexthop.s_addr == INADDR_ANY &&
(attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL ||
attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL)) {
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
somthing like this might also work (in combination with the parsing change)
if (CHECK_FLAG(attr->flag, ATTR_FLAG_BIT(BGP_ATTR_PMSI_TUNNEL))) {
uint8_t tunn_id_len;
uint8_t *nh;
struct in_addr tunn_id;
stream_putc(s, BGP_ATTR_FLAG_OPTIONAL | BGP_ATTR_FLAG_TRANS);
stream_putc(s, BGP_ATTR_PMSI_TUNNEL);
if (IS_MAPPED_IPV6(&attr->tunn_id)) {
stream_putc(s, 9); // Length
tunn_id_len = IPV4_MAX_BYTELEN;
ipv4_mapped_ipv6_to_ipv4(&attr->tunn_id, &tunn_id);
nh = (uint8_t*)&tunn_id;
} else {
stream_putc(s, 21); // Length
tunn_id_len = IPV6_MAX_BYTELEN;
nh = (uint8_t*) &attr->tunn_id;
}
stream_putc(s, 0); // Flags
stream_putc(s, bgp_attr_get_pmsi_tnl_type(attr));
stream_put(s, &(attr->label),
BGP_LABEL_BYTES); // MPLS Label / VXLAN VNI
// Unicast tunnel endpoint IP address
stream_put(s, nh, tunn_id_len);
}
There was a problem hiding this comment.
yep makes sense fixed.
| stream_get(&attr->label, connection->curr, BGP_LABEL_BYTES); | ||
|
|
||
| /* Forward read pointer of input stream. */ | ||
| stream_forward_getp(connection->curr, length - attr_parse_len); |
There was a problem hiding this comment.
isn't this a bug - we've been advancing the read pointer as we parsed the stream?
There was a problem hiding this comment.
I read this as forwarding the stream past the v4 or v6 address. I did not understand why it was not storing the value.
There was a problem hiding this comment.
Mark and I have discussed this in slack. Currently it is our understanding that EVPN does not use the v4 or v6 address. the BGP rfc's have reused the PMSI encoding that was created for MVPN. FRR does not support it yet so just ignoring the value is ok at the moment from my perspective. There is nothing to do with it at this point in time.
|
|
||
| bgp_attr_set(attr, BGP_ATTR_PMSI_TUNNEL); | ||
| bgp_attr_set_pmsi_tnl_type(attr, tnl_type); | ||
| stream_get(&attr->label, connection->curr, BGP_LABEL_BYTES); |
There was a problem hiding this comment.
we need to read the address? like
if (length == 9) {
struct in_addr tunn_id;
tunn_id.s_addr = stream_get_ipv4(peer->curr);
ipv4_to_ipv4_mapped_ipv6(&attr->tunn_id, tunn_id);
attr_parse_len += IPV4_MAX_BYTELEN;
} else {
stream_get(&attr->tunn_id, peer->curr, IPV6_MAX_BYTELEN);
attr_parse_len += IPV6_MAX_BYTELEN;
}
There was a problem hiding this comment.
It is my understanding that the tunnel address presented is meaningless in EVPN encodings. At this point in time FRR does not support other encoding usages of PMSI. When it does it can and will need to grab this data.
| if (attr->nexthop.s_addr == INADDR_ANY && | ||
| attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL) { |
There was a problem hiding this comment.
somthing like this might also work (in combination with the parsing change)
if (CHECK_FLAG(attr->flag, ATTR_FLAG_BIT(BGP_ATTR_PMSI_TUNNEL))) {
uint8_t tunn_id_len;
uint8_t *nh;
struct in_addr tunn_id;
stream_putc(s, BGP_ATTR_FLAG_OPTIONAL | BGP_ATTR_FLAG_TRANS);
stream_putc(s, BGP_ATTR_PMSI_TUNNEL);
if (IS_MAPPED_IPV6(&attr->tunn_id)) {
stream_putc(s, 9); // Length
tunn_id_len = IPV4_MAX_BYTELEN;
ipv4_mapped_ipv6_to_ipv4(&attr->tunn_id, &tunn_id);
nh = (uint8_t*)&tunn_id;
} else {
stream_putc(s, 21); // Length
tunn_id_len = IPV6_MAX_BYTELEN;
nh = (uint8_t*) &attr->tunn_id;
}
stream_putc(s, 0); // Flags
stream_putc(s, bgp_attr_get_pmsi_tnl_type(attr));
stream_put(s, &(attr->label),
BGP_LABEL_BYTES); // MPLS Label / VXLAN VNI
// Unicast tunnel endpoint IP address
stream_put(s, nh, tunn_id_len);
}
Current code is only writing v4 addresses as a tunnel length, additionally the decode would only handle v4 addresses for the decode. Modify the code to write a v6 address when the attribute has a v6 global nexthop to use, else write a v4 address. Modify the decode to read either v4 or v6 address lengths as well. Testing is done already implicitly in the current set of tests. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We need to ensure that older versions of FRR continue to work, pre commit c732e0a. So change the encode function to prefer a v4 address encode if the attribute has both a v4 and a v6 address available. If only v6, then encode v6. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a v6 only test that shows that v6 encoding actually works. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
9838060 to
6bb1602
Compare
|
@greptile review |
| stream_putc(s, BGP_ATTR_FLAG_OPTIONAL | BGP_ATTR_FLAG_TRANS); | ||
| stream_putc(s, BGP_ATTR_PMSI_TUNNEL); | ||
| if (attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL) { | ||
| if (attr->nexthop.s_addr == INADDR_ANY && |
There was a problem hiding this comment.
Shouldn't we check for the attribute flag (bgp_attr_exists()) instead?
There was a problem hiding this comment.
so from a quick scan of the source code the attr->mp_nexthop_len is just checked like I do here, the same with attr->nexthop. So I was just following coding conventions. If you think we should switch to a different way of doing it, that is fine but I think it belongs in it's own cleanup work since we would need to touch a bunch of different spots.
|
|
||
| #define BGP_ATTR_PMSI_TUNNEL_V4_LENGTH 9 | ||
| #define BGP_ATTR_PMSI_TUNNEL_V6_LENGTH 21 | ||
|
|
There was a problem hiding this comment.
These lengths look specific to PMSI_TNLTYPE_INGR_REPL. Are they valid for other PMSI tunnel types too? If not, should the constant names include INGR_REPL to make that scope explicit?
There was a problem hiding this comment.
From some very superficial reading of the RFC's associated here (rfc6514 and rfc7988) I believe that it applies to a bunch of pmsi tunnel types but since we do not have those programmed yet I did not do anything other than make the generic tunnel lengths. Hopefully someone will program the other types in the near future
|
Problem reporter has tried this fix and it works for them now. |
| "ip link set vxlan-101 up type bridge_slave learning off flood off mcast_flood off", | ||
| ] | ||
|
|
||
| for rname, vtep_ip in [("r1", "fd00:100::1"), ("r2", "fd00:100::2")]: |
There was a problem hiding this comment.
there are already about 5-10 different topotests that show v4 behavior, I don't think I need to write a v4 specific test. I just wanted to ahve a new test that had absolutely no v4 addresses at all.
|
@donaldsharp What happened with this one? I was hoping to have it in 10.6.1 to finally close that particular chapter in the EVPN saga. |
|
Was fixed in a different PR because there was a decision to approach the problem from a slightly different way. See #21507 |
Thanks a million. I'll have to wait for the next release, I guess. |
Fixes v6 decoding of a pmsi tunnel using a pure v6 address.