
lib: mgmt: use SOMAXCONN for mgmtd socket listen backlog #21514

Open

reinaldosaraiva wants to merge 1 commit into FRRouting:master from reinaldosaraiva:upstream-submit/ub-11-mgmtd-listen-backlog

Conversation

@reinaldosaraiva

Summary

The mgmtd frontend and backend UNIX sockets pass a compile-time constant of 32 to listen(2) as the accept-queue backlog (MGMTD_MAX_CONN in lib/mgmt_msg.h). Under fan-in from multiple concurrent clients (vtysh sessions, test harnesses, external controllers) the kernel accept queue saturates and new connect(2) attempts fail with EAGAIN before the msg_server handler ever runs.

This PR aligns mgmtd with the convention already used elsewhere in FRR — bgpd/bgp_network.c, bfdd/dplane.c, and pimd/pim_msdp_socket.c all pass SOMAXCONN to listen() — so the backlog defers to the platform default (on Linux, net.core.somaxconn, typically 4096 on modern kernels). The kernel remains the final arbiter of the effective queue length; operators who need a lower cap can still set net.core.somaxconn.

No API change: MGMTD_MAX_CONN keeps its name. An accompanying comment clarifies that it is a listen(2) backlog, not a cap on concurrent sessions, a distinction the name otherwise obscures.

Reproduction

Stress test with ~1000 concurrent writer goroutines each opening its own msg_client connection to /var/run/frr/mgmtd_fe.sock and sending a small EDIT via the native frontend protocol. On an unpatched build:

| backlog | connect successes | dial failures |
| --- | --- | --- |
| 32 (before) | 246 / 1000 (24.6%) | 754 |
| 4096 / SOMAXCONN (after) | 1000 / 1000 (100%) | 0 |

Kernel: Linux 5.15, net.core.somaxconn=4096. Observable via ss -xlp on the socket path: LISTEN 0 32 before the patch, LISTEN 0 4096 after.

Related Issue

None filed; happy to open one if preferred.

Components

mgmtd, lib

@greptile-apps

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR replaces the hard-coded listen(2) backlog constant 32 (MGMTD_MAX_CONN) in the mgmtd frontend/backend UNIX sockets with SOMAXCONN, deferring to the platform default and aligning with the pattern already used by bgpd, bfdd, and pimd. A clarifying comment is added to prevent MGMTD_MAX_CONN from being misread as a concurrent-session cap. The change is minimal, targeted, and correct.

Confidence Score: 5/5

Safe to merge — single-line change with no logic risk, correct include chain, and consistent with existing FRR conventions.

No P0 or P1 findings. MGMTD_MAX_CONN is used in exactly one place as a listen(2) backlog; SOMAXCONN is always defined before that call site via zebra.h → sys/socket.h. The change is idiomatic within the FRR codebase (bgpd, bfdd, pimd already use SOMAXCONN) and the added comment removes the naming ambiguity.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| lib/mgmt_msg.h | Replaces #define MGMTD_MAX_CONN 32 with SOMAXCONN and adds a clarifying block comment; MGMTD_MAX_CONN is used in exactly one place (lib/mgmt_msg.c:876) as the listen(2) backlog, so no other callsites are affected. All translation units that include this header already pull in <sys/socket.h> (via <zebra.h>) before MGMTD_MAX_CONN is referenced, ensuring SOMAXCONN is defined at use-time. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["msg_server_init()"] --> B["socket(AF_UNIX)"]
    B --> C["bind(sopath)"]
    C --> D["listen(sock, MGMTD_MAX_CONN)"]
    D -->|"before: 32"| E["accept queue capped at 32\nconnect() → EAGAIN under load"]
    D -->|"after: SOMAXCONN"| F["accept queue defers to\nnet.core.somaxconn (e.g. 4096)"]
    F --> G["Client connects successfully\nunder high fan-in"]
```

Reviews (1): Last reviewed commit: "lib: mgmt: use SOMAXCONN for mgmtd socke..."

Member

@ton31337 left a comment


LGTM

@donaldsharp
Member

ci:rerun

@donaldsharp
Member

This PR looks like it got caught up in the build breakage from yesterday. I've initiated a rerun, but in the meantime a rebase + force push would work wonders too.

@reinaldosaraiva force-pushed the upstream-submit/ub-11-mgmtd-listen-backlog branch from 742d603 to 128460d on April 14, 2026 14:29
The mgmtd frontend and backend UNIX sockets pass a compile-time
constant of 32 to listen(2) as the accept-queue backlog. Under
fan-in from multiple concurrent clients (vtysh sessions, test
harnesses, external controllers) the kernel accept queue
saturates and new connect(2) attempts fail with EAGAIN before
the msg_server handler runs. This is observable as a hard
ceiling: at roughly 1000 concurrent writers against
mgmtd_fe.sock, ~75% of dial attempts fail even with multi-step
client-side retry, because the failure is a transport-layer
overflow the msg framing never sees.

Align mgmtd with the convention already used elsewhere in FRR --
bgpd, bfdd, and pimd all pass SOMAXCONN to listen() -- so the
backlog defers to the platform default (on Linux,
net.core.somaxconn, typically 4096). The kernel remains the
final arbiter of the effective queue length; operators who need
a lower cap can still set net.core.somaxconn. No API change;
MGMTD_MAX_CONN keeps its name and accompanying comment clarifies
that it is a listen backlog, not a cap on concurrent sessions.

Signed-off-by: Reinaldo Saraiva <[email protected]>
@reinaldosaraiva force-pushed the upstream-submit/ub-11-mgmtd-listen-backlog branch from 128460d to af7fd59 on April 14, 2026 23:27
@reinaldosaraiva
Author

Rebased onto upstream/master e66d35b2ed, no conflicts. Force-pushed as af7fd592ab. Thanks for the heads-up on the transient build breakage.


4 participants