feat: Add Phase 1 process-wide memory limiter #2542
jmacd merged 22 commits into open-telemetry:main from
Codecov Report
❌ Patch coverage is
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main    #2542      +/-   ##
==========================================
- Coverage   88.46%   88.41%   -0.05%
==========================================
  Files         618      620       +2
  Lines      226148   228394    +2246
==========================================
+ Hits       200066   201943    +1877
- Misses      25558    25927     +369
  Partials      524      524
```
|
jmacd
left a comment
Looks good to me.
I sort of imagine a configurable "stall" in the receivers when the system is at the soft limit. It could be 0 by default, but adding a small delay to inputs would allow memory to fall and would be complementary with per-tenant or per-pipeline memory admission limits.
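As a rough illustration of the stall idea, here is a minimal sketch. `StallConfig`, `soft_stall`, and `ingress_stall` are hypothetical names that do not exist in this PR, and the enum only mirrors the three levels named in the PR description.

```rust
use std::time::Duration;

/// Pressure levels named in this PR (redeclared here so the sketch stands alone).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryPressureLevel {
    Normal,
    Soft,
    Hard,
}

/// Hypothetical receiver setting: how long to delay each input under `Soft`.
#[derive(Debug, Clone, Copy)]
pub struct StallConfig {
    /// Zero disables the stall, matching the suggested default.
    pub soft_stall: Duration,
}

impl Default for StallConfig {
    fn default() -> Self {
        Self { soft_stall: Duration::ZERO }
    }
}

/// Returns the delay a receiver would apply before accepting the next input.
/// `Hard` is left to the existing rejection path, so it adds no stall here.
pub fn ingress_stall(level: MemoryPressureLevel, cfg: StallConfig) -> Duration {
    match level {
        MemoryPressureLevel::Soft => cfg.soft_stall,
        MemoryPressureLevel::Normal | MemoryPressureLevel::Hard => Duration::ZERO,
    }
}
```

With the default configuration the function always returns a zero duration, so behavior would be unchanged unless an operator opts in.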
|
@lalitb In my opinion, this is moving in the right direction overall. However, my main concern is that the current enforcement path introduces process-wide shared state that is consulted directly from ingress hot paths. That adds hidden cross-thread coordination, which is not fully aligned with our thread-per-core, share-nothing, NUMA-aware direction.
I think a small architectural adjustment would make this fit much better: keep the global sampler and pressure classification, but propagate state transitions through the control plane and let each pinned receiver thread maintain its own local admission state. That would preserve the same high-level behavior while keeping the fast path local and more predictable from a cache and NUMA perspective.
Longer term, I think we should move toward hierarchical memory budgeting with local fast-path admission: a process-wide budget, then per-NUMA or per-core budgets or leases underneath it, with the current process-wide limiter acting more as a supervisory guardrail than the primary admission mechanism.
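To make the longer-term lease idea concrete, the following is a minimal sketch under stated assumptions. `ProcessBudget` and `CoreLease` are hypothetical types, not part of this PR, and the sketch omits returning bytes to the budget when data leaves the pipeline.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical process-wide budget in bytes; touched only when a lease refills.
pub struct ProcessBudget {
    remaining: AtomicU64,
}

impl ProcessBudget {
    pub fn new(bytes: u64) -> Self {
        Self { remaining: AtomicU64::new(bytes) }
    }

    /// Tries to carve up to `want` bytes out of the shared budget.
    /// Returns the number of bytes actually granted (possibly 0).
    pub fn acquire(&self, want: u64) -> u64 {
        let mut cur = self.remaining.load(Ordering::Relaxed);
        loop {
            let grant = want.min(cur);
            if grant == 0 {
                return 0;
            }
            match self.remaining.compare_exchange_weak(
                cur,
                cur - grant,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return grant,
                Err(actual) => cur = actual,
            }
        }
    }
}

/// Hypothetical per-core lease: owned by one pinned thread, so the
/// common admission path uses plain integers and no shared state.
pub struct CoreLease {
    available: u64,
    chunk: u64,
}

impl CoreLease {
    pub fn new(chunk: u64) -> Self {
        Self { available: 0, chunk }
    }

    /// Admits `bytes` locally, refilling from the process budget only
    /// when the lease cannot cover the request.
    pub fn try_admit(&mut self, bytes: u64, budget: &ProcessBudget) -> bool {
        if self.available < bytes {
            let want = self.chunk.max(bytes - self.available);
            self.available += budget.acquire(want);
        }
        if self.available >= bytes {
            self.available -= bytes;
            true
        } else {
            false
        }
    }
}
```

The design choice being illustrated is that the shared atomic is only contended on refill, so a larger `chunk` trades budget precision for fewer cross-core operations.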
|
The sampler could emit a `MemoryPressureChanged` update:
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MemoryPressureChanged {
    /// Monotonic update number assigned by the global sampler.
    pub generation: u64,
    /// Newly classified pressure level.
    pub level: MemoryPressureLevel,
    /// Receiver-facing retry hint to use while shedding ingress.
    pub retry_after_secs: u32,
    /// Most recent sampled process memory usage in bytes.
    pub usage_bytes: u64,
}
```
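A receiver-local consumer of the `MemoryPressureChanged` struct above might look like the following sketch. `LocalAdmissionState`, `apply`, and `admit` are hypothetical names, the enum is redeclared so the example compiles on its own, and the sketch assumes the sampler's generation numbers start at 1 so that 0 can mean "no update seen yet".

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryPressureLevel {
    Normal,
    Soft,
    Hard,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MemoryPressureChanged {
    pub generation: u64,
    pub level: MemoryPressureLevel,
    pub retry_after_secs: u32,
    pub usage_bytes: u64,
}

/// Hypothetical receiver-local state, owned by a single receiver thread.
#[derive(Debug)]
pub struct LocalAdmissionState {
    generation: u64,
    level: MemoryPressureLevel,
    retry_after_secs: u32,
}

impl Default for LocalAdmissionState {
    fn default() -> Self {
        Self { generation: 0, level: MemoryPressureLevel::Normal, retry_after_secs: 0 }
    }
}

impl LocalAdmissionState {
    /// Applies an update unless it is stale or a duplicate.
    /// Returns `true` when the local state changed generation.
    pub fn apply(&mut self, update: MemoryPressureChanged) -> bool {
        if update.generation <= self.generation {
            return false;
        }
        self.generation = update.generation;
        self.level = update.level;
        self.retry_after_secs = update.retry_after_secs;
        true
    }

    /// Hot-path check: reads only local fields.
    /// `Err` carries the retry hint to return to the client while shedding.
    pub fn admit(&self) -> Result<(), u32> {
        match self.level {
            MemoryPressureLevel::Hard => Err(self.retry_after_secs),
            MemoryPressureLevel::Normal | MemoryPressureLevel::Soft => Ok(()),
        }
    }
}
```

The generation check lets a receiver ignore reordered or repeated control messages without consulting any shared state.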
|
I support @lquerel's proposal. |
|
@lquerel Thanks for the review. I agree with this direction. I will rework the enforcement path so the sampler and classification remain process-wide, but pressure transitions are propagated through the pipeline control plane and each receiver maintains receiver-local admission state. The intended shape is:
That keeps the Phase 1 behavior the same while removing direct process-wide state reads from ingress hot paths and fitting the pinned/share-nothing/NUMA-aware direction better. |
lquerel
left a comment
@lalitb Could you also update the README.md in the config crate and https://github.com/open-telemetry/otel-arrow/blob/main/rust/otap-dataflow/docs/configuration-model.md to describe this new policy?
|
@lalitb please update the design document and PR description with the changes involving NodeControlMessage and local state management. |
43638e2
Summary
This PR adds a process-wide memory limiter to the Rust collector.
The limiter samples process memory on a fixed interval, classifies pressure as
`Normal`, `Soft`, or `Hard`, and exposes that state through metrics and logs.
In `enforce` mode, receivers reject new ingress only at `Hard`. In `observe_only`
mode, the limiter remains telemetry-only.
This complements the collector's existing bounded-buffer backpressure. It does
not add per-group budgets, byte accounting, or strict process-memory caps.
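The classification and mode semantics described above can be sketched as follows. This is an illustration only: `classify`, `should_reject_ingress`, `LimiterMode`, and the `soft_limit`/`hard_limit` parameters are hypothetical names, and the simple threshold comparison is an assumption (the real limiter may apply hysteresis or other rules).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryPressureLevel {
    Normal,
    Soft,
    Hard,
}

/// Hypothetical representation of the two modes named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LimiterMode {
    Enforce,
    ObserveOnly,
}

/// Classifies a memory sample against two thresholds (assumed semantics:
/// at or above a limit counts as reaching that level).
pub fn classify(usage_bytes: u64, soft_limit: u64, hard_limit: u64) -> MemoryPressureLevel {
    if usage_bytes >= hard_limit {
        MemoryPressureLevel::Hard
    } else if usage_bytes >= soft_limit {
        MemoryPressureLevel::Soft
    } else {
        MemoryPressureLevel::Normal
    }
}

/// Ingress is rejected only at `Hard`, and only in `Enforce` mode.
/// `Soft` stays informational and `ObserveOnly` never rejects.
pub fn should_reject_ingress(mode: LimiterMode, level: MemoryPressureLevel) -> bool {
    matches!((mode, level), (LimiterMode::Enforce, MemoryPressureLevel::Hard))
}
```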
The current implementation keeps sampling and pressure classification
process-wide, but propagates pressure transitions through the pipeline control
plane and lets each receiver maintain receiver-local admission state. This keeps
the enforcement hot path local while preserving the same Phase 1 behavior.
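The propagation pattern in this paragraph can be sketched with standard-library channels standing in for the pipeline control plane. `PressurePublisher` and `drain_latest` are hypothetical names, and `std::sync::mpsc` is a stand-in chosen to keep the example self-contained, not the mechanism this PR uses.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryPressureLevel {
    Normal,
    Soft,
    Hard,
}

/// Hypothetical process-wide publisher: fans out level transitions only,
/// so steady-state samples generate no control traffic.
pub struct PressurePublisher {
    last: MemoryPressureLevel,
    subscribers: Vec<Sender<MemoryPressureLevel>>,
}

impl PressurePublisher {
    pub fn new() -> Self {
        Self { last: MemoryPressureLevel::Normal, subscribers: Vec::new() }
    }

    /// Registers one pipeline receiver and returns its control channel end.
    pub fn subscribe(&mut self) -> Receiver<MemoryPressureLevel> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Called after each sample; returns `true` if a transition was sent.
    pub fn publish(&mut self, level: MemoryPressureLevel) -> bool {
        if level == self.last {
            return false;
        }
        self.last = level;
        // Drop subscribers whose channel end has been closed.
        self.subscribers.retain(|tx| tx.send(level).is_ok());
        true
    }
}

/// Receiver side: drain pending control messages without blocking and
/// keep the most recent level in receiver-local state.
pub fn drain_latest(
    rx: &Receiver<MemoryPressureLevel>,
    current: MemoryPressureLevel,
) -> MemoryPressureLevel {
    rx.try_iter().last().unwrap_or(current)
}
```

In this shape the admission decision itself reads only the locally held level, and the channel is consulted only when the receiver processes control messages.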
Basic Functionality
- `Normal`, `Soft`, or `Hard`
- `Soft` informational in Phase 1
- `Hard` in `enforce` mode
- `/readyz` under `Hard` pressure in `enforce` mode
- `observe_only` mode for metrics and logs without enforcement
- `Hard` when supported by the build
How to Review
Start with docs/memory-limiter-phase1.md for scope, semantics, and operator-facing behavior.
Then review the core limiter and controller wiring:
Then review receiver enforcement paths: