Skip to content

feat(tracing): enrich OTel resource, compress OTLP, annotate spans#5502

Open
martinconic wants to merge 5 commits into
feat/opentelemetry-migrationfrom
feat/otel-resource-enrichment
Open

feat(tracing): enrich OTel resource, compress OTLP, annotate spans#5502
martinconic wants to merge 5 commits into
feat/opentelemetry-migrationfrom
feat/otel-resource-enrichment

Conversation

@martinconic

@martinconic martinconic commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Checklist

  • I have read the coding guide.
  • My change requires a documentation update, and I have done it.
  • I have added tests to cover my changes.
  • I have filled out the description and linked the related issues.

Description

Follow-up to the OpenTelemetry migration (#5456) that addresses the reviewers' feedback. It targets feat/opentelemetry-migration rather than master, so the migration PR stays focused on the OpenTracing→OTel swap while the review changes are collected and reviewed here. Once merged, these commits become part of #5456.

Without these changes, every node's spans look identical in an OTLP backend and a request cannot be followed end to end. This PR makes spans attributable per node, network and release, trims export bandwidth, surfaces tracing problems, and lets a value such as the postage batch id follow an upload across hops.

Resource attributesservice.version, deployment.environment (derived from the network id), host.name, and service.instance.id (the node's overlay address). Env-based options (WithFromEnv/WithTelemetrySDK/WithHost) are applied before the configured attributes, so configured values win while in NewBee was moved below overlay finalization so the (immutable) resource can carry the final overlay as the instance id.

Exporter & processor — gzip compression on the HTTP and gRPC OTLP clients; the batch span processor keeps SDK defaults and honors the standard OTEL_BSP_* env vars (documented, no new flags).

Observability of tracing itself — a confirmation log line once tracing is wired up (endpoint, protocol, sampling ratio), and OTLP exporter errors (e.g. an unreachable collector) are routed through the OTel global error handler to the node logger instead of being silently dropped.

Span attributes & new spans — chunk address on the netstore get/put spans and peer_address on the pingpong spans; new spans on the previously untraced subsystems: postage-batch-create (with batch_id), kademlia-connect (with peer_address), and salud-round (parenting the per-peer status snapshots).

Baggage propagation — W3C baggage now propagates across HTTP (composite TraceContext + Baggage propagator) and p2p (a separate additive tracing-baggage header; the existing span-context carrier is unchanged and peers that don't understand the header simply ignore it). The API attaches the batch_id as baggage when stamping a chunk, so it follows the chunk into the direct-upload/pushsync path.

Flag rename (breaking) — the tracing flags drop the otlp infix: tracing-otlp-endpoint/-insecure/-ca-file/-protocoltracing-endpoint/-insecure/-ca-file/-protocol (and the nested tracing.otlp-* config keys likewise). tracing-endpoint is reused as the active OTLP collector endpoint, so it is removed from the deprecated no-op keys (tracing-host/tracing-port remain deprecated).

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Addresses both review passes from @darkobas2 on #5456 (resource enrichment / exporter hardening / span coverage / baggage, and the flag rename).

Related Issue (Optional)

Stacked on #5456.

Screenshots (if appropriate):

AI Disclosure

  • This PR contains code that has been generated by an LLM.
  • I have reviewed the AI generated code thoroughly.
  • I possess the technical expertise to responsibly review the code generated in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant