From 08e31a16766f48c67df15234f7ab9e3178a69555 Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 28 Jun 2026 20:35:27 +0000 Subject: [PATCH] Refine OpenTelemetry v2 doc: lead with Quickstart config and env vars --- docs/observability/opentelemetry_v2.md | 181 +++++++++++-------------- 1 file changed, 77 insertions(+), 104 deletions(-) diff --git a/docs/observability/opentelemetry_v2.md b/docs/observability/opentelemetry_v2.md index 735a2c7d..986bf862 100644 --- a/docs/observability/opentelemetry_v2.md +++ b/docs/observability/opentelemetry_v2.md @@ -3,54 +3,60 @@ import TabItem from '@theme/TabItem'; # OpenTelemetry v2 - Full-request tracing -OpenTelemetry v2 (OTel v2) is LiteLLM Proxy's next-generation tracing. It gives you **one clean trace per request** that shows the whole story of a request — the incoming HTTP call, authentication, guardrails, the LLM call itself, and the internal database/cache work — all nested in a single tree. - -It follows standard [OpenTelemetry GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so the traces it produces are readable in any OTel backend (Grafana Tempo, Jaeger, Honeycomb, Datadog, …) and come with ready-made presets for popular LLM observability tools (Arize, Phoenix, Langfuse, Weave, Langtrace, Levo, AgentOps). +OpenTelemetry v2 (OTel v2) is LiteLLM Proxy's next-generation tracing. Each request becomes one clean trace covering the incoming HTTP call, authentication, guardrails, the LLM call, and the internal database and cache work, all nested in a single tree. It follows the standard [OpenTelemetry GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so traces are readable in any OTel backend (Grafana Tempo, Jaeger, Honeycomb, Datadog) and ship with presets for the popular LLM observability tools (Arize, Phoenix, Langfuse, Weave, Langtrace, Levo, AgentOps). :::info Opt-in feature -OTel v2 is **off by default**. Nothing in it runs until you set `LITELLM_OTEL_V2=true`. It is separate from the existing [OpenTelemetry integration](./opentelemetry_integration) — pick one. If you are moving from v1, see [Migrating to OpenTelemetry v2](./opentelemetry_v2_migration). +OTel v2 is off by default; nothing in it runs until you set `LITELLM_OTEL_V2=true`. It is separate from the existing [OpenTelemetry integration](./opentelemetry_integration); pick one. If you are moving from v1, see [Migrating to OpenTelemetry v2](./opentelemetry_v2_migration). ::: -## New to OpenTelemetry? +## Quickstart -If you have never used OpenTelemetry, these are the only terms you need for this page: +Two environment variables turn tracing on with the `console` exporter, which needs no extra infrastructure and prints each finished span to your terminal. This assumes a running LiteLLM proxy (by default on `http://localhost:4000`) with at least one model in `config.yaml`; if you do not have one, start with the [proxy getting-started guide](../proxy/docker_quick_start) and come back. -- **Trace** — the full record of one request. -- **Span** — one step inside a trace, such as the HTTP request, the auth check, or the LLM call. Spans nest to form the tree shown below. -- **Exporter** — the piece that sends finished spans somewhere. The simplest one, `console`, just prints them to your terminal. -- **Collector** — a separate network service that receives spans. You only need one when exporting over `otlp_http` or `otlp_grpc`, not for the console exporter. -- **Backend** — the system that receives, stores, and displays traces so you can search and visualize them. This can be a general tracing tool such as Jaeger, Grafana Tempo, or Datadog, or an LLM-focused tool such as Langfuse, Arize, or Phoenix. +Set the variables in the same environment your proxy runs in (export them in the shell, add them to your `.env`, or pass them with `docker -e`), then start the proxy: -If you just want to see something work, jump to the [Quickstart](#quickstart). To send to one of the LLM tools above with a single line of config, see [presets](#2-send-traces-to-a-specific-tool-presets). +```shell +export LITELLM_OTEL_V2=true +export OTEL_EXPORTER="console" -## What you get +litellm --config config.yaml +``` -A single request to your proxy produces **one trace** that looks like this: +Send a request to a model your proxy serves, authenticated with your proxy key in place of `sk-1234`: -``` -POST /v1/chat/completions ← HTTP request (server span) -├── auth /v1/chat/completions ← authentication -│ ├── postgres get_key_object ← DB lookups during auth -│ └── postgres get_team_membership -├── execute_guardrail presidio-pii ← each guardrail that runs -├── chat gpt-4o ← the LLM call (model, tokens, cost) -└── batch_write_to_db ← spend/usage written to DB +```shell +curl http://localhost:4000/v1/chat/completions \ + -H "Authorization: Bearer sk-1234" \ + -H "Content-Type: application/json" \ + -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Say hello in one word"}]}' ``` -Highlights: +A span prints to the proxy's stdout. To send traces somewhere durable, point the exporter at an [OTLP collector](#send-traces-to-any-otlp-collector) or a [vendor preset](#send-traces-to-a-specific-tool-presets) below. -- **One trace, end to end** — the HTTP request, auth, guardrails, the LLM call, and DB writes all live in the same trace, correctly nested. -- **Rich GenAI attributes** — every LLM-call span carries `gen_ai.*` attributes: model, provider, token usage, cost, finish reasons, request parameters, and more. -- **Standards-based** — built on the official OpenTelemetry GenAI semantic conventions, so it works with any OTel-compatible backend. -- **Vendor presets** — one line to ship traces to Arize, Phoenix, Langfuse, Weave, Langtrace, Levo, or AgentOps in the format each tool expects. -- **Safe by default** — prompts and responses are **not** captured unless you explicitly opt in. Noisy routes (health checks, metrics scrapes, UI assets) are excluded automatically. -- **Distributed tracing** — if your client sends a `traceparent` header, LiteLLM's spans nest inside your existing trace. +### Configuration reference -## Requirements +All settings are environment variables; boolean flags accept `true`/`false`. -OTel v2 instruments the proxy's FastAPI app, so it needs the OpenTelemetry SDK plus the FastAPI instrumentation package: +| Variable | Default | Purpose | +|---|---|---| +| `LITELLM_OTEL_V2` | `false` | Master switch. OTel v2 does nothing until this is `true`. | +| `OTEL_EXPORTER` (alias `OTEL_EXPORTER_OTLP_PROTOCOL`) | `console` | Exporter kind: `console`, `otlp_http`, `otlp_grpc`. | +| `OTEL_ENDPOINT` (alias `OTEL_EXPORTER_OTLP_ENDPOINT`) | none | OTLP collector URL. Setting an endpoint implies `otlp_http` unless you override `OTEL_EXPORTER`. | +| `OTEL_HEADERS` (alias `OTEL_EXPORTER_OTLP_HEADERS`) | none | Comma-separated `key=value` auth headers for your backend. | +| `OTEL_SERVICE_NAME` | `litellm` | `service.name` resource attribute shown in your backend. | +| `OTEL_ENVIRONMENT_NAME` | none | `deployment.environment` resource attribute (e.g. `production`). | +| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` | `no_content` | Prompt/response capture: `no_content`, `span_only`, `event_only`, `span_and_event`. | +| `OTEL_PYTHON_FASTAPI_EXCLUDED_URLS` | health/metrics/UI routes | Comma-separated paths to exclude from tracing (substring match). Set to `""` to trace everything. | +| `LITELLM_OTEL_INTEGRATION_ENABLE_METRICS` | `false` | Also emit the GenAI client metrics (duration, token usage, cost, streaming timings). See [Metrics](#metrics). | +| `LITELLM_OTEL_LEGACY_COMPAT` | `true` | Also emit attributes under the older Traceloop key names. See [Attribute conventions](#attribute-conventions). | + +The full set of keys on each span kind is in [Span attributes](#span-attributes). + +### Requirements + +OTel v2 instruments the proxy's FastAPI app, so it needs the OpenTelemetry SDK plus the FastAPI instrumentation package. These ship with the proxy Docker image; install them manually only for a `pip`-based proxy: ```shell pip install "litellm[proxy]" \ @@ -60,39 +66,25 @@ pip install "litellm[proxy]" \ opentelemetry-instrumentation-fastapi ``` -> These packages ship with the proxy Docker image. You only need to install them manually for a `pip`-based proxy. - -## Quickstart - -The fastest way to see a trace, with no extra infrastructure, is the `console` exporter, which prints each finished span to your terminal. This assumes you already have a running LiteLLM proxy (by default on `http://localhost:4000`) with at least one model defined in `config.yaml`. If you do not, set that up first with the [proxy getting-started guide](../proxy/docker_quick_start) and come back. - -Set two environment variables in the same environment your proxy runs in (export them in the shell, add them to your `.env`, or pass them with `docker -e`), then start the proxy: +## What you get -```shell -export LITELLM_OTEL_V2=true -export OTEL_EXPORTER="console" +A single request to your proxy produces one trace that looks like this: -litellm --config config.yaml ``` - -Send a request to a model your proxy serves. Here it is `gpt-4o`, authenticated with your proxy key in place of `sk-1234`: - -```shell -curl http://localhost:4000/v1/chat/completions \ - -H "Authorization: Bearer sk-1234" \ - -H "Content-Type: application/json" \ - -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Say hello in one word"}]}' +POST /v1/chat/completions ← HTTP request (server span) +├── auth /v1/chat/completions ← authentication +│ ├── postgres get_key_object ← DB lookups during auth +│ └── postgres get_team_membership +├── execute_guardrail presidio-pii ← each guardrail that runs +├── chat gpt-4o ← the LLM call (model, tokens, cost) +└── batch_write_to_db ← spend/usage written to DB ``` -A span prints to the proxy's stdout. To understand the full shape and what each field means, see [Seeing your traces](#seeing-your-traces) and [Span attributes](#span-attributes). Once this works, point the exporter at a real destination with [Getting started](#getting-started). +The HTTP request, auth, guardrails, the LLM call, and DB writes all live in one correctly nested trace. Every LLM-call span carries rich `gen_ai.*` attributes: model, provider, token usage, cost, finish reasons, and request parameters. Because it builds on the official OpenTelemetry GenAI semantic conventions, it works with any OTel-compatible backend, and one-line vendor presets ship traces to Arize, Phoenix, Langfuse, Weave, Langtrace, Levo, or AgentOps in each tool's native format. Prompts and responses are not captured unless you explicitly opt in, and noisy routes (health checks, metrics scrapes, UI assets) are excluded automatically. If your client sends a `traceparent` header, LiteLLM's spans nest inside your existing trace. -## Getting started +## Send traces to any OTLP collector -Start with the [Quickstart](#quickstart) above to confirm tracing works using the `console` exporter, which needs no other setup. When you are ready to send traces somewhere durable, pick a destination below. - -### 1. Send traces to any OTLP collector - -This path sends spans over OTLP (the OpenTelemetry Protocol) to a collector or backend you are already running at the endpoint below; if you do not have one yet, stay on the console exporter from the [Quickstart](#quickstart) until you do. Set the feature flag plus the standard `OTEL_*` environment variables in the proxy's environment. No config change is needed. +This path sends spans over OTLP (the OpenTelemetry Protocol) to a collector or backend you already run; if you do not have one yet, stay on the console exporter until you do. Set the feature flag plus the standard `OTEL_*` environment variables in the proxy's environment. No config change is needed. @@ -120,23 +112,19 @@ OTEL_ENDPOINT="http://localhost:4317" -Pass auth headers your backend needs via `OTEL_HEADERS`: +Pass any auth headers your backend needs via `OTEL_HEADERS`, then start the proxy: ```shell OTEL_HEADERS="api-key=your-key,x-tenant=acme" -``` -Then start the proxy as usual: - -```shell litellm --config config.yaml ``` -Make a request, and you'll see one trace per request in your backend. +Make a request and you will see one trace per request in your backend. -### 2. Send traces to a specific tool (presets) +## Send traces to a specific tool (presets) -For LLM observability tools, use a **preset**. A preset knows the tool's endpoint and emits attributes in the schema that tool expects. To enable one, add its name to `callbacks` in your config and set the tool's credentials as env vars. +For LLM observability tools, use a preset. A preset knows the tool's endpoint and emits attributes in the schema that tool expects. To enable one, add its name to `callbacks` in your config and set the tool's credentials as env vars. @@ -298,15 +286,15 @@ Every preset turns into one exporter on a single shared tracer. The table lists, | Arize Phoenix | `arize_phoenix` | `PHOENIX_API_KEY` | `PHOENIX_COLLECTOR_HTTP_ENDPOINT` or `PHOENIX_COLLECTOR_ENDPOINT` (gRPC), `PHOENIX_PROJECT_NAME` | Phoenix (self-hosted or Phoenix Cloud) | OpenInference | No | | Langfuse | `langfuse_otel` | `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY` | `LANGFUSE_HOST` (or `LANGFUSE_OTEL_HOST`; default `https://us.cloud.langfuse.com`, EU is `https://cloud.langfuse.com`) | Langfuse Cloud or self-hosted | Langfuse | Yes | | Weave (W&B) | `weave_otel` | `WANDB_API_KEY`, `WANDB_PROJECT_ID` (`/`) | `WANDB_HOST` (default `https://trace.wandb.ai`) | Weights & Biases Weave | OpenInference + Weave | Yes | -| Langtrace | `langtrace` | none of its own | — | Langtrace, via an OpenTelemetry Collector (Langtrace ingests JSON-only OTLP) | Langtrace | No | -| Levo | `levo` | `LEVOAI_API_KEY`, `LEVOAI_ORG_ID`, `LEVOAI_WORKSPACE_ID`, `LEVOAI_COLLECTOR_URL` | — | Levo collector | canonical `gen_ai.*` only | No | +| Langtrace | `langtrace` | none of its own | (none) | Langtrace, via an OpenTelemetry Collector (Langtrace ingests JSON-only OTLP) | Langtrace | No | +| Levo | `levo` | `LEVOAI_API_KEY`, `LEVOAI_ORG_ID`, `LEVOAI_WORKSPACE_ID`, `LEVOAI_COLLECTOR_URL` | (none) | Levo collector | canonical `gen_ai.*` only | No | | AgentOps | `agentops` | `AGENTOPS_API_KEY` | `AGENTOPS_SERVICE_NAME`, `AGENTOPS_ENVIRONMENT` | AgentOps (`https://otlp.agentops.cloud`) | canonical `gen_ai.*` only | No | Notes: - **Arize AX vs Arize Phoenix** are different backends from the same company. AX (`arize`) is the hosted platform; Phoenix (`arize_phoenix`) is the open-source tracer you self-host or run on Phoenix Cloud. They use different credentials and endpoints, so pick the callback for the backend you actually run. You can also enable both at once to send to each. - **Langtrace** ingests JSON-only OTLP at a custom path, so litellm v2 (which sends protobuf to `/v1/traces`) cannot export to it directly. Route through an OpenTelemetry Collector that re-encodes to JSON; the `langtrace` preset only adds the Langtrace attribute schema to your spans. See the Langtrace tab above for the collector config. -- Vocabulary is additive: every preset's spans always carry the canonical OpenTelemetry `gen_ai.*` attributes; the listed vocabulary is layered on top so the destination tool reads its native schema. +- Vocabulary is additive: every preset's spans always carry the canonical OpenTelemetry `gen_ai.*` attributes, and the listed vocabulary is layered on top so the destination tool reads its native schema. ## Seeing your traces @@ -340,7 +328,7 @@ Run the [Quickstart](#quickstart) request against the `console` exporter and the The `gen_ai.system`, `gen_ai.usage.*_tokens`, and `llm.is_streaming` keys come from the default `legacy` compatibility mapper; set `LITELLM_OTEL_LEGACY_COMPAT=false` to keep only the canonical keys. -Once a backend is configured with its preset, the same request shows up in that tool's UI as a `chat gpt-4o` span under the request root. The panels below leave a slot for a screenshot of that trace in each backend. To fill one in, capture the trace your request produced, save it under `static/img/observability/`, and replace the placeholder path; crop to the span list plus the attribute panel and scrub any credential headers first. +Once a backend is configured with its preset, the same request shows up in that tool's UI as a `chat gpt-4o` span under the request root. @@ -396,25 +384,25 @@ Open the Langtrace UI; the spans flow through your existing OTLP collector carry ## Capturing prompts & responses -By default, OTel v2 records **metadata only** (model, tokens, cost, timing) and **never** writes prompt or response text to your traces. This is intentional — it keeps sensitive content out of your observability backend. +By default, OTel v2 records metadata only (model, tokens, cost, timing) and never writes prompt or response text to your traces. This is intentional; it keeps sensitive content out of your observability backend. To capture message content, opt in explicitly: ```shell -# no_content (default) — never capture prompts/responses +# no_content (default): never capture prompts/responses OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT="no_content" -# span_only — write prompts/responses as attributes on spans +# span_only: write prompts/responses as attributes on spans OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT="span_only" -# event_only — write prompts/responses on log events instead of span attributes +# event_only: write prompts/responses on log events instead of span attributes OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT="event_only" -# span_and_event — write content to both spans and events +# span_and_event: write content to both spans and events OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT="span_and_event" ``` -The gate is enforced centrally, so it applies to **every** backend at once — a user request can never force its prompt into your backend while capture is disabled. +The gate is enforced centrally, so it applies to every backend at once; a user request can never force its prompt into your backend while capture is disabled. ## Span attributes @@ -461,15 +449,15 @@ Status, errors, and a few conditional keys on the LLM-call span: ### Other span kinds -**Guardrail span** — uses the `litellm.guardrail.*` namespace: `name`, `mode`, `status`, `provider`, `action`, `response`, `violation_categories`, `confidence_score`, `risk_score`, `masked_entity_count`, `duration`, `id`, `policy_template`, `detection_method`. `status` is one of `success`, `guardrail_intervened`, `guardrail_failed_to_respond`, or `not_run`. +**Guardrail span** uses the `litellm.guardrail.*` namespace: `name`, `mode`, `status`, `provider`, `action`, `response`, `violation_categories`, `confidence_score`, `risk_score`, `masked_entity_count`, `duration`, `id`, `policy_template`, `detection_method`. `status` is one of `success`, `guardrail_intervened`, `guardrail_failed_to_respond`, or `not_run`. -**Datastore span** (redis, postgres) — `db.system.name` and `db.operation.name`, alongside `litellm.service.name` and `litellm.service.call_type`. +**Datastore span** (redis, postgres) carries `db.system.name` and `db.operation.name`, alongside `litellm.service.name` and `litellm.service.call_type`. -**Internal service span** — only the `litellm.service.*` keys (no `db.*`). +**Internal service span** carries only the `litellm.service.*` keys (no `db.*`). -**MCP tool-call span** — `gen_ai.operation.name` (`execute_tool`), `mcp.method.name`, `mcp.session.id`, `gen_ai.tool.name`, `litellm.mcp.server.name`, `litellm.call_id`, and `litellm.cost.total`. The tool arguments and result are gated by the same content-capture setting as prompt content. +**MCP tool-call span** carries `gen_ai.operation.name` (`execute_tool`), `mcp.method.name`, `mcp.session.id`, `gen_ai.tool.name`, `litellm.mcp.server.name`, `litellm.call_id`, and `litellm.cost.total`. The tool arguments and result are gated by the same content-capture setting as prompt content. -**Root server span** — the HTTP semantic-convention keys `http.request.method`, `http.route`, `http.response.status_code`, and `url.path`, stamped by the FastAPI instrumentation. +**Root server span** carries the HTTP semantic-convention keys `http.request.method`, `http.route`, `http.response.status_code`, and `url.path`, stamped by the FastAPI instrumentation. ## Attribute conventions @@ -491,7 +479,7 @@ Request-identity values are promoted into OpenTelemetry Baggage on the LLM-call ## Metrics -Alongside traces, OTel v2 can emit GenAI **client metrics**: histograms for call latency, token usage, and cost that your backend aggregates across requests. Like the rest of OTel v2 they stay off until you turn them on. +Alongside traces, OTel v2 can emit GenAI client metrics: histograms for call latency, token usage, and cost that your backend aggregates across requests. Like the rest of OTel v2 they stay off until you turn them on. Set the flag in the proxy environment next to `LITELLM_OTEL_V2`: @@ -560,7 +548,7 @@ callback_settings: ## Which routes are traced -High-frequency, non-LLM routes are **excluded by default** so they don't flood your traces: health checks (`/health*`), the Prometheus scrape (`/metrics`), and static UI/docs assets (`/ui`, `/docs`, `/redoc`, `/_next`, `/openapi.json`, favicons, …). +High-frequency, non-LLM routes are excluded by default so they do not flood your traces: health checks (`/health*`), the Prometheus scrape (`/metrics`), and static UI/docs assets (`/ui`, `/docs`, `/redoc`, `/_next`, `/openapi.json`, and favicons). To change the set, use the standard OpenTelemetry env var (comma-separated paths, substring-matched): @@ -574,7 +562,7 @@ OTEL_PYTHON_FASTAPI_EXCLUDED_URLS="/health,/internal" ## Per-key / per-team destinations (multi-tenant) -Some presets support **per-request credentials**: if a request carries team- or key-scoped credentials, its spans are routed to that tenant's project automatically. This lets one proxy serve many tenants, each seeing only their own traces, with no extra setup beyond configuring those credentials on the key or team. +Some presets support per-request credentials: if a request carries team- or key-scoped credentials, its spans are routed to that tenant's project automatically. This lets one proxy serve many tenants, each seeing only their own traces, with no extra setup beyond configuring those credentials on the key or team. The presets that support this, and the dynamic params they read, are: @@ -588,26 +576,11 @@ A request's dynamic credentials are scoped to the exporter that owns them. When ## Distributed tracing -If the incoming request has a W3C `traceparent` header, LiteLLM continues that trace instead of starting a new one. Your LiteLLM spans then appear inline inside whatever distributed trace your application already has — so you can follow a request from your app, through the proxy, to the LLM provider, in one view. +If the incoming request has a W3C `traceparent` header, LiteLLM continues that trace instead of starting a new one. Your LiteLLM spans then appear inline inside whatever distributed trace your application already has, so you can follow a request from your app, through the proxy, to the LLM provider, in one view. -## Configuration reference +## OpenTelemetry terms -All values are environment variables. Boolean flags accept `true`/`false`. - -| Variable | Default | Purpose | -|---|---|---| -| `LITELLM_OTEL_V2` | `false` | **Master switch.** OTel v2 does nothing until this is `true`. | -| `OTEL_EXPORTER` (alias `OTEL_EXPORTER_OTLP_PROTOCOL`) | `console` | Exporter kind: `console`, `otlp_http`, `otlp_grpc`. | -| `OTEL_ENDPOINT` (alias `OTEL_EXPORTER_OTLP_ENDPOINT`) | none | OTLP collector URL. Setting an endpoint implies `otlp_http` unless you override `OTEL_EXPORTER`. | -| `OTEL_HEADERS` (alias `OTEL_EXPORTER_OTLP_HEADERS`) | none | Comma-separated `key=value` auth headers for your backend. | -| `OTEL_SERVICE_NAME` | `litellm` | `service.name` resource attribute shown in your backend. | -| `OTEL_ENVIRONMENT_NAME` | none | `deployment.environment` resource attribute (e.g. `production`). | -| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` | `no_content` | Prompt/response capture: `no_content`, `span_only`, `event_only`, `span_and_event`. | -| `OTEL_PYTHON_FASTAPI_EXCLUDED_URLS` | health/metrics/UI routes | Comma-separated paths to exclude from tracing (substring match). Set to `""` to trace everything. | -| `LITELLM_OTEL_INTEGRATION_ENABLE_METRICS` | `false` | Also emit the GenAI client metrics (duration, token usage, cost, streaming timings). See [Metrics](#metrics). | -| `LITELLM_OTEL_LEGACY_COMPAT` | `true` | Also emit attributes under the older Traceloop key names. See [Attribute conventions](#attribute-conventions). | - -The full set of keys on each span kind is in [Span attributes](#span-attributes). +A trace is the full record of one request. A span is one step inside it, such as the HTTP request, the auth check, or the LLM call, and spans nest to form the tree shown above. An exporter sends finished spans somewhere; the simplest one, `console`, prints them to your terminal. A collector is a separate network service that receives spans, needed only when exporting over `otlp_http` or `otlp_grpc`. A backend stores and displays traces, whether a general tracing tool such as Jaeger, Grafana Tempo, or Datadog, or an LLM-focused tool such as Langfuse, Arize, or Phoenix. ## How it works @@ -622,13 +595,13 @@ Not every internal call becomes a span. Outbound datastore calls (redis, postgre **No traces showing up?** 1. Confirm `LITELLM_OTEL_V2=true` is set in the proxy's environment. -2. Try `OTEL_EXPORTER="console"` first — if spans print to stdout, the problem is your exporter endpoint/headers, not LiteLLM. +2. Try `OTEL_EXPORTER="console"` first; if spans print to stdout, the issue is your exporter endpoint or headers rather than LiteLLM. 3. Make sure you hit an LLM route (e.g. `/v1/chat/completions`). Health checks and UI routes are excluded by default. 4. Check that `opentelemetry-instrumentation-fastapi` is installed (see [Requirements](#requirements)). -**Only see the LLM call but no `auth`/`postgres`/server span?** Those server and DB spans require the FastAPI instrumentation package — install `opentelemetry-instrumentation-fastapi`. +**Only see the LLM call but no `auth`/`postgres`/server span?** Those server and DB spans require the FastAPI instrumentation package; install `opentelemetry-instrumentation-fastapi`. -**I see metadata but no prompts/responses.** That's the default. Set `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=span_only` to capture content. +**I see metadata but no prompts/responses.** That is the default. Set `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=span_only` to capture content. ## Support