Changes from 14 commits (17 commits in the pull request):
fbb2bc1
feat: integrate Axiom MetricsDB/MPL querying and prioritize over Grafana
Licenser Feb 12, 2026
d875a9b
refactor: extract range_to_rfc3339 into shared scripts/lib-time
Licenser Feb 12, 2026
dd4ec0d
fix: file_mtime tries GNU stat before BSD to avoid stdout pollution o…
Licenser Feb 12, 2026
dd64dfd
fix: add auth headers to --spec OPTIONS request
Licenser Feb 12, 2026
f5daae2
fix: error on partial --start/--end in axiom-metrics-discover
Licenser Feb 12, 2026
5d7177d
fix(discover-axiom): tag MetricsDB datasets inline in fallback path
Licenser Feb 12, 2026
89cf6ba
fix(lib-time): pluralize time units for GNU date compatibility
Licenser Feb 12, 2026
bd15d04
fix(metrics): validate time conversion results before API calls
Licenser Feb 12, 2026
73730bc
fix: consolidate axiom-link for APL+MPL, fix discover-axiom dataset kind
tsenart Feb 16, 2026
b4f242f
Merge main into axiom-metrics — resolve conflicts in SKILL.core.md, s…
Feb 18, 2026
11bfe16
fix: copy metrics scripts to skill/scripts/ for distribution
Licenser Feb 18, 2026
365e372
fix: suppress SIGPIPE in discover-axiom Strategy 2 fallback
Licenser Feb 18, 2026
b13e261
fix: remove priority encoding from grafana.md
Licenser Feb 18, 2026
c7154ea
fix: remove priority encoding from SKILL.core.md
Licenser Feb 18, 2026
0d46335
fix: SIGPIPE in test-build frontmatter name check
Licenser Feb 18, 2026
2be3ead
fix: URL-encode timestamps in axiom-metrics-discover query string
Licenser Feb 18, 2026
cd5bfb2
fix: add SIGPIPE guard to discover-axiom unlisted datasets pipeline
Licenser Feb 18, 2026
10 changes: 7 additions & 3 deletions README.md
@@ -13,6 +13,7 @@ You're welcome.
- **Finds root causes** — Hypothesis-driven investigation. No hunches. No vibes. Data.
- **Systematic triage** — Golden signals, USE/RED methods. The stuff you should already know.
- **Remembers everything** — Persistent memory for patterns, queries, and incidents. Unlike you, I learn.
- **Metrics querying** — OTel metrics via MPL. Logs via APL. One agent, both engines.
- **Unified observability** — One config, all your tools. Because having four config files is amateur hour.

## Installation
@@ -67,9 +68,12 @@ Auth options per deployment:
## Usage

```bash
# Query logs
# Query logs (APL)
scripts/axiom-query prod "['dataset'] | where _time > ago(1h) | where status >= 500 | project _time, message, status | take 10"

# Query metrics (MPL)
scripts/axiom-metrics-query prod --range 1h <<< "otel-metrics:http.server.request.duration | align to 5m using avg | group by service.name"

# Check what's on fire
scripts/grafana-alerts prod firing

@@ -84,7 +88,7 @@ scripts/slack default chat.postMessage channel=incidents text="Fixed. You're wel

| Category | Scripts |
|----------|---------|
| **Axiom** | `axiom-query`, `axiom-api`, `axiom-link`, `axiom-deployments` |
| **Axiom** | `axiom-query`, `axiom-metrics-query`, `axiom-api`, `axiom-link`, `axiom-deployments` |
| **Grafana** | `grafana-query`, `grafana-alerts`, `grafana-datasources`, `grafana-api` |
| **Pyroscope** | `pyroscope-flamegraph`, `pyroscope-diff`, `pyroscope-services`, `pyroscope-api` |
| **Slack** | `slack` |
@@ -105,7 +109,7 @@ scripts/test-config-toml # TOML parsing with indented sections
2. **State facts.** "The logs show X" not "this is probably X."
3. **Disprove, don't confirm.** Design queries to falsify your hypothesis.
4. **Time filter first.** Always. No exceptions.
5. **Discover schema.** Run `getschema` before querying unfamiliar datasets.
5. **Discover schema.** Run `getschema` (APL) or `--spec` (MPL) before querying unfamiliar datasets.

## Memory

29 changes: 18 additions & 11 deletions SKILL.core.md
@@ -155,8 +155,10 @@ Follow this loop strictly.

### D. EXECUTE (Query)
- **Select methodology:** Golden Signals (customer-facing health), RED (request-driven services), USE (infrastructure resources)
- **Select telemetry:** Use whatever's available—metrics, logs, traces, profiles
- **Run query:** `scripts/axiom-query` (logs), `scripts/grafana-query` (metrics), `scripts/pyroscope-diff` (profiles)
- **Metrics:** Axiom MetricsDB (`[MPL]` datasets from `scripts/init`), Grafana/PromQL, alerts/dashboards via Grafana
- **Discover metrics:** `scripts/axiom-metrics-discover` (list metrics, tags, tag values in MetricsDB datasets)
- **Alerts & dashboards:** Grafana only — `scripts/grafana-alerts`, `scripts/grafana-dashboards`
- **Run query:** `scripts/axiom-query` (logs/APL), `scripts/axiom-metrics-query` (metrics/MPL), `scripts/grafana-query` (PromQL), `scripts/pyroscope-diff` (profiles)

### E. VERIFY & REFLECT
- **Methodology check:** Service → RED. Resource → USE.
@@ -309,7 +311,7 @@ For request-driven services. Measures the *work* the service does.
| **Errors** | Error rate (5xx / total) |
| **Duration** | Latency percentiles (p50, p95, p99) |

Measure via logs (APL — see `reference/apl.md`) or metrics (PromQL — see `reference/grafana.md`).
Measure via logs (APL — see `reference/apl.md`), OTel metrics (MPL — see `reference/metrics.md`), or PromQL fallback (see `reference/grafana.md`).

### C. USE METHOD (Resources)

@@ -321,7 +323,7 @@ For infrastructure resources (CPU, memory, disk, network). Measures the *capacit
| **Saturation** | Queue depth, load average, waiting threads |
| **Errors** | Hardware/network errors |

Typically measured via metrics. See `reference/grafana.md` for PromQL patterns.
Check Axiom MetricsDB first (OTel resource metrics). Fall back to Grafana/PromQL if not available. See `reference/grafana.md` for PromQL patterns.
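As a sketch, a USE-style utilization check against a MetricsDB dataset might look like this (the dataset and metric names are hypothetical — confirm what actually exists with `scripts/axiom-metrics-discover` first), following the same MPL shape as the examples elsewhere in this skill:

```
otel-metrics:system.cpu.utilization
| align to 5m using avg
| group by host.name
```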

### D. DIFFERENTIAL ANALYSIS

@@ -358,6 +360,8 @@ See `reference/apl.md` for full operator, function, and pattern reference.
- **Avoid `search`**—scans ALL fields. Last resort only.
- **Field escaping**—dots need `\\.`: `['kubernetes.node_labels.nodepool\\.axiom\\.co/name']`

**MetricsDB/MPL:** For OTel metrics (`[MPL]` datasets), discover with `scripts/axiom-metrics-discover`, query with `scripts/axiom-metrics-query`. See `reference/metrics.md`.

**Need more?** Open `reference/apl.md` for operators/functions, `reference/query-patterns.md` for ready-to-use investigation queries.

---
@@ -374,15 +378,16 @@ Every finding must link to its source — dashboards, queries, error reports, PR
5. **Data responses**—Any answer citing tool-derived numbers (e.g. burn rates, error counts, usage stats, etc). Questions don't require investigation, but if you cite numbers from a query, include the source link.

**Rule: If you ran a query and cite its results, generate a permalink.** Run the appropriate link tool for every query whose results appear in your response:
- **Axiom:** `scripts/axiom-link`
- **Axiom:** `scripts/axiom-link` (works for both APL and MPL queries)
- **Grafana:** `scripts/grafana-link`
- **Pyroscope:** `scripts/pyroscope-link`
- **Sentry:** `scripts/sentry-link`

**Permalinks:**
```bash
# Axiom
# Axiom (APL or MPL — same script handles both)
scripts/axiom-link <env> "['logs'] | where status >= 500 | take 100" "1h"
scripts/axiom-link <env> "dataset:metric.name | align to 5m using avg" "1h"
# Grafana (metrics)
scripts/grafana-link <env> <datasource-uid> "rate(http_requests_total[5m])" "1h"
# Pyroscope (profiling)
@@ -480,20 +485,21 @@ See `reference/postmortem-template.md` for retrospective format.

## 15. TOOL REFERENCE

### Axiom (Logs & Events)
### Axiom (Logs & Events — APL)
```bash
scripts/axiom-query <env> <<< "['dataset'] | getschema"
scripts/axiom-query <env> <<< "['dataset'] | where _time > ago(1h) | project _time, message, level | take 5"
scripts/axiom-query <env> --ndjson <<< "['dataset'] | where _time > ago(1h) | project _time, message | take 1"
```

### Grafana (Metrics)
### Axiom (MetricsDB — MPL)
```bash
scripts/grafana-query <env> prometheus 'rate(http_requests_total[5m])'
scripts/axiom-metrics-discover <env> <dataset> metrics|tags|tag-values|search
scripts/axiom-metrics-query <env> --range 1h <<< "dataset:metric.name | align to 5m using avg"
```

### Pyroscope (Profiling)
### Grafana (PromQL fallback) / Pyroscope / Slack
```bash
scripts/grafana-query <env> prometheus 'rate(http_requests_total[5m])'
scripts/pyroscope-diff <env> <app_name> -2h -1h -1h now
```

@@ -518,6 +524,7 @@ scripts/slack-upload <env> <channel> ./file.png --comment "Description" --thread

- `reference/apl.md`—APL operators, functions, and spotlight analysis
- `reference/axiom.md`—Axiom API endpoints (70+)
- `reference/metrics.md`—MetricsDB MPL querying, discovery, and patterns
- `reference/blocks.md`—Slack Block Kit formatting
- `reference/failure-modes.md`—Common failure patterns
- `reference/grafana.md`—Grafana queries and PromQL patterns
161 changes: 161 additions & 0 deletions scripts/axiom-metrics-discover
@@ -0,0 +1,161 @@
#!/usr/bin/env bash
# Axiom MetricsDB info endpoint helper - discover metrics, tags, and tag values
#
# Usage: axiom-metrics-discover <deployment> <dataset> [options] <command> [args...]
#
# Commands:
# metrics List all metrics in dataset
# tags List all tags in dataset
# tag-values <tag> List values for a tag
# metric-tags <metric> List tags for a metric
# metric-tag-values <metric> <tag> List tag values for metric+tag
# search <value> Find metrics matching a tag value (POST)
#
# Options:
# --range <r> Time range from now (e.g. 1h, 24h, 7d). Default: 1h
# --start <ts> Start time (RFC3339)
# --end <ts> End time (RFC3339)
#
# Examples:
# axiom-metrics-discover prod otel-metrics metrics
# axiom-metrics-discover prod otel-metrics --range 24h tags
# axiom-metrics-discover prod otel-metrics tag-values service.name
# axiom-metrics-discover prod otel-metrics metric-tags http.server.request.duration
# axiom-metrics-discover prod otel-metrics metric-tag-values http.server.request.duration service.name
# axiom-metrics-discover prod otel-metrics search "api-gateway"

set -euo pipefail

if [[ $# -lt 3 ]]; then
echo "Usage: axiom-metrics-discover <deployment> <dataset> [options] <command> [args...]" >&2
exit 1
fi

DEPLOYMENT="$1"
DATASET="$2"
shift 2

START_TIME="${START_TIME:-}"
END_TIME="${END_TIME:-}"
RANGE="${RANGE:-}"

# Parse options before command
while [[ $# -gt 0 ]]; do
case "$1" in
--start)
START_TIME="$2"
shift 2
;;
--end)
END_TIME="$2"
shift 2
;;
--range)
RANGE="$2"
shift 2
;;
-*)
echo "Error: Unknown option '$1'." >&2
exit 1
;;
*)
break
;;
esac
done

if [[ $# -lt 1 ]]; then
echo "Error: No command specified. Use: metrics, tags, tag-values, metric-tags, metric-tag-values, search." >&2
exit 1
fi

COMMAND="$1"
shift

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# shellcheck disable=SC1091
source "$SCRIPT_DIR/lib-time"

# Validate time arguments
if [[ -n "$RANGE" && ( -n "$START_TIME" || -n "$END_TIME" ) ]]; then
echo "Error: --range cannot be combined with --start/--end." >&2
exit 1
fi

if [[ -n "$RANGE" ]]; then
START_TIME=$(range_to_rfc3339 "$RANGE") || exit 1
END_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) || exit 1
if [[ -z "$START_TIME" || -z "$END_TIME" ]]; then
echo "Error: Failed to compute time range from '$RANGE'." >&2
exit 1
fi
elif [[ -n "$START_TIME" && -n "$END_TIME" ]]; then
: # explicit start/end provided
elif [[ -n "$START_TIME" || -n "$END_TIME" ]]; then
echo "Error: Both --start and --end are required when specifying explicit times." >&2
exit 1
else
# Default to 1h
START_TIME=$(range_to_rfc3339 "1h") || exit 1
END_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) || exit 1
if [[ -z "$START_TIME" || -z "$END_TIME" ]]; then
echo "Error: Failed to compute default time range." >&2
exit 1
fi
fi

# URL-encode a path segment
uriencode() {
jq -rn --arg x "$1" '$x|@uri'
}

DATASET_ENC=$(uriencode "$DATASET")
BASE="/v1/query/metrics/info/datasets/${DATASET_ENC}"
QS="start=$(uriencode "$START_TIME")&end=$(uriencode "$END_TIME")"

case "$COMMAND" in
metrics)
"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics?${QS}" | jq .
;;
tags)
"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/tags?${QS}" | jq .
;;
tag-values)
if [[ $# -lt 1 ]]; then
echo "Error: tag-values requires a <tag> argument." >&2
exit 1
fi
TAG_ENC=$(uriencode "$1")
"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/tags/${TAG_ENC}/values?${QS}" | jq .
;;
metric-tags)
if [[ $# -lt 1 ]]; then
echo "Error: metric-tags requires a <metric> argument." >&2
exit 1
fi
METRIC_ENC=$(uriencode "$1")
"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics/${METRIC_ENC}/tags?${QS}" | jq .
;;
metric-tag-values)
if [[ $# -lt 2 ]]; then
echo "Error: metric-tag-values requires <metric> and <tag> arguments." >&2
exit 1
fi
METRIC_ENC=$(uriencode "$1")
TAG_ENC=$(uriencode "$2")
"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics/${METRIC_ENC}/tags/${TAG_ENC}/values?${QS}" | jq .
;;
search)
if [[ $# -lt 1 ]]; then
echo "Error: search requires a <value> argument." >&2
exit 1
fi
BODY=$(jq -nc --arg v "$1" '{"value": $v}')
"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "${BASE}/metrics?${QS}" "$BODY" | jq .
;;
*)
echo "Error: Unknown command '$COMMAND'. Use: metrics, tags, tag-values, metric-tags, metric-tag-values, search." >&2
exit 1
;;
esac
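The `uriencode` helper leans on jq's `@uri` filter, which percent-encodes RFC 3986 reserved characters — including the `:` characters in RFC3339 timestamps, which would otherwise land raw in the query string. A quick standalone check:

```shell
#!/usr/bin/env bash
set -euo pipefail

# jq's @uri filter percent-encodes reserved characters,
# so ':' in an RFC3339 timestamp becomes %3A.
uriencode() {
  jq -rn --arg x "$1" '$x|@uri'
}

uriencode "2026-02-18T10:00:00Z"   # -> 2026-02-18T10%3A00%3A00Z
```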