
feat(external-api): round-trip dashboard containers, tabs, and tile container/tab refs #2201

Open
alex-fedotyev wants to merge 6 commits into main from alex/HDX-2150-external-api-containers-tabs

Conversation

@alex-fedotyev
Contributor

@alex-fedotyev alex-fedotyev commented May 5, 2026

Summary

PR #2015 added a dashboard organization layer (containers with optional tabs, plus per-tile containerId and tabId) but the v2 external API was not updated to round-trip the new fields. External integrations that build dashboards programmatically had no way to use the new layer.

This wires the full set of fields through CREATE / GET / LIST / UPDATE on /api/v2/dashboards. Dashboards saved without containers round-trip unchanged.

Closes #2150. Follow-up to #2015 (commit 7665fbe).
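
For integrators, a request body that uses the new layer might look roughly like the sketch below. The field names `containers`, `tabs`, `containerId`, and `tabId` come from this PR; the simplified interfaces, the dashboard `name` field, the sample values, and the omitted per-tile chart config are assumptions for illustration, not the published schema.

```typescript
// Hypothetical, simplified shapes; the real types live in @hyperdx/common-utils.
interface DashboardContainerTab {
  id: string;
  title: string;
}

interface DashboardContainer {
  id: string;
  title: string;
  tabs?: DashboardContainerTab[]; // tabs are optional
}

interface ExternalTile {
  id: string;
  name: string;
  x: number;
  y: number;
  w: number;
  h: number;
  containerId?: string; // must match a container id when set
  tabId?: string; // requires containerId; must match a tab in that container
}

interface DashboardBody {
  name: string;
  tiles: ExternalTile[];
  containers?: DashboardContainer[]; // omit for dashboards without the layer
}

const body: DashboardBody = {
  name: "Checkout service",
  containers: [
    {
      id: "latency",
      title: "Latency",
      tabs: [
        { id: "p50", title: "p50" },
        { id: "p99", title: "p99" },
      ],
    },
  ],
  tiles: [
    // Homed in the "p99" tab of the "latency" container.
    { id: "t1", name: "p99 by route", x: 0, y: 0, w: 6, h: 3, containerId: "latency", tabId: "p99" },
    // Not in any container: both fields stay absent.
    { id: "t2", name: "Error rate", x: 6, y: 0, w: 6, h: 3 },
  ],
};

// Serialized form that a client would send to /api/v2/dashboards.
const requestJson = JSON.stringify(body);
```

A dashboard that never sets `containers`, `containerId`, or `tabId` keeps the same shape it had before this PR.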

What's in scope

  • Dashboard body Zod schema gains containers: DashboardContainer[]? (imported from @hyperdx/common-utils) and the tile schema gains containerId? and tabId?.
  • convertToExternalDashboard now emits containers (only when at least one is present, so dashboards without the layer round-trip with the field absent).
  • convertTileToExternalChart and convertToInternalTileConfig propagate containerId and tabId. The legacy series-format translator in externalApi.ts also propagates them so both code paths preserve the fields.
  • The containers: 1 projection is added to the Mongoose find and findOne calls.
  • New cross-field validation on the body schema:
    • container ids unique within a dashboard
    • tab ids unique within a container
    • tile containerId resolves to a real container
    • tile tabId resolves to a tab inside that container
    • tile tabId requires containerId to be set
  • OpenAPI JSDoc additions for DashboardContainer, DashboardContainerTab, the new tile fields, and the new dashboard field on Dashboard / CreateDashboardRequest / UpdateDashboardRequest. openapi.json regenerated.
  • A changeset entry.
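
The five cross-field rules listed above can be sketched without Zod as a plain function. This is a minimal illustration under assumptions: `checkLayout`, the trimmed-down types, and the message wording are hypothetical, and the real implementation runs inside the schema and handler code rather than as a standalone helper.

```typescript
interface Tab {
  id: string;
}
interface Container {
  id: string;
  tabs?: Tab[];
}
interface Tile {
  id: string;
  containerId?: string;
  tabId?: string;
}

// Returns human-readable issues; an empty array means the layout is consistent.
function checkLayout(containers: Container[], tiles: Tile[]): string[] {
  const issues: string[] = [];
  const containerById = new Map<string, Container>();

  containers.forEach((container) => {
    // Rule 1: container ids are unique within a dashboard.
    if (containerById.has(container.id)) {
      issues.push(`Duplicate container id "${container.id}"`);
    } else {
      containerById.set(container.id, container);
    }
    // Rule 2: tab ids are unique within a container.
    const seenTabIds = new Set<string>();
    (container.tabs ?? []).forEach((tab) => {
      if (seenTabIds.has(tab.id)) {
        issues.push(`Duplicate tab id "${tab.id}" in container "${container.id}"`);
      }
      seenTabIds.add(tab.id);
    });
  });

  tiles.forEach((tile) => {
    if (tile.containerId === undefined) {
      // Rule 5: tabId requires containerId.
      if (tile.tabId !== undefined) {
        issues.push(`Tile "${tile.id}" sets tabId without containerId`);
      }
      return;
    }
    // Rule 3: containerId resolves to a real container.
    const container = containerById.get(tile.containerId);
    if (container === undefined) {
      issues.push(`Tile "${tile.id}" references unknown containerId "${tile.containerId}"`);
      return;
    }
    // Rule 4: tabId resolves to a tab inside that container.
    const tabId = tile.tabId;
    if (tabId !== undefined && !(container.tabs ?? []).some((t) => t.id === tabId)) {
      issues.push(`Tile "${tile.id}" references unknown tabId "${tabId}"`);
    }
  });

  return issues;
}
```

Collecting every issue instead of stopping at the first one lets a client fix a payload in one pass.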

Out of scope

Each item below has a tracking issue so the gap is visible after merge.

Tier

The triage classifier marks packages/api/src/routers/external-api/v2/* as critical-path, so this lands as Tier 4 by directory rule, even though the diff is small (~284 prod lines) and additive. Splitting further would separate the body schema, the conversion utilities, and the route wiring from each other and not actually reduce review burden. Happy to break this up if there's a preferred way to slice it.

Test plan

  • yarn ci:lint (lint + tsc + spectral) on @hyperdx/common-utils, @hyperdx/api, @hyperdx/app
  • yarn knip (no new unused exports)
  • Integration: yarn jest dashboards.test.ts -t "Containers and tabs", all 8 new tests pass
  • Integration: full yarn jest dashboards.test.ts, 86/86 tests pass (no regressions in old or new format suites)
  • Integration: yarn jest src/mcp/__tests__/dashboards.test.ts, 19/19 MCP dashboard tests pass (the MCP path shares its body schema with the external API, so this confirms the new validations don't break the MCP path)
  • openapi.json regenerated and committed; spectral lint passes

@vercel

vercel Bot commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 8, 2026 2:24am

@changeset-bot

changeset-bot Bot commented May 5, 2026

🦋 Changeset detected

Latest commit: 815a063

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
@hyperdx/api Minor
@hyperdx/common-utils Patch
@hyperdx/app Minor
@hyperdx/otel-collector Minor

@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label May 5, 2026
@github-actions
Contributor

github-actions Bot commented May 5, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (2):
    • packages/api/src/routers/external-api/v2/dashboards.ts
    • packages/api/src/routers/external-api/v2/utils/dashboards.ts
  • Cross-layer change: touches backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 6
  • Production lines changed: 704 (+ 1059 in test files, excluded from tier calculation)
  • Branch: alex/HDX-2150-external-api-containers-tabs
  • Author: alex-fedotyev

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Contributor

github-actions Bot commented May 5, 2026

PR Review

✅ No critical issues found.

This is a well-scoped follow-up to #2015 that wires containers/tabs/tile containerId/tabId through the v2 external API. The implementation is careful and the test coverage (round-trip, validation, boundary, legacy-doc heal, PUT-omit-containers, orphan-heal) is thorough.

A few non-blocking observations for the team's consideration:

  • Error format consistency: collectTileContainerRefIssues returns path: message strings joined with '; ' and surfaces them via { message }. Other handler-level 400s in this file use prose (Could not find the following source IDs: ...). Not wrong, but slightly inconsistent; fine to leave.
  • PUT validates body against existing dashboard's containers when body omits containers (v2/dashboards.ts:1993-2003). This is correct per the documented preserve-on-omit semantics and is tested. Worth noting that the body schema only does structure checks (duplicate ids/tabs); cross-tile reference resolution depends on the handler. Anyone removing the handler-level collectTileContainerRefIssues call would silently weaken validation — a code comment near the schema-level structure check pointing at the handler would be defensive.
  • Legacy duplicate container ids in stored docs: convertToExternalDashboard builds containerById last-write-wins (v2/utils/dashboards.ts:402-404). Self-heal logs orphan-ref drops but won't surface that the wrong container was matched if a legacy doc has duplicate ids. Acceptable since save paths now reject duplicates; pre-existing data only.
  • TileSchema.containerId/tabId tightened to .min(1) in common-utils/types.ts. The external API normalizes empty strings to absent on read, so this is safe for legacy docs going through the v2 API. Worth verifying no UI save path persists empty strings (a quick grep shows nothing obvious).

The split into structure-only + tile-ref helpers (validateDashboardContainersStructure / validateDashboardTileContainerRefs) plus the collectTileContainerRefIssues wrapper handles the PUT-omit-containers case cleanly. Caps on identifiers/arrays (DASHBOARD_CONTAINER_ID_MAX, DASHBOARD_MAX_CONTAINERS, DASHBOARD_MAX_TILES) are sensible.
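
The preserve-on-omit behavior described above can be sketched as follows. The helper name `collectTileRefIssues`, the `validatePut` wrapper, and the trimmed-down types are hypothetical stand-ins rather than the PR's code, and only the unknown-containerId check is shown to keep the example short.

```typescript
interface Container {
  id: string;
}
interface Tile {
  id: string;
  containerId?: string;
}

// Hypothetical stand-in for collectTileContainerRefIssues: one
// "path: message" string per tile whose containerId does not resolve.
function collectTileRefIssues(containers: Container[], tiles: Tile[]): string[] {
  const knownIds = new Set<string>();
  containers.forEach((c) => knownIds.add(c.id));

  const issues: string[] = [];
  tiles.forEach((tile, index) => {
    if (tile.containerId !== undefined && !knownIds.has(tile.containerId)) {
      issues.push(`tiles.${index}.containerId: unknown containerId "${tile.containerId}"`);
    }
  });
  return issues;
}

type PutValidation = { status: 200 } | { status: 400; message: string };

function validatePut(
  body: { tiles: Tile[]; containers?: Container[] },
  existing: { containers: Container[] },
): PutValidation {
  // Omitted containers are preserved, so tiles resolve against the stored set.
  // An explicit array (even an empty one) replaces the stored set.
  const effectiveContainers =
    body.containers !== undefined ? body.containers : existing.containers;

  const issues = collectTileRefIssues(effectiveContainers, body.tiles);
  return issues.length === 0
    ? { status: 200 }
    : { status: 400, message: issues.join("; ") };
}
```

With a stored container `a`, a PUT that sends only `tiles` homed in `a` validates, while the same tiles sent together with `containers: []` are rejected because the effective set is then empty.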

Out-of-scope items (MCP, heatmap, repeat, onClick) are tracked in linked issues per the PR description.

@github-actions
Contributor

github-actions Bot commented May 5, 2026

E2E Test Results

All tests passed • 165 passed • 3 skipped • 1162s

Status Count
✅ Passed 165
❌ Failed 0
⚠️ Flaky 4
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

alex-fedotyev added a commit that referenced this pull request May 5, 2026
After reading notes/principles/external-api-audit.md and walking
through the UI surface (useDashboardContainers.tsx, DashboardContainer.tsx,
GroupTabBar.tsx), three gaps were caught that the initial implementation
missed.

- OpenAPI parity: TileBase.containerId / tabId now declare minLength: 1
  to match the Zod schema's z.string().min(1).optional(). The Zod fix
  landed in the previous commit but the OpenAPI didn't pick up the
  constraint until JSDoc was updated and openapi.json regenerated.

- Test gap: explicit empty containers: [] now has its own round-trip
  test. The conversion normalizes [] back to absent on read (the
  existing length-guard makes this work), but the behavior wasn't
  asserted.

- Test gap: tile.containerId or tile.tabId set to an empty string is
  now explicitly rejected. Previously this would have failed
  cross-field validation only because no real container has id "",
  not because the tile-level rule fired.

UI invariants the API stays permissive about (auto-fixed by the UI
rather than rejected) are documented in the per-feature code map
under notes/repo-conventions/hyperdx/dashboards-containers-tabs.md
in the workspace.
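
The length guard mentioned in this commit message, which makes an explicit `containers: []` read back as an absent field, might look like the sketch below. `toExternal` and the trimmed-down document types are hypothetical; the real conversion is `convertToExternalDashboard`.

```typescript
interface Container {
  id: string;
  title: string;
}
interface StoredDashboard {
  id: string;
  name: string;
  containers?: Container[];
}
interface ExternalDashboard {
  id: string;
  name: string;
  containers?: Container[];
}

function toExternal(doc: StoredDashboard): ExternalDashboard {
  const external: ExternalDashboard = { id: doc.id, name: doc.name };
  // Length guard: a missing array and an empty array both leave the field
  // absent, so dashboards without the layer round-trip unchanged.
  if (doc.containers !== undefined && doc.containers.length > 0) {
    external.containers = doc.containers;
  }
  return external;
}
```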
@github-actions
Contributor

github-actions Bot commented May 6, 2026

Compound Engineering Review

Five specialist reviewers (TypeScript-quality, security, performance, architecture, simplicity) ran in parallel against the diff vs. base 2785597a. No P0s. Five P1s, mostly clustered around missing input caps, a missed projection cost, and an under-typed Mongoose update.

P1 — should resolve before merge

  • P1 packages/api/src/utils/zod.ts:423-424 + :474: tile.containerId/tabId have only min(1) (no .max) and externalDashboardTileListSchema has no .max on the array. With the 32MB body limit (api-app.ts:55) and tiles: Schema.Types.Mixed, a client can persist a huge tile list with multi-MB id strings that round-trip back through every GET → cap ids at DASHBOARD_CONTAINER_ID_MAX (256) to mirror DashboardContainerSchema, and add .max(N) (e.g. 500) to the tile list schema.
  • P1 packages/api/src/routers/external-api/v2/dashboards.ts:1398 — LIST endpoint now selects containers for every dashboard. Worst case ≈310KB × 200 dashboards = ~62MB JSON per LIST → drop containers from EXTERNAL_DASHBOARD_PROJECTION and require GET-by-id to read it (or paginate LIST).
  • P1 packages/api/src/routers/external-api/v2/dashboards.ts:1948: setPayload: Record<string, unknown> opts out of Mongoose's UpdateQuery<IDashboard> typing; a typo like setPayload.contianers = … compiles and silently no-ops → type as mongoose.UpdateQuery<IDashboard>['$set'] (or Partial<IDashboard>); the conditional-assign idiom still works.
  • P1 packages/common-utils/src/types.ts:1005-1015: DashboardSchema.containers.refine(...) only checks duplicate container ids; tab-uniqueness and tile→container/tab resolution live only in the API-side superRefine. Internal write paths (DashboardWithoutIdSchema) and any future MCP write tool will accept dashboards the external API rejects → lift the four invariants into a shared validateDashboardLayout(data) helper in common-utils and call from both DashboardSchema.superRefine and buildDashboardBodySchema.
  • P1 packages/api/src/routers/external-api/v2/utils/dashboards.ts:776 — Early return on hasDuplicateContainerId silences disjoint tile-level errors (unknown containerId, tabId requires containerId), forcing a two-step fix loop → drop the early return; if kept, gate only the tile→container resolution branch (the tabId requires containerId check is independent).
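
The `setPayload` typing point can be illustrated with a small sketch. `IDashboard` here is a hypothetical, trimmed-down stand-in for the Mongoose model interface, and `buildSetPayload` is an invented helper; only the idea of typing the payload as `Partial<IDashboard>` comes from the review.

```typescript
interface Container {
  id: string;
  title: string;
}

// Hypothetical, trimmed-down model shape.
interface IDashboard {
  name: string;
  tags: string[];
  containers: Container[];
}

function buildSetPayload(body: {
  name: string;
  tags?: string[];
  containers?: Container[];
}): Partial<IDashboard> {
  const setPayload: Partial<IDashboard> = { name: body.name };

  // Conditional-assign idiom: only touch fields the body actually sent.
  if (body.tags !== undefined) setPayload.tags = body.tags;
  if (body.containers !== undefined) setPayload.containers = body.containers;

  // With Record<string, unknown> the misspelling below would compile and
  // write a field nothing reads. With Partial<IDashboard> it is a type error:
  // setPayload.contianers = body.containers;

  return setPayload;
}
```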

P2 — worth addressing

  • P2 packages/common-utils/src/types.ts:899 — Tightening containerId/tabId to .min(1).optional() will reject any persisted tile with containerId: '' on next round-trip → run db.dashboards.find({"tiles.containerId":""}) before merge; if any rows exist, backfill or use .transform(s => s || undefined).
  • P2 packages/api/src/models/dashboard.ts:36: containers: Schema.Types.Array is free-form; defense-in-depth rests on Zod alone → declare a typed Mongoose subdoc mirroring DashboardContainerSchema, or add a Mongoose validator.
  • P2 packages/api/src/routers/external-api/v2/dashboards.ts:39: EXTERNAL_DASHBOARD_PROJECTION as const is not tied to IDashboard keys; dropping a model field leaves the projection silently stale → add satisfies mongoose.ProjectionType<IDashboard>.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:760: seenTabIds.add(tab.id) runs unconditionally, inconsistent with the container loop's else-branch placement; functionally identical but reads as a bug → move into the else.
  • P2 packages/api/src/utils/externalApi.ts:36-208 vs v2/utils/dashboards.ts:303-313, 490-516 — Two parallel external↔internal tile translators each got containerId/tabId plumbed through; future tile-layout fields will need both touched → extract pickTileLayoutFields(tile) returning {id, x, y, w, h, name, containerId?, tabId?} and call from both.
  • P2 packages/api/src/utils/externalApi.ts:206 — No test exercises the legacy series-format translator with containers/tabs → add one fixture in the legacy suite to lock parity.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:806: container.tabs?.find(t => t.id === tile.tabId) is O(T) per tile (T capped at 20 so OK now); since containerById is already built, attach a tabsById map per container during the first pass to make tile lookups O(1) → trivial restructure.
  • P2 packages/api/src/routers/external-api/v2/dashboards.ts:1191-1195, 1242-1246, 1293-1297: containers JSDoc duplicated verbatim across Dashboard/CreateDashboardRequest/UpdateDashboardRequest → extract a DashboardContainersField component and $ref it (mirrors what DashboardContainerTab already does).
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:786, 807 — Error messages echo client-supplied ids verbatim into 400 responses and Pino logs; with ids capped (see P1) the bloat risk reduces, but consider truncating echoed ids to ~64 chars regardless.
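
One possible shape for the suggested `pickTileLayoutFields` helper is sketched below. The field list follows the review comment; the generic signature and the choice to omit absent optional fields (rather than emit `undefined`) are assumptions.

```typescript
interface TileLayoutFields {
  id: string;
  x: number;
  y: number;
  w: number;
  h: number;
  name: string;
  containerId?: string;
  tabId?: string;
}

// Copies only the shared layout fields so both translators stay in sync.
function pickTileLayoutFields<T extends TileLayoutFields>(tile: T): TileLayoutFields {
  const { id, x, y, w, h, name, containerId, tabId } = tile;
  const picked: TileLayoutFields = { id, x, y, w, h, name };
  // Leave optional refs off the result entirely when they are not set.
  if (containerId !== undefined) picked.containerId = containerId;
  if (tabId !== undefined) picked.tabId = tabId;
  return picked;
}
```

A future layout field would then be added in this one place instead of in both translators.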

Verified non-issues

Reviewers run

kieran-typescript-reviewer, security-sentinel, performance-oracle, architecture-strategist, code-simplicity-reviewer.

alex-fedotyev pushed a commit that referenced this pull request May 6, 2026
Compound-review feedback on #2201:

- Tighten internal `TileSchema.containerId`/`tabId` to `min(1).optional()`
  so an empty string isn't a valid id (would otherwise silently pass
  `tile.containerId !== undefined` checks).
- Add `.max()` bounds on internal schemas: `id`/`title` capped at 256
  chars (`DashboardContainerSchema`, `DashboardContainerTabSchema`),
  `tabs` capped at `DASHBOARD_CONTAINER_MAX_TABS = 20`, and
  dashboard-level `containers` capped at `DASHBOARD_MAX_CONTAINERS =
  50`. The external API body schema now also caps `containers` so a
  client can't submit thousands of containers and trigger O(n*m) refine
  cost.
- Collapse the three sequential `containers.forEach` passes
  (container-id uniqueness, tab-id uniqueness, container-by-id map)
  into a single pass. The map is now built INSIDE the duplicate-id
  guard so duplicates aren't masked by last-write-wins. A new
  short-circuit returns before tile-resolution if container ids
  weren't unique, so the user fixes the container layer first instead
  of getting cascading "unknown containerId" errors on top.
- Extract `EXTERNAL_DASHBOARD_PROJECTION` constant in v2/dashboards.ts
  so the GET-list and GET-by-id projections stay in sync (this PR
  added `containers: 1` to both, the next field shouldn't have to).
- Add three missing test cases:
    - PUT-path duplicate-container-id rejection.
    - Tile with `containerId` set when the dashboard omits the
      `containers` field entirely (was previously a NPE-by-coincidence
      on `data.containers ?? []`).
    - Tile in a tabbed container that omits `tabId` (renders in the
      container shell, not under any tab); guards that the schema
      doesn't accidentally force `tabId` onto every tile in a tabbed
      container.

Cross-schema invariant lifting (the largest item the bot raised) is
deferred to a follow-up so this PR stays scoped to the external API
plus narrow internal-schema tightening.
@alex-fedotyev
Contributor Author

P2 items addressed in 574ee8e:

  • Tightened internal TileSchema.containerId/tabId to min(1).optional()
  • .max() bounds added: 256-char ids/titles, 20 tabs/container, 50 containers/dashboard. The external API caps containers so a client cannot trigger O(n*m) refine cost
  • Single-pass over containers (container-id uniqueness + per-container tab-id uniqueness + map build), short-circuit if container-id duplicates exist before attempting tile resolution
  • EXTERNAL_DASHBOARD_PROJECTION constant in v2/dashboards.ts so the GET-list and GET-by-id projections stay in sync
  • Three missing tests added: PUT-path duplicate-container-id rejection, tile with containerId set when containers is omitted entirely (was a NPE-by-coincidence), tile in a tabbed container that omits tabId

Cross-schema invariant lifting + applyCommonTileFields deferred to follow-up #2225 (the lift would tighten DashboardSchema validation against existing internal writers, which needs its own scoped review and a production-data scan).

@github-actions
Contributor

github-actions Bot commented May 7, 2026

Deep Review

✅ No critical issues found.

The PR has already been through several compound-engineering review rounds (commits "tighten container/tab invariants per compound-review", "deep-review followups", "preserve containers on PUT and self-heal orphan refs"). Cross-reviewer attention turned up no P0/P1 defects, but agent-surface parity and a concurrent-PUT silent-self-heal interaction are worth surfacing.

🟡 P2 — recommended

  • packages/api/src/routers/external-api/v2/dashboards.ts:2041 — Two concurrent PUTs (one omitting containers, one explicitly []) leave Mongo with orphan tile→container refs because there is no updatedAt/__v guard, and the read-time self-heal in convertTileToExternalChart silently strips the orphan refs from the response so the data loss is invisible to clients.
    • Fix: Add an optimistic-concurrency filter on the findOneAndUpdate so the loser of a race gets 409, or surface a X-HyperDX-Healed-Refs response header listing tile ids whose refs were dropped.
    • adversarial
  • packages/api/src/mcp/tools/dashboards/saveDashboard.ts:42 — The MCP hyperdx_save_dashboard inputSchema does not expose containers, and the MCP update path does not preserve them like the v2 PUT does, so an agent calling the tool on a dashboard that already has containers silently flattens the layout — even though hyperdx_get_dashboard returns the new fields, breaking read/write parity.
  • packages/cli/src/cli.tsx:984: hdx dashboards --json (advertised as LLM/agent output) hand-picks fields and drops containers plus per-tile containerId/tabId, so CLI-driven agents see a flat tile list and lose grouping. Not tracked in any follow-up.
    • Fix: Add containers to the JSON projection, include containerId/tabId in the per-tile object, and update the help-text schema block at lines 962-977.
    • agent-native
🔵 P3 nitpicks (6)
  • packages/common-utils/src/types.ts:1142 — Exported validateDashboardContainersConsistency has no production caller; only the unit-test file references it. The two underlying primitives (validateDashboardContainersStructure and validateDashboardTileContainerRefs) are what production uses, so the composite reads as load-bearing but is dead.
    • Fix: Delete the composite and retarget its test at the two primitives, or wire the composite into a real call site.
    • maintainability
  • packages/common-utils/src/types.ts:1172 — Block comment claims cross-tile validation is applied via validateDashboardContainersConsistency at the external-API request body schema, but the schema actually invokes the two primitives directly; the named helper is unused.
    • Fix: Update the comment to point at validateDashboardContainersStructure + validateDashboardTileContainerRefs, or wire the composite where the comment claims it lives.
    • maintainability
  • packages/common-utils/src/types.ts:1186: DashboardSchema.containers inlines a superRefine that duplicates the container-id uniqueness check already implemented in validateDashboardContainersStructure, so the two can drift silently and the inline copy uses a slightly different issue path ([i, 'id'] vs [...containersPath, i, 'id']).
    • Fix: Replace the inline superRefine with validateDashboardContainersStructure(containers, ctx, { containersPath: [] }) so internal and external schemas stay in lockstep.
    • maintainability, kieran-typescript
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:825: collectTileContainerRefIssues builds z.object({}).superRefine(...).safeParse({}) purely to obtain a RefinementCtx, then string-formats the resulting issues; the empty-schema parse contributes nothing and the inline (_, ctx) => shadows the lodash _ import at the top of the file.
    • Fix: Extract a non-Zod issue collector that both this helper and validateDashboardContainersConsistency route through, returning string[] directly.
    • maintainability, kieran-typescript
  • packages/api/src/routers/external-api/v2/dashboards.ts:1993 — On a PUT that omits containers, the handler falls back to existingDashboard.containers without running validateDashboardContainersStructure, and collectTileContainerRefIssues builds its lookup map via containers.map(c => [c.id, c]) (last-write-wins on duplicates); a legacy doc with duplicate container ids produces a misleading 'unknown tabId' 400 with no recovery path through the API.
    • Fix: Run the structure validator on effectiveContainers when sourced from the existing doc and short-circuit on hasDuplicateContainerId, surfacing a clear server-data-corruption error instead.
    • correctness, adversarial
  • packages/api/src/routers/external-api/v2/dashboards.ts:1707 — Handler-level tile-ref errors omit the Body validation failed: prefix that body-schema errors carry, so POST/PUT clients see two different error envelopes for what is conceptually one validation class.
    • Fix: Prefix the joined tile-ref message with Body validation failed: so the envelope is consistent across the schema and handler validation paths.
    • correctness
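
The optimistic-concurrency suggestion for the concurrent-PUT item can be illustrated with an in-memory stand-in. Nothing here is the PR's code: `InMemoryDashboards`, the `version` counter, and the status mapping are hypothetical, and a real implementation would express the same check as a filter on the database update.

```typescript
interface StoredDashboard {
  id: string;
  version: number; // hypothetical revision counter, playing the role of __v
  tileIds: string[];
}

type PutResult =
  | { status: 200; doc: StoredDashboard }
  | { status: 404 }
  | { status: 409 };

class InMemoryDashboards {
  private docs = new Map<string, StoredDashboard>();

  insert(doc: StoredDashboard): void {
    this.docs.set(doc.id, { ...doc });
  }

  // Stand-in for a filtered update: the write only applies when the stored
  // version equals the version the caller read earlier.
  put(id: string, expectedVersion: number, tileIds: string[]): PutResult {
    const current = this.docs.get(id);
    if (current === undefined) return { status: 404 };
    if (current.version !== expectedVersion) return { status: 409 };

    const next: StoredDashboard = { id, version: expectedVersion + 1, tileIds };
    this.docs.set(id, next);
    return { status: 200, doc: next };
  }
}

const store = new InMemoryDashboards();
store.insert({ id: "d1", version: 0, tileIds: ["t1"] });

// Both writers read version 0; only the first write applies.
const winner = store.put("d1", 0, ["t1", "t2"]);
const loser = store.put("d1", 0, []);
```

The second writer gets a conflict instead of silently overwriting the first, at the cost of requiring clients to send the version they read.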

Reviewers (10): correctness, testing, maintainability, project-standards, agent-native, learnings-researcher, security, api-contract, adversarial, kieran-typescript.

Testing gaps:

  • No test exercises a PUT with explicit containers: [] against a dashboard that already has containers — only POST with [] is covered, leaving the if (containers !== undefined) branch in the PUT handler untested for the non-empty→empty transition.
  • Legacy series-format translator (packages/api/src/utils/externalApi.ts) now passes through containerId/tabId, but no integration test exercises this path.
  • Same tab.id reused across two different containers (which the per-container uniqueness scope implies should be allowed) is not pinned by a test, so a future scope change would not be caught.
  • DASHBOARD_MAX_CONTAINERS = 50 and DASHBOARD_CONTAINER_MAX_TABS = 20 boundaries are unenforced by tests; only the 500-tile cap is exercised.
  • A tile in container A whose tabId matches a tab only in container B is not distinguished from the generic 'unknown tabId' case.
  • No test covers the concurrent-PUT race or the read-time self-heal observability (orphan refs persisted via direct DB mutation).

alex-fedotyev added a commit that referenced this pull request May 7, 2026
Deep-review feedback on #2201, mechanical items from the May 7 pass:

- Cap external tile `containerId`/`tabId` at 256 chars to mirror the
  internal `DashboardContainer` schema. The constant
  `DASHBOARD_CONTAINER_ID_MAX` is now exported from
  `@hyperdx/common-utils` so the external schema and the internal one
  pull from one source of truth.
- Cap a single dashboard payload at 500 tiles via the new
  `DASHBOARD_MAX_TILES` constant. Without the cap, an external API
  caller could push a payload tens of MB into Mongo in one request.
- Type the PUT setPayload as `Partial<IDashboard>` instead of
  `Record<string, unknown>` so a misnamed field fails at compile time.
- Treat empty-string `containerId`/`tabId` on legacy Mongo docs as
  absent on read so dashboards predating the containers feature still
  round-trip through the now-stricter external schema (which enforces
  `min(1)`). Added a regression test that mutates Mongo directly to
  simulate the legacy state.
- Replace `pick(externalTile, [...])` in `convertToInternalTileConfig`
  with explicit destructuring (mirroring the pattern in
  `convertTileToExternalChart`). The picked `name` was a stale top-
  level field on the resulting Tile (Tile has no top-level `name`);
  the rendered config still carries the name on `config.name`.
- Extract `validateDashboardContainersConsistency` into
  `@hyperdx/common-utils/dist/types` so the canonical schema and the
  external-API request body schema agree on what a valid
  `{containers, tiles}` payload is. The external body's `superRefine`
  now delegates to the helper.
- Drop the export on `DASHBOARD_CONTAINER_MAX_TABS` (used only by the
  schema definition next to it).
- OpenAPI now publishes matching `maxLength: 256` on container/tab
  ids, `maxItems: 20` on `DashboardContainer.tabs`, `maxItems: 50` on
  the request `containers` array, and `maxItems: 500` on the request
  `tiles` array. Regenerated `openapi.json`.

Boundary tests cover 256-char ids vs 257, 500-tile payloads vs 501,
and the legacy empty-string read path. Helper has standalone unit
tests in `v2/utils/__tests__/dashboards.test.ts`.
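
The boundary behavior described in this commit message can be sketched with a plain function. The constant names and values (256 and 500) come from the commit message; `checkCaps`, the trimmed-down tile type, and the message wording are hypothetical, and the real limits are enforced by the Zod schemas.

```typescript
const DASHBOARD_CONTAINER_ID_MAX = 256;
const DASHBOARD_MAX_TILES = 500;

interface TileRefs {
  containerId?: string;
  tabId?: string;
}

// Returns one issue per violated cap; an empty array means within bounds.
function checkCaps(tiles: TileRefs[]): string[] {
  const issues: string[] = [];

  if (tiles.length > DASHBOARD_MAX_TILES) {
    issues.push(`tiles: at most ${DASHBOARD_MAX_TILES} tiles allowed`);
  }

  tiles.forEach((tile, index) => {
    (["containerId", "tabId"] as const).forEach((field) => {
      const value = tile[field];
      if (value === undefined) return; // both refs are optional
      if (value.length < 1 || value.length > DASHBOARD_CONTAINER_ID_MAX) {
        issues.push(
          `tiles.${index}.${field}: length must be between 1 and ${DASHBOARD_CONTAINER_ID_MAX}`,
        );
      }
    });
  });

  return issues;
}
```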
alex-fedotyev and others added 5 commits May 8, 2026 00:19
…ontainer/tab refs (#2150)

PR #2015 added a dashboard organization layer (containers with optional
tabs, tiles join a container via containerId and a tab via tabId) but
the v2 external API was not updated to round-trip the new fields.
External integrations that build dashboards programmatically had no way
to use the new layer.

This wires the full set of fields through CREATE / GET / LIST / UPDATE.
Dashboards saved without containers round-trip unchanged (Mongoose
returns an empty array for missing containers, so the conversion only
emits the field when at least one container is present).

The body schema validates that:
- container ids are unique within the dashboard
- tab ids are unique within their container
- tile.containerId resolves to a real container
- tile.tabId resolves to a tab inside the tile's container
- tile.tabId requires tile.containerId to be set

Tests cover create + get round-trip, update round-trip with re-homing
tiles and dropping a container, optional-field defaults, all five
validation rejections, and the no-containers backward-compat case.

The conversion utilities also pick up containerId / tabId on the tile
itself: convertToInternalTileConfig now extends its pick list (the
specific bug that v2 of the plan missed), and the legacy series
translator in externalApi.ts also propagates the fields so both code
paths preserve them.

Refs #2150, follows up on #2015 (7665fbe).
Empty-string values previously passed per-field validation and only
hit the cross-field check (no container has id ''). Adding .min(1)
matches the shared DashboardContainerSchema pattern and surfaces a
field-level error instead.
@alex-fedotyev alex-fedotyev force-pushed the alex/HDX-2150-external-api-containers-tabs branch from 2e727a9 to 7b8b8ad Compare May 8, 2026 00:19
Deep-review feedback on #2201, P0/P1 + critical P2 items:

- Move tile-level container/tab ref resolution out of the request
  body schema and into the POST and PUT handlers. The schema-level
  superRefine called the helper with `data.containers ?? []`, which
  rejected any tile that referenced a real container when the PUT
  body omitted `containers` (the documented preserve-on-omit
  branch). The handler now resolves against the effective container
  set (body containers OR existing dashboard containers), so a PUT
  that updates only `tiles` and keeps a tile homed in a preserved
  container succeeds.
- Split `validateDashboardContainersConsistency` into a
  structure-only pass and a tile-ref-only pass; keep the composite
  for backward compatibility. The body schema now calls the
  structure-only helper; handlers run the tile-ref pass via a new
  `collectTileContainerRefIssues` wrapper that returns formatted
  `path: message` strings consistent with
  `validateRequestWithEnhancedErrors`.
- Self-heal orphan tile.containerId / tile.tabId on read. A doc may
  carry a tile pointing at a container that has since been removed,
  or a tab that no longer exists in its container; round-trip these
  as if the ref were absent so a subsequent PUT validates instead
  of failing schema with "Tile references unknown containerId".
  Each drop is logged with the dashboard id, tile id, and offending
  ref. The PUT projection now fetches `containers` from Mongo so
  the fallback can resolve.
- Document in OpenAPI that PUT does not support optimistic
  concurrency; concurrent PUTs may silently overwrite each other.
  Adding ETag-style concurrency would be a breaking change to the
  request shape and is left for a follow-up.

Tests:
- 4 integration tests at the request layer:
  - PUT that omits `containers` with tiles homed in preserved
    containers; expects 200 + containers preserved on response.
  - PUT that omits `containers` and references an unknown
    containerId; expects 400.
  - GET on a doc whose tile.containerId no longer matches; expects
    `containerId` and `tabId` absent on response.
  - GET on a doc whose tile.tabId no longer matches the container's
    tabs; expects `tabId` absent, `containerId` preserved.
- 3 unit tests on `collectTileContainerRefIssues` for the empty,
  unknown-containerId, and tabId-without-containerId paths.
- 4 unit tests on `convertToExternalDashboard` for each orphan-heal
  branch plus the full-resolution case.
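
The read-time self-heal described in this commit message might look roughly like the sketch below. `healTileRefs`, the trimmed-down types, the log wording, and the injected `log` callback are hypothetical. Treating empty strings as absent follows an earlier commit in this PR; dropping a `tabId` that arrives without a `containerId` is an assumption the source does not spell out.

```typescript
interface Tab {
  id: string;
}
interface Container {
  id: string;
  tabs?: Tab[];
}
interface Tile {
  id: string;
  containerId?: string;
  tabId?: string;
}

function healTileRefs(
  dashboardId: string,
  tile: Tile,
  containerById: Map<string, Container>,
  log: (message: string) => void,
): Tile {
  const healed: Tile = { id: tile.id };
  // Legacy empty strings are treated as absent.
  const containerId = tile.containerId || undefined;
  const tabId = tile.tabId || undefined;

  // Assumption: a tabId without a containerId cannot resolve, so it is dropped.
  if (containerId === undefined) return healed;

  const container = containerById.get(containerId);
  if (container === undefined) {
    // Orphan container ref: drop containerId and tabId together.
    log(`dashboard ${dashboardId} tile ${tile.id}: dropped unknown containerId "${containerId}"`);
    return healed;
  }
  healed.containerId = containerId;

  if (tabId === undefined) return healed;
  if ((container.tabs ?? []).some((t) => t.id === tabId)) {
    healed.tabId = tabId;
  } else {
    // Orphan tab ref: keep the container, drop only the tab.
    log(`dashboard ${dashboardId} tile ${tile.id}: dropped unknown tabId "${tabId}"`);
  }
  return healed;
}
```

Because the response never carries a ref that fails validation, a client can send the GET result straight back in a PUT.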

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

External Dashboards API: expose container + tab fields

2 participants