
feat(mcp): add dataframe capabilities and kroki svg renderer #20

Open
elasticdotventures wants to merge 6 commits into master from feat/mcp-dataframe-sources-and-capabilities

elasticdotventures (Member) commented Feb 7, 2026

SUMMARY

This patch adds a new MCP-first dataframe/query layer and discoverability capabilities for additional data sources, centered on DataFusion (Parquet/Arrow/virtual datasets) and Prometheus.

Key additions:

  • Introduces typed source abstractions and source identifier handling for dataframe-backed workflows.
  • Adds dedicated MCP tools for:
    • query_datafusion
    • query_prometheus
    • list_source_capabilities (discoverability metadata for agent/tool routing)
  • Adds source adapter plumbing for DataFusion source types and Prometheus source capability metadata.
  • Adds virtual dataset bridge integration so chart generation and preview paths can safely resolve virtual dataframe-backed datasets.
  • Extends schemas/error handling and MCP app/tool wiring for the new query + capability endpoints.
  • Updates MCP/docs/tooling references and associated tests.
  • Adds Kroki SVG diagram rendering support via sidecar-backed API and a new Superset visualization plugin (kroki_svg).
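
To make the discoverability idea concrete, here is a minimal sketch of capability metadata that an agent could filter by source type. It is not the PR's implementation: the `SourceCapability` fields, the `CAPABILITIES` entries, and the `list_capabilities` helper are all hypothetical, and only the tool names `query_datafusion` and `query_prometheus` come from the summary above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceCapability:
    """Hypothetical metadata record; field names are assumptions."""

    source_type: str    # e.g. "parquet" or "prometheus"
    tool: str           # MCP tool an agent should route to
    supports_sql: bool  # assumed flag: SQL vs. a native query language


# Illustrative entries only; the real tool may expose different metadata.
CAPABILITIES = (
    SourceCapability("parquet", "query_datafusion", True),
    SourceCapability("arrow_ipc", "query_datafusion", True),
    SourceCapability("virtual_dataset", "query_datafusion", True),
    SourceCapability("prometheus", "query_prometheus", False),
)


def list_capabilities(source_type: Optional[str] = None) -> list:
    """Return all capabilities, or only those matching source_type."""
    selected = [
        cap
        for cap in CAPABILITIES
        if source_type is None or cap.source_type == source_type
    ]
    return [asdict(cap) for cap in selected]
```

Calling `list_capabilities()` returns every entry, while `list_capabilities("prometheus")` narrows the result to the one source an agent asked about, which mirrors the default and filtered outputs the unit tests are described as covering.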

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Not applicable (backend + MCP + plugin surface changes).

TESTING INSTRUCTIONS

  1. Run unit tests for new MCP capability and dataframe schema paths:
    • .venv-pytest2/bin/pytest tests/unit_tests/mcp_service/dataframe/tool/test_list_source_capabilities.py tests/unit_tests/mcp_service/dataframe/test_schemas.py -q
  2. Run adapter/bridge tests:
    • .venv-pytest2/bin/pytest tests/unit_tests/mcp_service/dataframe/tool/test_source_adapters.py tests/unit_tests/mcp_service/chart/test_virtual_dataset_bridge.py -q
  3. Run Kroki API unit tests:
    • .venv-pytest2/bin/pytest tests/unit_tests/views/test_kroki_api.py -q
  4. Verify full-suite startup behavior:
    • .venv-pytest2/bin/pytest -x
    • In this environment, integration tests fail early with missing DB objects (css_templates) and pending migrations.
  5. Pre-commit:
    • Executed pre-commit run --all-files per repository guidance.
    • In this environment, several hooks fail due to missing local toolchain components and dependencies (yarn, ruff, python shim, helm-docs, frontend node deps), plus oxlint/prettier runtime incompatibilities.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Copilot AI review requested due to automatic review settings February 7, 2026 09:00

Copilot AI left a comment


Pull request overview

This PR adds an MCP-first dataframe/query layer (DataFusion + Prometheus) with source capability discovery, and extends chart flows to safely resolve and preview in-memory “virtual” datasets.

Changes:

  • Introduces dataframe source adapters + list_source_capabilities for discoverability metadata.
  • Adds new MCP tools: query_datafusion (Parquet/Arrow IPC/virtual datasets) and query_prometheus (HTTP API → flattened rows, optional virtual dataset ingestion).
  • Extends chart validation + preview generation to support virtual:{uuid} dataset identifiers via a virtual dataset bridge.
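
The `virtual:{uuid}` identifier handling mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `identifiers.py`: the function names and the `VIRTUAL_PREFIX` constant are hypothetical, and only the `virtual:{uuid}` shape is taken from the PR text.

```python
import uuid

# Assumed from the `virtual:{uuid}` form described in the PR.
VIRTUAL_PREFIX = "virtual:"


def is_virtual_dataset_id(value: object) -> bool:
    """Return True when value is a string of the form virtual:{uuid}."""
    if not isinstance(value, str) or not value.startswith(VIRTUAL_PREFIX):
        return False
    try:
        uuid.UUID(value[len(VIRTUAL_PREFIX):])
    except ValueError:
        # The suffix is not a well-formed UUID.
        return False
    return True


def extract_virtual_uuid(value: str) -> str:
    """Return the canonical UUID string from a bare or prefixed ID.

    Raises ValueError when the UUID part is malformed.
    """
    raw = value[len(VIRTUAL_PREFIX):] if value.startswith(VIRTUAL_PREFIX) else value
    return str(uuid.UUID(raw))


def normalize_virtual_dataset_id(value: str) -> str:
    """Return the prefixed, canonical form of a virtual dataset ID."""
    return VIRTUAL_PREFIX + extract_virtual_uuid(value.strip())
```

Normalizing at the tool boundary means the registry only ever sees one spelling of an ID, so a chart request and a removal request that format the same UUID differently still resolve to the same dataset.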

Reviewed changes

Copilot reviewed 51 out of 51 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `tests/unit_tests/mcp_service/dataframe/tool/test_source_adapters.py` | Tests DataFusion adapter lookup/capabilities + virtual dataset ID normalization. |
| `tests/unit_tests/mcp_service/dataframe/tool/test_list_source_capabilities.py` | Tests MCP capability discovery tool default + filtered outputs. |
| `tests/unit_tests/mcp_service/dataframe/test_schemas.py` | Adds schema validation tests for new request/response models. |
| `tests/unit_tests/mcp_service/dataframe/test_registry.py` | Removes insecure “list all without credentials” assertions. |
| `tests/unit_tests/mcp_service/dataframe/test_identifiers.py` | Tests virtual dataset identifier helper functions. |
| `tests/unit_tests/mcp_service/chart/validation/test_dataset_validator.py` | Tests chart dataset validation against virtual dataset schemas. |
| `tests/unit_tests/mcp_service/chart/tool/test_generate_chart.py` | Adds coverage for virtual:{uuid} dataset IDs in chart requests. |
| `tests/unit_tests/mcp_service/chart/test_virtual_dataset_bridge.py` | Tests SQL generation for virtual-dataset chart preview queries. |
| `tests/unit_tests/mcp_service/chart/test_chart_utils.py` | Ensures explore links are suppressed for virtual datasets. |
| `superset/mcp_service/run_proxy.sh` | Updates proxy runner to prefer venv python, else uv run python. |
| `superset/mcp_service/explore/tool/generate_explore_link.py` | Returns a warning/error payload for virtual datasets (no explore links). |
| `superset/mcp_service/dataframe/tool/source_adapters.py` | Adds DataFusion source adapter registry + capability metadata. |
| `superset/mcp_service/dataframe/tool/remove_virtual_dataset.py` | Normalizes virtual dataset IDs and consolidates session/user resolution. |
| `superset/mcp_service/dataframe/tool/query_virtual_dataset.py` | Centralizes SQL normalization/validation + consistent row/column conversion. |
| `superset/mcp_service/dataframe/tool/query_prometheus.py` | New Prometheus query tool with optional virtual dataset ingestion. |
| `superset/mcp_service/dataframe/tool/query_datafusion.py` | New DataFusion query tool with adapter-based source registration. |
| `superset/mcp_service/dataframe/tool/list_virtual_datasets.py` | Hardens listing behavior when session/user context is missing. |
| `superset/mcp_service/dataframe/tool/list_source_capabilities.py` | New capability discovery tool combining DataFusion + Prometheus metadata. |
| `superset/mcp_service/dataframe/tool/ingest_dataframe.py` | Uses shared session/user resolution; updates usage guidance for IDs. |
| `superset/mcp_service/dataframe/tool/context.py` | Adds resolve_session_and_user helper for consistent identity handling. |
| `superset/mcp_service/dataframe/tool/common.py` | Adds shared SQL validation + Arrow table row/column conversion helpers. |
| `superset/mcp_service/dataframe/tool/__init__.py` | Exposes new dataframe tools via package exports. |
| `superset/mcp_service/dataframe/schemas.py` | Adds schemas for Prometheus/DataFusion queries and capability discovery. |
| `superset/mcp_service/dataframe/identifiers.py` | Adds helpers for virtual:{uuid} detection/normalization/extraction. |
| `superset/mcp_service/dataframe/__init__.py` | Re-exports new dataframe schema models from package root. |
| `superset/mcp_service/common/error_schemas.py` | Allows dataset context IDs to be `int` … |
| `superset/mcp_service/chart/virtual_dataset_bridge.py` | Adds query builder + execution helper for virtual dataset chart previews. |
| `superset/mcp_service/chart/validation/pipeline.py` | Threads session/user context into dataset validation calls. |
| `superset/mcp_service/chart/validation/dataset_validator.py` | Resolves virtual datasets from registry for schema-based validation. |
| `superset/mcp_service/chart/tool/generate_chart.py` | Adds virtual dataset resolution + preview path (no saved charts/URLs). |
| `superset/mcp_service/chart/preview_utils.py` | Extracts generate_preview_from_data helper for re-use. |
| `superset/mcp_service/chart/chart_utils.py` | Suppresses explore URLs for virtual datasets; attempts temporal typing via registry. |
| `superset/mcp_service/app.py` | Wires new MCP tools + updates default instructions text. |
| `superset/mcp_service/README.md` | Updates local setup guidance to use uv venv. |
| `superset-frontend/eslint-rules/eslint-plugin-theme-colors/index.js` | Improves rule metadata + reporting; avoids duplicate warnings safely. |
| `superset-frontend/eslint-rules/eslint-plugin-icons/no-fontawesome.test.js` | Enables JSX parsing in rule tester. |
| `superset-frontend/eslint-rules/eslint-plugin-icons/index.js` | Handles JSX className values in literals and expression containers. |
| `superset-frontend/eslint-rules/eslint-plugin-i18n-strings/index.js` | Adds rule metadata; improves argument scanning + JSX safety checks. |
| `docs/versioned_docs/version-6.0.0/installation/pypi.mdx` | Switches venv creation command to uv venv. |
| `docs/versioned_docs/version-6.0.0/contributing/howtos.mdx` | Updates debugpy invocation to uv run python .... |
| `docs/versioned_docs/version-6.0.0/contributing/development.mdx` | Updates several Python commands to uv equivalents. |
| `docs/scripts/generate-database-docs.mjs` | Runs helper Python via uv run python. |
| `docs/package.json` | Runs Python helper scripts via uv run python. |
| `docs/docs/installation/pypi.mdx` | Switches venv creation command to uv venv. |
| `docs/docs/contributing/howtos.mdx` | Updates debugpy invocation to uv run python .... |
| `docs/docs/contributing/development.mdx` | Updates several Python commands to uv equivalents. |
| `docs/developer_portal/contributing/howtos.md` | Switches venv creation command to uv venv. |
| `docs/developer_portal/contributing/development-setup.md` | Updates several Python commands to uv equivalents. |
| `docs/dataframe_subsystem_maturity_report.md` | Trims stub doc content to avoid duplication (link preservation). |
| `docker/docker-pytest-entrypoint.sh` | Uses uv run python for DB readiness/reset scripts. |
| `RELEASING/README.md` | Switches venv creation command to uv venv. |

Comment thread superset/mcp_service/chart/virtual_dataset_bridge.py Outdated
Comment on lines +99 to +103
        metric_labels = {str(k): str(v) for k, v in metric.items()}
        for point in series.get("values", []):
            if not isinstance(point, list | tuple) or len(point) != 2:
                continue
            rows.append(

Copilot AI Feb 7, 2026


isinstance(..., list | tuple) relies on a PEP 604 union, which isinstance accepts only on Python 3.10 and later; on older interpreters, evaluating list | tuple raises TypeError, which would break Prometheus result flattening. Prefer the portable isinstance(..., (list, tuple)) here, and update the similar check for result later in this function.

Member Author


@copilot open a new pull request to apply changes based on this feedback
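
For context, the check under discussion can be sketched in a self-contained form. The tuple form `isinstance(x, (list, tuple))` works on every Python 3 release, whereas passing `list | tuple` to `isinstance` requires Python 3.10 or later. The function name `flatten_matrix` and the output row shape are assumptions for illustration; only the `metric`/`values` layout follows the Prometheus HTTP API's matrix result format.

```python
from typing import Any, Dict, List


def flatten_matrix(result: Any) -> List[Dict[str, Any]]:
    """Flatten a Prometheus matrix result into one row per sample.

    Hypothetical helper: the row keys are assumed, not taken from the PR.
    """
    rows: List[Dict[str, Any]] = []
    # Tuple form of isinstance: portable across Python 3 versions.
    if not isinstance(result, (list, tuple)):
        return rows
    for series in result:
        if not isinstance(series, dict):
            continue
        metric = series.get("metric")
        if not isinstance(metric, dict):
            metric = {}
        metric_labels = {str(k): str(v) for k, v in metric.items()}
        for point in series.get("values", []):
            # Each sample is expected to be a [timestamp, value] pair.
            if not isinstance(point, (list, tuple)) or len(point) != 2:
                continue
            timestamp, value = point
            # Labels go first so the core fields win on a name clash.
            rows.append({**metric_labels, "timestamp": timestamp, "value": value})
    return rows
```

Malformed samples are skipped rather than raising, so one bad point does not discard the rest of a series.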

        numeric = float(value)
        if math.isfinite(numeric):
            return numeric
    except (TypeError, ValueError):

Copilot AI Feb 7, 2026


'except' clause does nothing but pass and there is no explanatory comment.
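
One way to address this is to keep the `pass` but document why the exception is expected. The sketch below assumes the surrounding function returns `None` for non-numeric or non-finite input; the function name `to_finite_float` and that fallback are assumptions, since the excerpt does not show the rest of the function.

```python
import math
from typing import Any, Optional


def to_finite_float(value: Any) -> Optional[float]:
    """Coerce value to a finite float, or return None.

    Hypothetical helper name; the None fallback is an assumption.
    """
    try:
        numeric = float(value)
        if math.isfinite(numeric):
            return numeric
    except (TypeError, ValueError):
        # Expected for non-numeric input such as None or "abc";
        # fall through to the None return below instead of failing.
        pass
    return None
```

Prometheus encodes special sample values as strings such as "NaN" and "+Inf"; `float()` parses those successfully, so the `math.isfinite` check, not the `except` clause, is what filters them out.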

Comment on lines +485 to +487
    raise RuntimeError(
        "Virtual dataset missing during preview generation"
    )

Copilot AI Feb 7, 2026


This statement is unreachable.

elasticdotventures changed the title from "feat(mcp): add dataframe source adapters and capability discovery" to "feat(mcp): add dataframe capabilities and kroki svg renderer" on Feb 7, 2026
elasticdotventures and others added 2 commits February 22, 2026 15:29
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>

Copilot AI commented Feb 22, 2026

@elasticdotventures I've opened a new pull request, #22, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits February 22, 2026 04:31
…list, tuple))

Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
fix: use valid isinstance tuple syntax for list/tuple checks in Prometheus result flattening
3 participants