feat(mcp): add dataframe capabilities and kroki svg renderer #20
elasticdotventures wants to merge 6 commits into master
Pull request overview
This PR adds an MCP-first dataframe/query layer (DataFusion + Prometheus) with source capability discovery, and extends chart flows to safely resolve and preview in-memory “virtual” datasets.
Changes:
- Introduces dataframe source adapters + `list_source_capabilities` for discoverability metadata.
- Adds new MCP tools: `query_datafusion` (Parquet/Arrow IPC/virtual datasets) and `query_prometheus` (HTTP API → flattened rows, optional virtual dataset ingestion).
- Extends chart validation + preview generation to support `virtual:{uuid}` dataset identifiers via a virtual dataset bridge.
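To make the `virtual:{uuid}` identifier format concrete, here is a minimal sketch of detection and normalization helpers. Only the `virtual:` prefix format comes from this PR; the function names `is_virtual_dataset_id` and `normalize_virtual_dataset_id` are hypothetical and are not taken from `superset/mcp_service/dataframe/identifiers.py`, whose actual API may differ.

```python
import uuid

VIRTUAL_PREFIX = "virtual:"  # prefix format as described in the PR overview


def is_virtual_dataset_id(value: object) -> bool:
    """Return True when value looks like 'virtual:{uuid}' (hypothetical helper)."""
    if not isinstance(value, str) or not value.startswith(VIRTUAL_PREFIX):
        return False
    try:
        uuid.UUID(value[len(VIRTUAL_PREFIX):])
    except ValueError:
        # The suffix is not a well-formed UUID, so this is not a virtual ID.
        return False
    return True


def normalize_virtual_dataset_id(value: str) -> str:
    """Accept a bare UUID or a prefixed ID; return the canonical prefixed form."""
    raw = value[len(VIRTUAL_PREFIX):] if value.startswith(VIRTUAL_PREFIX) else value
    # uuid.UUID raises ValueError on malformed input; str() lowercases the hex.
    return f"{VIRTUAL_PREFIX}{uuid.UUID(raw)}"
```

Under these assumptions, a chart request could pass either a bare UUID or a prefixed ID, and downstream code would only ever see the canonical prefixed form.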
Reviewed changes
Copilot reviewed 51 out of 51 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/unit_tests/mcp_service/dataframe/tool/test_source_adapters.py | Tests DataFusion adapter lookup/capabilities + virtual dataset ID normalization. |
| tests/unit_tests/mcp_service/dataframe/tool/test_list_source_capabilities.py | Tests MCP capability discovery tool default + filtered outputs. |
| tests/unit_tests/mcp_service/dataframe/test_schemas.py | Adds schema validation tests for new request/response models. |
| tests/unit_tests/mcp_service/dataframe/test_registry.py | Removes insecure “list all without credentials” assertions. |
| tests/unit_tests/mcp_service/dataframe/test_identifiers.py | Tests virtual dataset identifier helper functions. |
| tests/unit_tests/mcp_service/chart/validation/test_dataset_validator.py | Tests chart dataset validation against virtual dataset schemas. |
| tests/unit_tests/mcp_service/chart/tool/test_generate_chart.py | Adds coverage for virtual:{uuid} dataset IDs in chart requests. |
| tests/unit_tests/mcp_service/chart/test_virtual_dataset_bridge.py | Tests SQL generation for virtual-dataset chart preview queries. |
| tests/unit_tests/mcp_service/chart/test_chart_utils.py | Ensures explore links are suppressed for virtual datasets. |
| superset/mcp_service/run_proxy.sh | Updates proxy runner to prefer venv python, else uv run python. |
| superset/mcp_service/explore/tool/generate_explore_link.py | Returns a warning/error payload for virtual datasets (no explore links). |
| superset/mcp_service/dataframe/tool/source_adapters.py | Adds DataFusion source adapter registry + capability metadata. |
| superset/mcp_service/dataframe/tool/remove_virtual_dataset.py | Normalizes virtual dataset IDs and consolidates session/user resolution. |
| superset/mcp_service/dataframe/tool/query_virtual_dataset.py | Centralizes SQL normalization/validation + consistent row/column conversion. |
| superset/mcp_service/dataframe/tool/query_prometheus.py | New Prometheus query tool with optional virtual dataset ingestion. |
| superset/mcp_service/dataframe/tool/query_datafusion.py | New DataFusion query tool with adapter-based source registration. |
| superset/mcp_service/dataframe/tool/list_virtual_datasets.py | Hardens listing behavior when session/user context is missing. |
| superset/mcp_service/dataframe/tool/list_source_capabilities.py | New capability discovery tool combining DataFusion + Prometheus metadata. |
| superset/mcp_service/dataframe/tool/ingest_dataframe.py | Uses shared session/user resolution; updates usage guidance for IDs. |
| superset/mcp_service/dataframe/tool/context.py | Adds resolve_session_and_user helper for consistent identity handling. |
| superset/mcp_service/dataframe/tool/common.py | Adds shared SQL validation + Arrow table row/column conversion helpers. |
| superset/mcp_service/dataframe/tool/__init__.py | Exposes new dataframe tools via package exports. |
| superset/mcp_service/dataframe/schemas.py | Adds schemas for Prometheus/DataFusion queries and capability discovery. |
| superset/mcp_service/dataframe/identifiers.py | Adds helpers for virtual:{uuid} detection/normalization/extraction. |
| superset/mcp_service/dataframe/__init__.py | Re-exports new dataframe schema models from package root. |
| superset/mcp_service/common/error_schemas.py | Allows dataset context IDs to be `int |
| superset/mcp_service/chart/virtual_dataset_bridge.py | Adds query builder + execution helper for virtual dataset chart previews. |
| superset/mcp_service/chart/validation/pipeline.py | Threads session/user context into dataset validation calls. |
| superset/mcp_service/chart/validation/dataset_validator.py | Resolves virtual datasets from registry for schema-based validation. |
| superset/mcp_service/chart/tool/generate_chart.py | Adds virtual dataset resolution + preview path (no saved charts/URLs). |
| superset/mcp_service/chart/preview_utils.py | Extracts generate_preview_from_data helper for re-use. |
| superset/mcp_service/chart/chart_utils.py | Suppresses explore URLs for virtual datasets; attempts temporal typing via registry. |
| superset/mcp_service/app.py | Wires new MCP tools + updates default instructions text. |
| superset/mcp_service/README.md | Updates local setup guidance to use uv venv. |
| superset-frontend/eslint-rules/eslint-plugin-theme-colors/index.js | Improves rule metadata + reporting; avoids duplicate warnings safely. |
| superset-frontend/eslint-rules/eslint-plugin-icons/no-fontawesome.test.js | Enables JSX parsing in rule tester. |
| superset-frontend/eslint-rules/eslint-plugin-icons/index.js | Handles JSX className values in literals and expression containers. |
| superset-frontend/eslint-rules/eslint-plugin-i18n-strings/index.js | Adds rule metadata; improves argument scanning + JSX safety checks. |
| docs/versioned_docs/version-6.0.0/installation/pypi.mdx | Switches venv creation command to uv venv. |
| docs/versioned_docs/version-6.0.0/contributing/howtos.mdx | Updates debugpy invocation to uv run python .... |
| docs/versioned_docs/version-6.0.0/contributing/development.mdx | Updates several Python commands to uv equivalents. |
| docs/scripts/generate-database-docs.mjs | Runs helper Python via uv run python. |
| docs/package.json | Runs Python helper scripts via uv run python. |
| docs/docs/installation/pypi.mdx | Switches venv creation command to uv venv. |
| docs/docs/contributing/howtos.mdx | Updates debugpy invocation to uv run python .... |
| docs/docs/contributing/development.mdx | Updates several Python commands to uv equivalents. |
| docs/developer_portal/contributing/howtos.md | Switches venv creation command to uv venv. |
| docs/developer_portal/contributing/development-setup.md | Updates several Python commands to uv equivalents. |
| docs/dataframe_subsystem_maturity_report.md | Trims stub doc content to avoid duplication (link preservation). |
| docker/docker-pytest-entrypoint.sh | Uses uv run python for DB readiness/reset scripts. |
| RELEASING/README.md | Switches venv creation command to uv venv. |

    metric_labels = {str(k): str(v) for k, v in metric.items()}
    for point in series.get("values", []):
        if not isinstance(point, list | tuple) or len(point) != 2:
            continue
        rows.append(

`isinstance(..., list | tuple)` depends on PEP 604 union types, which `isinstance` accepts only on Python 3.10 and later; on older interpreters the expression `list | tuple` itself raises `TypeError`, which would break Prometheus result flattening. Replace these checks with `isinstance(..., (list, tuple))`, which behaves the same on every version (also update the similar check for `result` later in this function).
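The suggested change can be sketched as follows. This is an illustrative flattening routine under stated assumptions, not the code in `query_prometheus.py`: the function name `flatten_matrix` and the output keys `timestamp` and `value` are hypothetical, while the input shape (a list of series, each with a `metric` label map and a `values` list of `[timestamp, value]` pairs) follows the Prometheus HTTP API range-query response.

```python
from __future__ import annotations

from typing import Any


def flatten_matrix(result: Any) -> list[dict[str, Any]]:
    """Flatten a Prometheus 'matrix' result into one row per sample (sketch)."""
    rows: list[dict[str, Any]] = []
    # Tuple-of-types form: accepted by isinstance on every Python 3 version.
    if not isinstance(result, (list, tuple)):
        return rows
    for series in result:
        if not isinstance(series, dict):
            continue
        metric = series.get("metric") or {}
        metric_labels = {str(k): str(v) for k, v in metric.items()}
        for point in series.get("values", []):
            if not isinstance(point, (list, tuple)) or len(point) != 2:
                continue  # skip malformed samples instead of failing the query
            timestamp, value = point
            # Labels first, so a label named "value" cannot shadow the sample.
            rows.append({**metric_labels, "timestamp": timestamp, "value": value})
    return rows
```

Malformed samples are skipped rather than raised on, which mirrors the `continue` in the reviewed snippet.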
@copilot open a new pull request to apply changes based on this feedback

        numeric = float(value)
        if math.isfinite(numeric):
            return numeric
    except (TypeError, ValueError):

'except' clause does nothing but pass and there is no explanatory comment.
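One way to address this is to keep the `pass` but say why swallowing the exception is intentional. The sketch below assumes the surrounding function coerces a sample value to a finite float and returns `None` otherwise; the name `to_finite_float` and the `None` fallback are assumptions, since the reviewed snippet does not show the rest of the function.

```python
import math
from typing import Optional


def to_finite_float(value: object) -> Optional[float]:
    """Return value as a finite float, or None when it is unusable (sketch)."""
    try:
        numeric = float(value)  # type: ignore[arg-type]
        if math.isfinite(numeric):
            return numeric
    except (TypeError, ValueError):
        # Non-numeric input (for example None or "abc") is expected here;
        # fall through and report "no usable value" instead of raising.
        pass
    return None
```

Prometheus encodes sample values as strings, including `"NaN"` and `"+Inf"`; `float()` parses those without error, so the `math.isfinite` check is what filters them out.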

    raise RuntimeError(
        "Virtual dataset missing during preview generation"
    )

This statement is unreachable.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>
@elasticdotventures I've opened a new pull request, #22, to work on those changes. Once the pull request is ready, I'll request review from you.
…list, tuple)) Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
fix: use valid isinstance tuple syntax for list/tuple checks in Prometheus result flattening
SUMMARY
This patch adds a new MCP-first dataframe/query layer and discoverability capabilities for additional data sources, centered on DataFusion (Parquet/Arrow/virtual datasets) and Prometheus.
Key additions:
- `query_datafusion`
- `query_prometheus`
- `list_source_capabilities` (discoverability metadata for agent/tool routing)
- Kroki SVG renderer (`kroki_svg`).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Not applicable (backend + MCP + plugin surface changes).
TESTING INSTRUCTIONS
- `.venv-pytest2/bin/pytest tests/unit_tests/mcp_service/dataframe/tool/test_list_source_capabilities.py tests/unit_tests/mcp_service/dataframe/test_schemas.py -q`
- `.venv-pytest2/bin/pytest tests/unit_tests/mcp_service/dataframe/tool/test_source_adapters.py tests/unit_tests/mcp_service/chart/test_virtual_dataset_bridge.py -q`
- `.venv-pytest2/bin/pytest tests/unit_tests/views/test_kroki_api.py -q`
- `.venv-pytest2/bin/pytest -x` […] `css_templates`) and pending migrations.
- `pre-commit run --all-files` per repository guidance. […] `yarn`, `ruff`, `python` shim, `helm-docs`, frontend node deps), plus oxlint/prettier runtime incompatibilities.

ADDITIONAL INFORMATION