feat(ide/config): add unified config support to the static-analyzer server#923
Conversation
…erver As part of the work described in [Section 4 — Prerequisite: Unified Config File Support](https://docs.google.com/document/d/1L4rdpTwafiVxFn1HIlSw0f6cJ61GihDOzMFvZwuzlMw/edit?tab=t.0#heading=h.mr90lg23mmfd), the static-analyzer server needs to understand both the legacy `static-analysis.datadog.yml` format and the new unified `code-security.datadog.yaml` format (schema version `v1.0`), so the VS Code extension can drive mutations against whichever file the user has. The dd-source backend already supports `schema_version=v1` (unified) on `POST /api/v2/static-analysis/config/client`. This PR covers only the **local static-analyzer server** side — the VS Code extension changes will follow in a separate PR. The server had two issues that would prevent it from being usable as a unified-config host: 1. **Production parsing was locked to legacy.** `from_yaml_content` called `parse_any_schema_yaml` only under `cfg!(test)` and `file_legacy::parse_yaml` in release builds. A `code-security.datadog.yaml` sent to the server would silently be parsed as legacy content — or fail without a useful error — making every mutation endpoint incorrect for unified files. 2. **No format signal on mutation/validation requests.** Endpoints like `ignore-rule`, `add-rulesets`, and `can-onboard` had no way for the caller to declare what format the file is in. Without that signal, the server could not validate that the content matches the declared filename format (matching CLI behaviour), could not pick the right empty-file default when creating a new config, and could not produce the correct YAML serialization on write-back. 3. **Blank-content edge case.** When an existing but empty `code-security.datadog.yaml` (e.g. just created, content = `base64("\n")`) was sent to `add-rulesets`, `from_yaml_content`'s early-return for blank content always yielded a legacy default, which then failed `validate_format(Unified)` with `SchemaMismatch` even though there was nothing wrong. Following the design in the RFC above, format is signalled via an optional `schema_version` field added to the JSON body of existing endpoints. This avoids proliferating product-scoped routes: the IDE determines format from the filename, sends it as `"v1"` (unified) or omits it (legacy), and the server validates accordingly. **Backward compatibility is a hard requirement.** Old extensions that never send `schema_version` must continue to work unchanged. This is guaranteed by: - `schema_version` is `#[serde(default)] Option<String>` on every request struct — missing field deserializes as `None`. - `None` → `ConfigFormat::Legacy` everywhere. - No `#[serde(deny_unknown_fields)]` on request structs, so a new extension talking to an old binary simply has its `schema_version` field ignored. - **Un-gated dual-format parsing.** Removed the `cfg!(test)` guard; production now always calls `parse_any_schema_yaml`, which detects the actual schema variant and stores it in `WithVersion<Legacy, CodeSecurity>`. - **`ConfigFormat` enum + `detected_format()` accessor.** `ConfigFormat::Legacy` / `ConfigFormat::Unified` derived from the `WithVersion` discriminant. Used everywhere format validation is needed. - **`validate_format(declared)`** method returns `Err(SchemaMismatch)` when detected ≠ declared, mirroring CLI filename-strict behaviour. - **`default_unified()` constructor.** Returns an empty `WithVersion::CodeSecurity` config — used when `add-rulesets` is called with no existing content and `schema_version=v1`. - **`WithHint<T>` wrapper + new `TryFrom` impls.** Replaces the old format-unaware `TryFrom<String>` / `TryFrom<Base64String>`. `WithHint<T>(T, Option<ConfigFormat>)` carries the caller's declared format so `from_yaml_content` can pick the right empty default. Callers with `schema_version` pass `Some(format)`; deprecated/legacy routes pass `None`. - **Bug fix — blank content edge case.** `from_yaml_content` now accepts `Option<ConfigFormat> hint`; when content is blank it returns `default_unified()` for `Some(Unified)` and the legacy default otherwise. This prevents `add-rulesets` and `ignore-rule` from failing with `SchemaMismatch` when a `code-security.datadog.yaml` exists but is empty. - **`with_ignored_rule` and `with_added_rulesets`** updated to forward `declared_format` via `WithHint`. Added `#[serde(default)] schema_version: Option<String>` to: - `IgnoreRuleRequest` - `AddRuleSetsRequest` - `CanOnboardRequest` - `ParseConfigRequest` - **`post_parse_config`** — now accepts `schema_version`; constructs via `WithHint(configuration, Some(format))`, then calls `validate_format` — returns `400` on mismatch. - **`post_can_onboard_v2`** — now accepts `schema_version`; same parse + validate pattern; returns `500` on mismatch (existing error path the client already treats as blocking). - **`post_ignore_rule` / `add_rulesets`** — delegate to the updated `with_ignored_rule` / `with_added_rulesets` which now carry the format hint. - **Deprecated routes** (`get_can_onboard`, `get_rulesets`) updated to `WithHint(..., None)` — legacy-default, no behaviour change. Added `SchemaMismatch` variant (`code: 4`) to `ConfigFileError`. Extended the test suite with coverage for: - Strict parse: unified content + `schema_version=v1` → 200; legacy content + `schema_version=v1` → 400 (`SchemaMismatch`); unified content + no `schema_version` → 400. - Mutation (`add-rulesets`, `ignore-rule`) format validation. - `can-onboard` with `schema_version`. - Backward compat: legacy content + no `schema_version` → behaves as today. - Empty content + `schema_version=v1` → creates a unified default file. Added `Default` impl for `YamlConfigFile`, needed by `default_unified()`. `cargo test # 77 tests, all pass cargo fmt -- --check # clean cargo clippy -- -D warnings` The VS Code extension changes (unified-first discovery, sending `schema_version` in request bodies, binary version gate, etc.) are the next step and will be in a separate PR against the `datadog-vscode` repo. K9CODESEC-2169 K9CODESEC-2303
|
🎯 Code Coverage (details) 🔗 Commit SHA: 584828a | Docs | Datadog PR Page | Give us feedback! |
There was a problem hiding this comment.
Pull request overview
Updates the local datadog_static_analyzer_server IDE configuration pipeline to support both legacy static-analysis.datadog.* YAML and the unified code-security.datadog.yaml (v1.0) format, including strict format validation based on an IDE-declared schema_version.
Changes:
- Parse legacy and unified schemas in production (no longer test-gated), track detected format, and validate it against a declared format hint.
- Extend IDE server request models/endpoints to accept optional
schema_versionand propagate format hints through parsing/mutation paths (including correct defaults for blank content). - Add a
SchemaMismatcherror and expand IDE integration/unit tests to cover strict validation + unified mutations.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/static-analysis-kernel/src/config/file_v1.rs | Adds a Default unified config value (schema-version: v1.0) to support “create new unified file” flows. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/mod.rs | Adds integration tests for schema-version strict validation and unified mutation behavior. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/configuration_file/static_analysis_config_file.rs | Introduces ConfigFormat, format-aware parsing via WithHint, unified defaults, and schema mismatch validation. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/configuration_file/models.rs | Adds #[serde(default)] schema_version: Option<String> to relevant request structs for backward compatibility. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/configuration_file/error.rs | Adds SchemaMismatch error variant and assigns it a stable numeric code. |
| crates/bins/src/bin/datadog_static_analyzer_server/ide/configuration_file/endpoints.rs | Wires schema_version through routes, performs strict validation for parse/can-onboard, and updates deprecated route parsing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Fixed misleading doc comment on `get_can_onboard`: the `content` route parameter is base64-encoded YAML, not a filesystem path. - Updated `to_response_result` to return `400 Bad Request` for `SchemaMismatch` (a client input error) and keep `500 Internal Server Error` for genuine server failures (`Parser`, `Decoder`, `CommentReconciler`). - Updated the `ignore_rule_rejects_schema_version_mismatch` integration test to assert `BadRequest` instead of `InternalServerError`. K9CODESEC-2303
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…alue - Added "UNIFIED" as an accepted schema_version string for ConfigFormat::Unified alongside the legacy "v1" value, allowing newer extension versions to use the canonical name - Kept "v1" mapped to ConfigFormat::Unified for backward compatibility with older extension versions that already send it - Updated doc comments in static_analysis_config_file.rs and the add_rulesets helper to reflect both accepted values - Removed the incorrect #[deprecated(note = "IDEs stopped supporting this flow")] annotation from post_add_rulesets_v2 — the VS Code extension actively calls POST /ide/v2/config/add-rulesets for both legacy and unified config mutations K9CODESEC-2303
…ical - Reverted schema_version match arm back to only "v1" for ConfigFormat::Unified (removed the "UNIFIED" alias that was added in the prior commit) - Updated doc comment in ConfigFormat::From impl to reflect that "v1" is the sole canonical wire value, matching what the VS Code extension sends via CONFIG_FORMAT_INFO.UNIFIED.schemaVersion - Restored doc comments in default_unified and add_rulesets helper to reference "v1" instead of "UNIFIED" K9CODESEC-2303
- Changed SchemaMismatch error mapping from 400 Bad Request to 422 Unprocessable Entity in to_response_result helper and post_parse_config handler — the content is syntactically valid YAML but semantically wrong (format declared does not match format detected) - Updated integration tests in mod.rs to assert 422 instead of 400 for all three schema-mismatch scenarios: ignore-rule mismatch, parse-config mismatch, and add-rulesets mismatch - Updated doc comment on to_response_result to reflect the corrected HTTP status and rationale K9CODESEC-2303
| content_str = content_str.replace("\\", "/"); | ||
| } | ||
| let config = StaticAnalysisConfigFile::try_from(WithHint(Base64String(content_str), None)) | ||
| .map_err(|e| Custom(Status::InternalServerError, e))?; |
There was a problem hiding this comment.
Sounds like from the description that this 500 is intentional. Would you please add a comment explaining why? I would otherwise wonder why it's not 400 like elsewhere
There was a problem hiding this comment.
Good catch! You're right, it should be 400 and I've corrected it. Any error on this route is a parse failure on client-supplied content (bad base64 or invalid YAML), so 500 was semantically wrong. I initially kept the original 500 to avoid inadvertently breaking the Visual Studio extension (the only known caller of this deprecated route), but after checking its source I confirmed it uses a plain try/catch with no status-code inspection, so the change is safe. A comment has been added to the code explaining this. Thanks for flagging it!
|
|
||
| config | ||
| .validate_format(format) | ||
| .map_err(|e| Custom(Status::InternalServerError, e))?; |
There was a problem hiding this comment.
And maybe a comment why this isn't 422?
There was a problem hiding this comment.
Fixed, changed to 422 Unprocessable Entity, consistent with every other SchemaMismatch mapping in the file. I initially kept the original 500 behavior here as well out of caution, to avoid breaking older clients. After verifying that the Visual Studio extension (the only client calling the deprecated v1 route) does not inspect specific HTTP status codes, and that VS Code's canOnboard uses a generic try/catch with no status-code check either, we confirmed the change is safe to make.
| config | ||
| } | ||
| None => match declared_format { | ||
| ConfigFormat::Unified => Self::default_unified(), |
There was a problem hiding this comment.
Seeing this here made me wonder: should we change the impl Default for StaticAnalysisConfigFile to return a unified file by default and have "default_legacy" be the exception?
Only thinking if we end up having other places to create these structs, it might not be intuitive why the Default impl for a legacy format. Up to you
There was a problem hiding this comment.
Great point, done. Default now returns a unified file, and default_legacy() is the explicitly named fallback used only for the backward-compat paths (blank content with no declared format, and ConfigFormat::Legacy in with_added_rulesets).
- Changed get_can_onboard (v1 GET) parse failure from 500 to 400 Bad Request — errors here are always caused by client-supplied bad base64 or invalid YAML; confirmed safe by inspecting the Visual Studio extension (DataDog/datadog-visual-studio), which wraps the call in a plain try/catch and does not inspect the specific status code - Changed post_can_onboard_v2 schema mismatch from 500 to 422 Unprocessable Entity — content is syntactically valid but semantically wrong (declared format does not match detected format), consistent with all other SchemaMismatch mappings in the file - Added explanatory comment on get_can_onboard clarifying why 400 is correct and that SchemaMismatch cannot occur on this route (no format declared) - Updated integration tests in mod.rs to assert the corrected status codes for both endpoints K9CODESEC-2303
…n unified - Changed impl Default to produce a unified (CodeSecurity) file instead of legacy, reflecting that unified is the standard going forward and legacy is the backward-compat exception - Renamed default_unified() to default_legacy() — Default::default() now serves the unified case, making legacy the explicitly named fallback - Updated from_yaml_content blank-content path: None/Legacy hint now calls default_legacy(); Unified hint falls through to Default - Updated with_added_rulesets None-content path: Unified now calls Default; Legacy now calls default_legacy() - Updated WithHint doc comment to reflect that None hint routes blank content to default_legacy - Replaced try_from_returns_default_config_file_if_empty_string test with four focused tests that document all blank-content hint combinations (None, Legacy, Unified) and assert Default is unified K9CODESEC-2303
albertvaka
left a comment
There was a problem hiding this comment.
Stamping so it can be merged, I have not reviewed it though.
MarshalX
left a comment
There was a problem hiding this comment.
Stamping so it can be merged, I have not reviewed it though.
Context
As part of the work described in Section 4 — Prerequisite: Unified Config File Support, the static-analyzer server needs to understand both the legacy
static-analysis.datadog.ymlformat and the new unifiedcode-security.datadog.yamlformat (schema versionv1.0), so the VS Code extension can drive mutations against whichever file the user has.The dd-source backend already supports
schema_version=v1(unified) onPOST /api/v2/static-analysis/config/client. This PR covers only the local static-analyzer server side — the VS Code extension changes will follow in a separate PR.Problem
The server had two issues that would prevent it from being usable as a unified-config host:
Production parsing was locked to legacy.
from_yaml_contentcalledparse_any_schema_yamlonly undercfg!(test)andfile_legacy::parse_yamlin release builds. Acode-security.datadog.yamlsent to the server would silently be parsed as legacy content — or fail without a useful error — making every mutation endpoint incorrect for unified files.No format signal on mutation/validation requests. Endpoints like
ignore-rule,add-rulesets, andcan-onboardhad no way for the caller to declare what format the file is in. Without that signal, the server could not validate that the content matches the declared filename format (matching CLI behaviour), could not pick the right empty-file default when creating a new config, and could not produce the correct YAML serialization on write-back.Blank-content edge case. When an existing but empty
code-security.datadog.yaml(e.g. just created, content =base64("\n")) was sent toadd-rulesets,from_yaml_content's early-return for blank content always yielded a legacy default, which then failedvalidate_format(Unified)withSchemaMismatcheven though there was nothing wrong.Approach: Option A —
schema_versionfield on existing routes, no new routesFollowing the design in the RFC above, format is signalled via an optional
schema_versionfield added to the JSON body of existing endpoints. This avoids proliferating product-scoped routes: the IDE determines format from the filename, sends it as"v1"(unified) or omits it (legacy), and the server validates accordingly.Backward compatibility is a hard requirement. Old extensions that never send
schema_versionmust continue to work unchanged. This is guaranteed by:schema_versionis#[serde(default)] Option<String>on every request struct — missing field deserializes asNone.None→ConfigFormat::Legacyeverywhere.#[serde(deny_unknown_fields)]on request structs, so a new extension talking to an old binary simply has itsschema_versionfield ignored.What changed
static_analysis_config_file.rscfg!(test)guard; production now always callsparse_any_schema_yaml, which detects the actual schema variant and stores it inWithVersion<Legacy, CodeSecurity>.ConfigFormatenum +detected_format()accessor.ConfigFormat::Legacy/ConfigFormat::Unifiedderived from theWithVersiondiscriminant. Used everywhere format validation is needed.validate_format(declared)method returnsErr(SchemaMismatch)when detected ≠ declared, mirroring CLI filename-strict behaviour.default_unified()constructor. Returns an emptyWithVersion::CodeSecurityconfig — used whenadd-rulesetsis called with no existing content andschema_version=v1.WithHint<T>wrapper + newTryFromimpls. Replaces the old format-unawareTryFrom<String>/TryFrom<Base64String>.WithHint<T>(T, Option<ConfigFormat>)carries the caller's declared format sofrom_yaml_contentcan pick the right empty default. Callers withschema_versionpassSome(format); deprecated/legacy routes passNone.from_yaml_contentnow acceptsOption<ConfigFormat> hint; when content is blank it returnsdefault_unified()forSome(Unified)and the legacy default otherwise. This preventsadd-rulesetsandignore-rulefrom failing withSchemaMismatchwhen acode-security.datadog.yamlexists but is empty.with_ignored_ruleandwith_added_rulesetsupdated to forwarddeclared_formatviaWithHint.models.rsAdded
#[serde(default)] schema_version: Option<String>to:IgnoreRuleRequestAddRuleSetsRequestCanOnboardRequestParseConfigRequestendpoints.rspost_parse_config— now acceptsschema_version; constructs viaWithHint(configuration, Some(format)), then callsvalidate_format— returns400on mismatch.post_can_onboard_v2— now acceptsschema_version; same parse + validate pattern; returns500on mismatch (existing error path the client already treats as blocking).post_ignore_rule/add_rulesets— delegate to the updatedwith_ignored_rule/with_added_rulesetswhich now carry the format hint.get_can_onboard,get_rulesets) updated toWithHint(..., None)— legacy-default, no behaviour change.error.rsAdded
SchemaMismatchvariant (code: 4) toConfigFileError.mod.rs(IDE integration tests)Extended the test suite with coverage for:
schema_version=v1→ 200; legacy content +schema_version=v1→ 400 (SchemaMismatch); unified content + noschema_version→ 400.add-rulesets,ignore-rule) format validation.can-onboardwithschema_version.schema_version→ behaves as today.schema_version=v1→ creates a unified default file.file_v1.rsAdded
Defaultimpl forYamlConfigFile, needed bydefault_unified().Validation
cargo test->? 77 tests, all passcargo fmt -- --check-> cleancargo clippy -- -D warnings-> cleanRetrocompatibility: Tested this binary with the current extension and it works.
What's not in this PR
The VS Code extension changes (unified-first discovery, sending
schema_versionin request bodies, binary version gate, etc.) are the next step and will be in a separate PR against thedatadog-vscoderepo.K9CODESEC-2169
K9CODESEC-2303