feat: LLM check type support in contract JSON schema#2613

Draft
LaurenDebruyn wants to merge 11 commits into main from feat/llm-checks-and-remediation

LaurenDebruyn commented Mar 7, 2026

Summary

  • Adds JSON schema definition for the new llm check type in data contracts (prompt, threshold, references)
  • Adds references property to the llm check schema for reference dataset lookups during LLM validation
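
As a rough illustration of the two bullets above, the llm check definition might be shaped something like the sketch below. Only the property names `prompt`, `threshold`, and `references` come from this PR; the property types, the choice of `prompt` as the only required field, and the helper `validate_llm_check` are assumptions made for the example, and the real schema may differ.

```python
# Hypothetical sketch: only the names "prompt", "threshold", and "references"
# come from the PR description. All types and constraints are assumptions.
LLM_CHECK_SCHEMA = {
    "type": "object",
    "properties": {
        "prompt": {"type": "string"},
        "threshold": {"type": "object"},
        "references": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["prompt"],
    "additionalProperties": False,
}

# Maps the JSON Schema type names used above to Python types.
_JSON_TYPES = {"string": str, "object": dict, "array": list}


def validate_llm_check(check: dict) -> list[str]:
    """Return a list of problems; an empty list means the check conforms.

    Covers only the few keywords used in LLM_CHECK_SCHEMA, not full JSON Schema.
    """
    errors = []
    props = LLM_CHECK_SCHEMA["properties"]
    for key in LLM_CHECK_SCHEMA["required"]:
        if key not in check:
            errors.append(f"missing required property: {key}")
    for key, value in check.items():
        if key not in props:
            errors.append(f"unexpected property: {key}")
            continue
        type_name = props[key]["type"]
        if not isinstance(value, _JSON_TYPES[type_name]):
            errors.append(f"{key} must be of type {type_name}")
        elif type_name == "array":
            item_name = props[key]["items"]["type"]
            if not all(isinstance(item, _JSON_TYPES[item_name]) for item in value):
                errors.append(f"{key} items must be of type {item_name}")
    return errors
```

A full validator such as the `jsonschema` package would be the more usual choice; the hand-rolled function just keeps the sketch free of dependencies.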

Note: Remediation contract parsing was added and then removed — remediation is handled entirely by the remediation-cli (Gravity), not soda-core.

Companion PR: sodadata/soda-extensions#298 — feat/llm-checks-and-remediation (soda-llm package with tools, reference lookup, check implementation)

Test plan

  • JSON schema is valid JSON
  • CI green on matching branch with soda-extensions
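
The first item can be checked with a few lines of standard-library Python. This is only a sketch: the function name `is_valid_json` is hypothetical, and the PR does not show how the check was actually run.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Note that this only confirms the file is well-formed JSON, not that it is a valid JSON Schema document.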

🤖 Generated with Claude Code

Add core infrastructure for the new LLM check type:
- JSON schema definition for llm check in data contracts
- Default check name registration ("LLM validation passes")
- Remediation block parsing (sql/llm strategies with references and tools)
- Contract verification model extensions for remediation metadata
- Unit tests for remediation YAML parsing

Co-Authored-By: Claude Opus 4.6 <[email protected]>
LaurenDebruyn self-assigned this Mar 7, 2026
LaurenDebruyn and others added 6 commits March 9, 2026 18:59
…lasses

Remediation is not used by Soda Core at runtime, so there's no need to
parse its internal structure. Store it as an opaque dict for downstream
consumers to interpret.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
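
A minimal sketch of the "opaque dict" approach described in the commit above, assuming a simplified check model. The class `CheckYamlSketch` and the function `parse_check` are hypothetical and are not soda-core's actual classes.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CheckYamlSketch:
    """Hypothetical stand-in for a parsed check; not soda-core's actual class."""

    type: str
    # Kept exactly as it appeared in the YAML; its contents are never inspected here.
    remediation: Optional[dict[str, Any]] = None


def parse_check(raw: dict[str, Any]) -> CheckYamlSketch:
    """Copy known fields and pass `remediation` through untouched."""
    remediation = raw.get("remediation")
    if remediation is not None and not isinstance(remediation, dict):
        raise ValueError("remediation must be a mapping")
    return CheckYamlSketch(type=raw["type"], remediation=remediation)
```

The only validation is that the block is a mapping; interpreting its contents is left to downstream consumers.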
Copy the latest schema from soda-server which includes remediation,
reconciliation, diagnostics, failed_rows, check_attributes, and
updated descriptions with doc links.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Remediation must be properly parsed so it can be ingested by Soda Cloud
when contract verification results are pushed. Restore the structured
dataclasses for parsing and add remediation serialization to the check
result cloud dict.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Remediation is handled by the remediation-cli (Gravity), which parses
the YAML itself. Keeping parsing in soda-core added dead code that
nothing consumed at runtime.

Removed:
- RemediationYaml / RemediationStrategyYaml dataclasses and _parse_remediation()
- Remediation / RemediationStrategy / RemediationReference / RemediationTool from Check
- _build_remediation() in CheckImpl and _build_remediation_cloud_dict() in SodaCloud
- $defs/remediation and all 14 check-type references in the JSON schema
- test_remediation_parsing.py

Co-Authored-By: Claude Opus 4.6 <[email protected]>
LaurenDebruyn changed the title from "feat: LLM check type support and contract remediation parsing" to "feat: LLM check type support in contract JSON schema" Mar 10, 2026
LaurenDebruyn and others added 4 commits March 10, 2026 17:43
Rebuild JSON schema from main + only the LLM check type definition.
Removes the large formatting/description rewrite that was obscuring
the actual change in the PR diff.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The schema is not used by soda-core at runtime (json_schema_verifier.py
is fully commented out). Schema updates belong in the soda-webapp repo.

Co-Authored-By: Claude Opus 4.6 <[email protected]>