JSON Schema default value deduction #2157

Open
stephenberry wants to merge 19 commits into main from jsonschema-defaults

@stephenberry stephenberry commented Dec 18, 2025

Summary

Adds opt-in automatic default extraction for JSON Schema generation. When enabled, Glaze reads each primitive member's default-constructed value and emits it as the schema's "default" keyword. Off by default — C++ default-initialization and JSON Schema's "default" keyword have overlapping but not identical semantics, so making this implicit for every reflectable struct would silently change the meaning of every generated schema.

Usage

Opt in via a custom opts struct:

struct my_opts : glz::opts { bool schema_auto_defaults = true; };
auto schema = glz::write_json_schema<MyType, my_opts{}>();

Without the flag, schema output is unchanged from main.

What changes from main

  • schema_auto_defaults opt: new check_schema_auto_defaults(Opts) helper detects the bool on the caller's opts. Default is false, so existing call sites (and snapshot tests) are unaffected.
  • Primitive extraction when enabled: constructs T{} at compile time and reads each member's value. Fields of type bool, integer, floating-point, and std::monostate are emitted as "default". std::string, containers, and other non-representable types are skipped — their storage would not outlive the transient consteval T{}.
  • Value-init filter: a member is emitted only if its value differs from val_t{}. int x{}, bool enabled{false}, and double x{0.0} produce no "default" (they are sentinel-init, not deliberate recommendations); flag{true}, count{42}, ratio{3.14} still do. Users who specifically want "default":0 can set it explicitly via glz::json_schema<T>.
  • Single extraction per type: T{} is constructed once; all member defaults are pulled in one index-pack pass and stored in an inline constexpr auto cached_defaults<T> variable template, so repeated accesses are free.
  • Explicit overrides always win: an explicit glz::json_schema<T> with .defaultValue = ... takes precedence regardless of the opt-in flag.
  • Integrates with main's schema refactor: plugs into main's post-refactor code paths (inline bool/string primitives, anyOf for nullables, prefixItems for tuples, reference-counted single-use inlining).
  • Docs: docs/json-schema.md gets a new "Automatic Default Extraction (opt-in)" section describing the flag, the extraction scope, the value-init filter, and the explicit-override precedence.
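
The precedence rules in the list above (explicit override first, extracted default only when opted in, otherwise nothing) can be sketched in isolation. This is a minimal illustration, not Glaze's implementation: `resolve_default_sketch` and its parameters are hypothetical names, and `int` stands in for the schema value type.

```cpp
#include <optional>

// Hypothetical helper: picks the value that would end up in "default".
// explicit_value: what an explicit glz::json_schema<T> entry supplied, if any.
// extracted:      what automatic extraction found, if anything.
// auto_defaults:  whether the caller opted in to automatic extraction.
constexpr std::optional<int> resolve_default_sketch(std::optional<int> explicit_value,
                                                    std::optional<int> extracted,
                                                    bool auto_defaults) {
    if (explicit_value) return explicit_value;  // explicit overrides always win
    if (auto_defaults) return extracted;        // extracted value only when opted in
    return std::nullopt;                        // flag off: no "default" emitted
}

static_assert(resolve_default_sketch(7, 42, true).value() == 7);
static_assert(resolve_default_sketch(7, 42, false).value() == 7);
static_assert(resolve_default_sketch(std::nullopt, 42, true).value() == 42);
static_assert(!resolve_default_sketch(std::nullopt, 42, false).has_value());
```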

Test plan

  • New auto_defaults, mixed_defaults, explicit_override, nested_defaults, value_init_defaults tests in jsonschema_test.cpp cover extraction, primitive filtering, explicit-override precedence, defaults on inlined single-use definitions, and the value-init filter — all go through the opt-in opts struct
  • A default-off test locks in the contract that no "default" key is emitted without opt-in
  • Pre-existing schema tests across example_json, exceptions_test, json_reflection_test, json_test, and jsonschema_test produce identical output to main
  • Full local test suite passes (92 tests) on AppleClang

@packit-as-a-service

One of the tests failed for eae6c29. @admin check logs None, packit dashboard https://dashboard.packit.dev/jobs/srpm/574911 and external service dashboard https://copr.fedorainfracloud.org/coprs/build/10267423/

Adapts the automatic JSON Schema default-value extraction feature
to main's refactored schema type and reference-counted inlining:

- defaults are now applied to prop.defaultValue instead of ref_val
  (schematic was merged into schema on main).
- Tests updated to main's new output shape (inlined bool/string
  primitives, nullable anyOf, prefixItems tuples) plus the
  "default" keyword where primitive member extraction applies.
- nested_defaults test rewritten: auto_defaults is now inlined
  into the referencing property rather than appearing in $defs.
- Tests that rely on consteval struct construction remain guarded
  behind GLZ_HAS_CONSTEXPR_STRING.
- jsonschema_test replaces glz::detail::schematic (removed) with
  glz::schema.

GCC rejects passing a non-constexpr local (tied/instance from the
enclosing consteval function) as an argument to another consteval
function, since the call itself must satisfy constant-expression
argument rules.

Declaring the helpers constexpr (not consteval) lets them be invoked
with references to locals from within the consteval caller — the
entire invocation still happens at compile time because the outer
extract_all_defaults_impl remains consteval.

Clang accepted the previous form; GCC did not. Test coverage is
unchanged; the helpers are still only ever evaluated at compile time.
MSVC failed to convert the immediately-invoked lambda's deduced
return type back to std::array<std::optional<schema::schema_any>, N>,
emitting C2440 "cannot convert from initializer list" at the call
site. The if-constexpr branches — one returning extract_all_defaults
<V>() (a consteval call), the other a braced std::array{} — confused
the deduction.

Moving the logic into a named consteval helper (defaults_array_for)
gives MSVC a single, unambiguous return type per instantiation.
GCC -Wunused-but-set-variable fired when the can_extract_defaults
branch was taken, because N was declared before the if constexpr
and only read in the else branch.

C++ default-initialization and JSON Schema "default" have overlapping
but not identical semantics — std::string{}, int{}, double{} are
sentinel-initialized values, not recommendations to downstream
consumers. Silently emitting them as the schema's "default" changed
the meaning of every generated schema and broke snapshot tests.

Now the extraction fires only when the caller opts in:

    struct my_opts : glz::opts { bool schema_auto_defaults = true; };
    glz::write_json_schema<T, my_opts{}>();

check_schema_auto_defaults(Opts) threads the bool through to
defaults_array_for<T, Enable>(), which returns an all-nullopt array
when disabled. Pre-existing schema tests revert to their main output
(no "default":… keys). The new auto_defaults / mixed_defaults /
explicit_override / nested_defaults tests use an opt-in opts struct,
and a new test locks in the default-off behavior.
C++ value-initialization yields a sentinel (0, 0.0, false) that rarely matches
JSON Schema's "default" semantics of "recommended value when omitted".
Emitting "default":0 on every value-initialized int or every
default-constructed variant alternative is semantic noise.

Filter: skip emission when the member's value equals val_t{}. Keeps
the useful cases (flag{true}, count{42}, ratio{3.14}); drops the
ambiguous ones (int x{}, bool enabled{false}, double x{0.0}). Users
who genuinely want "default":0 can set it explicitly via
glz::json_schema<T>.

Adds a value_init_defaults test locking in the filter behavior.
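
The filter amounts to a comparison against the value-initialized object. A minimal sketch, assuming arithmetic member types only; `filtered_default_sketch` and `settings_sketch` are hypothetical names, and the member values mirror the examples in the commit message.

```cpp
#include <optional>
#include <type_traits>

// Keeps a member's value only when it differs from V{}.
template <class V>
constexpr std::optional<V> filtered_default_sketch(const V& value) {
    static_assert(std::is_arithmetic_v<V>, "sketch covers bool, integer and floating-point only");
    if (value == V{}) return std::nullopt;  // sentinel: no "default" emitted
    return value;                           // deliberate value: emitted
}

struct settings_sketch {
    int x{};
    bool enabled{false};
    bool flag{true};
    int count{42};
    double ratio{3.14};
};

constexpr settings_sketch defaults{};

static_assert(!filtered_default_sketch(defaults.x).has_value());
static_assert(!filtered_default_sketch(defaults.enabled).has_value());
static_assert(filtered_default_sketch(defaults.flag).value() == true);
static_assert(filtered_default_sketch(defaults.count).value() == 42);
static_assert(filtered_default_sketch(defaults.ratio).value() == 3.14);
```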
The two overloads differed only in how members were accessed
(to_tie vs member-pointer). Combine into a single impl that
branches on glaze_object_t<T> vs reflectable<T> inside the
index-pack lambda — same scaffolding, one maintained copy.

glaze_object_t wins when a type satisfies both.

cached_defaults<T> is an inline constexpr variable template: its
initializer runs once per T, and any subsequent access at a call
site reads the cached value. Guarded by can_extract_defaults so it
is only ever instantiated for types the probe has already cleared.

The probe itself still calls extract_all_defaults_impl directly —
variable-template initializer failures are hard errors rather than
SFINAE (types like HoldsMapOfOptional satisfy the requires clause
but can't be default-constructed in a constant expression), so the
integral_constant<int, consteval_call> trick remains necessary.

Benefit: the single current call site now reads the cache instead
of re-invoking the extractor, and future multi-site access is free.
<version> is the C++20 header specifically designed for feature-test
macros and stdlib vendor identification — smaller surface than
<cstddef> and self-documenting about why we include it (we need
__GLIBCXX__ to materialize before the ABI check below).

- Reworded the struct comment: std::string_view *is* in schema_any;
  the real reason std::string defaults aren't extracted is that
  std::string isn't in is_schema_default_convertible, because its
  buffer wouldn't outlive the transient consteval T{}.
- Added "<< schema_str" to the expect() calls in the auto-default
  tests so failures surface the actual schema output for diagnosis.
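
A type filter of the kind described above could look like the trait below. This is a hypothetical approximation, not the definition of `is_schema_default_convertible`: it uses `std::is_arithmetic_v`, which also admits character types, and the real trait may be narrower.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical trait: true for types whose default can be copied out of
// a transient compile-time T{} (bool, integers, floats, monostate).
template <class V>
inline constexpr bool is_default_extractable_sketch =
    std::is_arithmetic_v<V> || std::is_same_v<V, std::monostate>;

static_assert(is_default_extractable_sketch<bool>);
static_assert(is_default_extractable_sketch<int>);
static_assert(is_default_extractable_sketch<double>);
static_assert(is_default_extractable_sketch<std::monostate>);
static_assert(!is_default_extractable_sketch<std::string>);       // skipped
static_assert(!is_default_extractable_sketch<std::vector<int>>);  // skipped
```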
Adds a section covering the opt-in flag, which types get extracted
(primitives only — bool/integers/floats/monostate), the value-init
filter (int x{} / bool{false} / double{} produce no "default"), and
the explicit-override escape hatch for users who want "default":0.
Development

Successfully merging this pull request may close these issues.

Can JSON schema defaults pull from the defaults on the struct if not otherwise specified?

1 participant