Skip to content

Ability to use multiple validators#352

Merged
reknih merged 5 commits into
LaurenzV:mainfrom
reknih:multi-validators
May 8, 2026
Merged

Ability to use multiple validators#352
reknih merged 5 commits into
LaurenzV:mainfrom
reknih:multi-validators

Conversation

@reknih
Copy link
Copy Markdown
Collaborator

@reknih reknih commented Apr 18, 2026

This PR adds the ability to combine validators, in particular, it will enable the sought-after combination PDF/A-2a and PDF/UA-1.

Closes #101

@reknih reknih force-pushed the multi-validators branch from 441b1a5 to b9ed113 Compare April 18, 2026 12:21
@reknih
Copy link
Copy Markdown
Collaborator Author

reknih commented Apr 18, 2026

For the veraPDF test failure, see veraPDF/veraPDF-library#1588

@reknih reknih force-pushed the multi-validators branch from b9ed113 to 85959ec Compare April 18, 2026 12:47
@LaurenzV
Copy link
Copy Markdown
Owner

Thanks! As it happens, I was thinking about taking this task on myself. I want to spend some more time thinking about whether the validators can be represented a bit more elegantly, I might either push to this PR or create a new one with the tests you added!

@saecki
Copy link
Copy Markdown
Collaborator

saecki commented Apr 18, 2026

Not sure if you meant using a bitset for the validators, but I think that would be nice :)

@reknih
Copy link
Copy Markdown
Collaborator Author

reknih commented Apr 19, 2026

I considered eagerly collecting all validators into a shared policy object to a) only iterate them once and b) encapsulate how their policies fold with regards to any/all and negations.

However, since in practice the validator vector will have at most 2 items, the iterations are not too bad. Having the policy folding in one place would be nice but I eschewed that in pursuit of a minimal change (or laziness).

I don't know how useful a bit field would be since (at least with how policies are expressed now) ORing them does not yield a meaningful result.

@LaurenzV
Copy link
Copy Markdown
Owner

I updated veraPDF in CI, so hopefully it should work now?

@laurmaedje laurmaedje requested a review from saecki April 20, 2026 07:20
@saecki
Copy link
Copy Markdown
Collaborator

saecki commented Apr 20, 2026

I don't know how useful a bit field would be since (at least with how policies are expressed now) ORing them does not yield a meaningful result.

The bitset would only be used internally to have a cheaply copyable type that can hold all standards. Though the proposed design in #355 from @LaurenzV would probably be nicer for this too.
The reason I'd want to have a cheaply copyable type would be, so that ValidationErrors could include the standard/validator that caused the failure. With your proposed ValidationPolicy struct, each policy would then hold this struct and the validation error could just copy it. With the current design, it would be constructed ad-hoc instead of doing the any/all iteration over validators, basically doing a filter_map().collect().

@reknih reknih force-pushed the multi-validators branch from 85959ec to 4521d5b Compare April 20, 2026 08:50
@reknih
Copy link
Copy Markdown
Collaborator Author

reknih commented May 4, 2026

I will attempt to build a ValidationPolicy approach over the course of the next ten days.

@reknih reknih force-pushed the multi-validators branch from 4521d5b to 8a9ec04 Compare May 6, 2026 14:46
@reknih
Copy link
Copy Markdown
Collaborator Author

reknih commented May 8, 2026

I have updated the PR with a new design merging the three approaches, with a heavy dose of #355. See the commit messages for a more in-depth design rationale.

Copy link
Copy Markdown
Collaborator

@saecki saecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I think having the offending validators will save a lot of people a whole lot of frustration.
And sorry for the all the back and forth.
Just some final nitpicks from my side

Comment thread crates/krilla/src/configure/validate.rs
Comment thread crates/krilla/src/configure/validate.rs Outdated
Comment thread crates/krilla/src/configure/mod.rs Outdated
Comment thread crates/krilla/src/configure/mod.rs Outdated
reknih added 2 commits May 8, 2026 16:26
This strategy is inspired by LaurenzV#355 and its reviews, https://gist.github.com/saecki/c37ed6c6e07c07e79d87a93aca78ef07 and https://gist.github.com/reknih/c54e9400ecaed250ef14e4c633c62422.

It eschews a vector of `Validator` in favor of a struct with one validator per standard (Like LaurenzV#355). It also centralizes all policy decisions involving only validators in the `Validators` struct (like the unified flags approach) and decisions involving both the version and the validators on `SerializeSettings`.

In order to minimize API churn when new validators are added, `Configuration` is now created through a `ConfigurationBuilder`. New standards will still result in a breaking change, since the Archival and Ua enums rely on being exhaustive, but as long as the consumer does not actually match them, they will not have to update their uses of Krilla.

I consolidated the prohibition calculation in a tuple match to make it more compact and maintainable.

I also enabled PDF 2.0 files without no validator to use Associated Files and removed self-contained XMP schemata for extension schemata from PDF/A-4.
`embedded_file_pdf_20`: PDF 2.0 allows the `AF` and `AFRelationship` keys. Before, it was only written for PDF/A-3 files. Reference `SerializeSettings::supports_associated_files`.
`validate_pdf_a4_f_with_embedded_file`, `validate_pdf_a4_full_example`, `validate_pdf_a4e_full_example`, `validate_pdf_a4f_full_example`: PDF/A-4 does not actually define the `pdfaSchema`, `pdfaProperty`, and `pdfaType` properties. Instead, ISO 19005-4 clause 6.7.2.3 posits that the XMP metadata should be described using RELAX NG (ISO 16684-2). We don't support that, but since it's only a recommendation, it's fine.
`validate_multi_validator_embedded_file_no_af` has been changed and renamed in order to reflect the correct understanding that PDF/A-3 amends the file with the `/AF` and `/AFRelationship` keys, even if other standards are in use, too.
@reknih reknih force-pushed the multi-validators branch from dda08bb to d0bbfd3 Compare May 8, 2026 14:26
@reknih reknih enabled auto-merge (squash) May 8, 2026 14:28
@reknih reknih disabled auto-merge May 8, 2026 14:29
@reknih reknih enabled auto-merge (squash) May 8, 2026 14:30
@reknih reknih merged commit d56d20d into LaurenzV:main May 8, 2026
6 checks passed
@reknih reknih deleted the multi-validators branch May 8, 2026 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support multiple validators at once

3 participants