Check that features required by UA-1 are available in the current PDF version #278

Open
reknih wants to merge 1 commit into LaurenzV:main from reknih:raise-ua1-requirement

@reknih
Collaborator

@reknih reknih commented Oct 8, 2025

This PR checks that three features are available in the current PDF version in circumstances in which PDF/UA-1 requires them. If they are needed but unavailable, a validation error is raised.

An earlier implementation of this PR just forbade export to these lower versions entirely.

@LaurenzV
Owner

LaurenzV commented Oct 8, 2025

Hmm but....

image

@LaurenzV
Owner

LaurenzV commented Oct 8, 2025

I guess I'm fine if it makes sense, but then it seems weird to me why they would say it works with all versions.

@laurmaedje
Collaborator

Does not block Typst 0.14.

@reknih
Collaborator Author

reknih commented Oct 8, 2025

First, there is the /Suspects key:

Files claiming conformance with this International Standard shall have a Suspects value of false (ISO 32000-1:2008, Table 321).

This key was introduced in PDF 1.6. Of course, its default value is false, so to comply, it can be omitted. However, veraPDF interprets the rule as a mandatory presence of the key. We can decide that we interpret the spec differently and allow export from earlier versions without the key due to the default.

Structure elements of type TH should have a Scope attribute. If the table’s structure is not determinable via Headers and IDs, then structure elements of type TH shall have a Scope attribute.

The /Scope attribute was introduced in PDF 1.5. I would be uncomfortable with just omitting it and counting on the table structure being deducible from headers. Hence, when the /Scope attribute is used in PDF/UA-1, IMO krilla should require PDF 1.5 or higher. Same for documents containing annotations: They should have the PDF 1.5 /Tabs key set.

Running headers and footers shall be identified as Pagination artifacts and shall be classified as either Header or Footer subtypes as per ISO 32000-1:2008, 14.8.2.2.2, Table 330.

This clause says that when headers or footers are present, the appropriate subtype must be used (introduced with PDF 1.7). If a document does not contain these, we can export it without issues. However, if it does (a page number is enough to trigger this in Typst), 1.7 it is.

Hence, I see the following alternatives:

  • Raise the minimum to PDF 1.7 (current implementation)
  • Introduce one new validation error for headers and footers in PDF 1.6 and below and use that as the minimum (to pass veraPDF)
  • Leave the minimum at PDF 1.4 and introduce one or multiple validation errors when using required features from later versions

@LaurenzV
Owner

However, veraPDF interprets the rule as a mandatory presence of the key.

Are you sure? I just tried exporting a 1.4 document with Typst without the Suspects key, and it still validates with veraPDF.

@LaurenzV
Owner

The /Scope attribute was introduced in PDF 1.5. I would be uncomfortable with just omitting it and counting on the table structure being deducible from headers. Hence, when the /Scope attribute is used in PDF/UA-1, IMO krilla should require PDF 1.5 or higher. Same for documents containing annotations: They should have the PDF 1.5 /Tabs key set.

Hmm, unless there really are validators that trip up in this case, I don't really see why we should forbid this. If certain attributes are only permitted starting from a specific PDF version, then just omitting them seems fine to me. The PDF is still accessible after all, just with a little less information. So since the UA-1 specification explicitly mentions that versions prior to 1.7 are also permissible, I wouldn't consider this an issue. Maybe worth asking for clarification in https://github.com/pdf-association/pdf-issues?

@reknih
Collaborator Author

reknih commented Oct 15, 2025

Alright, I gave it a few checks:

  • I was wrong about the suspects key, it is okay to omit it. A PDF 1.4 file without it passes.

  • A document with a link in PDF 1.4 fails veraPDF. It violates rule 7.18.3-1.

  • A document with a header or footer does not fail veraPDF because veraPDF has no programmatic way to detect it. PAC is more ready to apply heuristics and may fail, but I am too lazy to start up a Windows machine to check.

  • It is possible to create a document with a table with a scoped header cell that fails veraPDF check 7.5-1. For example, consider the following Typst document:

    #set document(title: "Playground")
    #table(
      columns: 3,
      table.header([Foo], [Bar], [Baz]),
      [1], [2], [3],
      pdf.header-cell(scope: "row")[bing], [5], [6],
      [7], [8], [9],
    )

    Compile this with typst c --features a11y-extras --pdf-standard 1.4,ua-1 file.typ and you'll get the failure.

I would like to redo and rename the PR to raise validation errors in these cases:

  • Writing a table header cell that is either outside of a THead or has /Scope (Row) in PDF 1.4
  • Writing a link/annotation in PDF 1.4
  • Writing a Pagination artifact with subtype Header or Footer in PDF 1.6 and below

That way, documents not using these features can still output compliant UA-1 files and we make sure not to write invalid files.
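
The three cases above amount to a lookup from a feature to the first PDF version that supports it. Below is a minimal sketch of that idea, not the code in this PR: `PdfVersion` and `PdfFeature` are redefined locally, the variant name `TableHeaderScope` and the functions `minimum_version` and `check_feature` are hypothetical, and the version numbers are the ones stated in this thread.

```rust
/// PDF versions relevant to this discussion, ordered oldest to newest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PdfVersion {
    Pdf14,
    Pdf15,
    Pdf16,
    Pdf17,
}

/// The three features discussed above. `TableHeaderScope` is a
/// hypothetical name chosen for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PdfFeature {
    /// `/Scope` attribute on `TH` structure elements.
    TableHeaderScope,
    /// `/Tabs` key on pages that carry annotations.
    StructureOrderTabbing,
    /// `Header` / `Footer` subtypes on Pagination artifacts.
    HeaderFooterArtifactSubtypes,
}

impl PdfFeature {
    /// First PDF version in which the feature exists, as stated in the thread.
    pub fn minimum_version(self) -> PdfVersion {
        match self {
            PdfFeature::TableHeaderScope => PdfVersion::Pdf15,
            PdfFeature::StructureOrderTabbing => PdfVersion::Pdf15,
            PdfFeature::HeaderFooterArtifactSubtypes => PdfVersion::Pdf17,
        }
    }
}

/// Returns `Ok(())` if `version` supports `feature`, otherwise the
/// minimum version the caller would have to export to.
pub fn check_feature(feature: PdfFeature, version: PdfVersion) -> Result<(), PdfVersion> {
    let required = feature.minimum_version();
    if version >= required {
        Ok(())
    } else {
        Err(required)
    }
}
```

With this shape, a PDF 1.4 document that uses none of the three features never calls `check_feature` with a failing pair, so it can still be exported as UA-1.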

@reknih reknih force-pushed the raise-ua1-requirement branch from 09491e8 to 0209a10 Compare November 22, 2025 11:50
@reknih reknih force-pushed the raise-ua1-requirement branch from 0209a10 to 4263e43 Compare November 22, 2025 12:16
@reknih reknih changed the title from "Only allow PDF 1.7 for PDF/UA-1" to "Check that features required by UA-1 are available in the current PDF version" Nov 22, 2025
@LaurenzV
Owner

LaurenzV commented May 1, 2026

After thinking about this a bit more, I think the best thing to do is to simply only allow PDF/UA-1 with 1.7, as previously suggested, even if in theory lower versions should be supported. Any objections to this? @reknih @laurmaedje If not, I'll open a PR for this.

@reknih
Collaborator Author

reknih commented May 4, 2026

I'd be curious to know where the change of mind comes from: You have rightly asserted above that some files from older versions comply with UA-1. IMO raising a validation error only if the user requests a feature that is not possible with the Validator / PDF version tuple is elegant. Are you concerned about this behavior being unintuitive to users, difficult to document, or to maintain?

@saecki saecki self-requested a review May 8, 2026 13:50
@LaurenzV LaurenzV self-requested a review May 9, 2026 07:14
Owner

@LaurenzV LaurenzV left a comment

Overall LGTM, with comments addressed (and rebased onto newest main).

    assert_eq!(
        document.finish(),
        Err(KrillaError::Validation(vec![
            ValidationError::RequiresLaterPdfVersion(
Owner

Maybe RequiresNewerPdfVersion would be a bit better?

        document.finish(),
        Err(KrillaError::Validation(vec![
            ValidationError::RequiresLaterPdfVersion(
                PdfFeature::HeaderFooterArtifactSubtypes,
Owner

Maybe VersionedFeature? PdfFeature seems a bit too generic for me.

    ValidationError::MissingTagging => *self == Validator::A1_A,
    ValidationError::MissingDocumentDate => true,
    ValidationError::EmbeddedPDF(_) => true,
    ValidationError::RequiresLaterPdfVersion(_, _) => false,
Owner

I think we should explicitly spell out the three variants in all of these branches, so in case there is a new variant which requires validation in other standards, we don't overlook it.
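
To illustrate the suggestion, here is a minimal sketch of a match that names every feature variant instead of using a wildcard, so adding a new variant becomes a compile error until each branch decides how to handle it. The types are simplified local stand-ins rather than krilla's definitions: the error variant carries only the feature (the PR's version also carries a location), `TableHeaderScope` is a hypothetical variant name, and the `true` results for `Validator::UA1` are an assumption made for the sake of the example.

```rust
/// Simplified stand-in for the validator enum; only two standards are modeled.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Validator {
    A1_A,
    UA1,
}

/// `TableHeaderScope` is a hypothetical variant name for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PdfFeature {
    TableHeaderScope,
    StructureOrderTabbing,
    HeaderFooterArtifactSubtypes,
}

/// Simplified stand-in: the location field used in the PR is omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValidationError {
    MissingTagging,
    RequiresLaterPdfVersion(PdfFeature),
}

impl Validator {
    /// Whether this standard treats `error` as a violation.
    /// No wildcard patterns: a new `PdfFeature` variant fails to compile
    /// here until every branch spells out its decision.
    pub fn prohibits(self, error: ValidationError) -> bool {
        match self {
            Validator::A1_A => match error {
                ValidationError::MissingTagging => true,
                ValidationError::RequiresLaterPdfVersion(PdfFeature::TableHeaderScope) => false,
                ValidationError::RequiresLaterPdfVersion(PdfFeature::StructureOrderTabbing) => false,
                ValidationError::RequiresLaterPdfVersion(
                    PdfFeature::HeaderFooterArtifactSubtypes,
                ) => false,
            },
            // Assumption for this sketch: UA-1 rejects all of these.
            Validator::UA1 => match error {
                ValidationError::MissingTagging => true,
                ValidationError::RequiresLaterPdfVersion(PdfFeature::TableHeaderScope) => true,
                ValidationError::RequiresLaterPdfVersion(PdfFeature::StructureOrderTabbing) => true,
                ValidationError::RequiresLaterPdfVersion(
                    PdfFeature::HeaderFooterArtifactSubtypes,
                ) => true,
            },
        }
    }
}
```

The cost is some repetition per validator; the benefit is that the compiler's exhaustiveness check does the reviewing when the feature list grows.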

Comment on lines +621 to +626

    // For tables to be valid in PDF versions before 1.5, we need to ensure that
    // each table header cell (`TH`) is inside a table header (`THead`).
    // Otherwise, the unsupported `Scope` attribute would be required to express
    // the correct semantics.
    fn validate_table_structure_before_pdf15(&self, in_header: bool, sc: &mut SerializeContext) {
Owner

Hmm... Can we remove this for now and add a TODO instead? I do agree it would be nice to have tag tree validation, but I want to do it properly so that we can validate other constraints as well (for example the ones outlined in WCAG). I would like to avoid adding a method just for this specific case in this PR.

Comment on lines +770 to +773
    if matches!(self.tag, TagKind::Table(_)) && pdf_version < PdfVersion::Pdf15 {
        for child in &self.children {
            child.validate_table_structure_before_pdf15(false, sc);
        }
Owner

Same as above.

Comment thread crates/krilla/src/page.rs
Comment on lines +428 to +436
    if !self.annotations.is_empty() && sc.serialize_settings().enable_tagging {
        if sc.serialize_settings().pdf_version() >= PdfVersion::Pdf15 {
            page.tab_order(TabOrder::StructureOrder);
        } else {
            sc.register_validation_error(ValidationError::RequiresLaterPdfVersion(
                PdfFeature::StructureOrderTabbing,
                sc.location,
            ));
        }
Owner

Am I understanding correctly that this will fail any PDF 1.4 export with tagging enabled as soon as there is at least one annotation? Since the above comment suggests it's only needed for PDF/UA, let's only raise this error for that mode. Adding a simple require_structure_order_tabbing method to Validators should be enough. Let's also add a PDF 1.4 test with tagging and an annotation just to ensure that export succeeds (not a snapshot test, just check that finish().is_ok()) so that this regression is covered.
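
A rough sketch of how the requested gating could look is given below. It is self-contained and does not use krilla's real types: `Validator`, `PdfVersion`, `TabDecision`, and `tab_order_decision` are local stand-ins invented for this example, and only the method name `require_structure_order_tabbing` comes from the comment above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PdfVersion {
    Pdf14,
    Pdf15,
    Pdf16,
    Pdf17,
}

/// Local stand-in with just enough variants for the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Validator {
    None,
    UA1,
}

impl Validator {
    /// Whether the standard demands structure-order tabbing on pages
    /// with annotations. Assumption: only PDF/UA-1 does.
    pub fn require_structure_order_tabbing(self) -> bool {
        matches!(self, Validator::UA1)
    }
}

/// Hypothetical result type describing what the page serializer should do.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TabDecision {
    /// Write the `/Tabs` key with structure order.
    WriteStructureOrder,
    /// Write nothing and raise no error.
    Skip,
    /// Register a "requires later PDF version" validation error.
    Error,
}

pub fn tab_order_decision(
    validator: Validator,
    version: PdfVersion,
    has_annotations: bool,
    tagging_enabled: bool,
) -> TabDecision {
    if !has_annotations || !tagging_enabled {
        return TabDecision::Skip;
    }
    if version >= PdfVersion::Pdf15 {
        TabDecision::WriteStructureOrder
    } else if validator.require_structure_order_tabbing() {
        // The key is unavailable before PDF 1.5 but the standard needs it.
        TabDecision::Error
    } else {
        // Plain PDF 1.4 export with tagging: omit the key, export succeeds.
        TabDecision::Skip
    }
}
```

The last branch is the regression case mentioned above: a tagged PDF 1.4 document with an annotation and no PDF/UA validator yields `Skip`, not `Error`.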

3 participants