fix: handle NUL byte in output_dataclip on step:complete#4910
Open
mvanhorn wants to merge 1 commit into
Open
Conversation
Detect an unstorable NUL byte (0x00) in the decoded output_dataclip body
before inserting the Dataclip and return an {:error, changeset} instead of
letting Repo.insert raise an unhandled Postgrex.Error (22P05). PostgreSQL
cannot store 0x00 in text/jsonb, so the prior code crashed the whole
step:complete reply. Detecting before insert avoids poisoning the
surrounding Repo.transact transaction.
Closes OpenFn#4893
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR fixes a crash on
step:complete. When a worker reports anoutput_dataclipwhose decoded JSON contains a NUL byte (0x00),Lightning.Runs.Handlers.CompleteStepraised an unhandledPostgrex.Error(code
22P05,untranslatable_character) while inserting theDataclip,because PostgreSQL cannot store
0x00intext/jsonb. The raise propagatedout of
maybe_save_dataclip/2and failed the whole reply, leaving the step/runwithout a clean completion (Sentry LIGHTNING-13C).
The fix detects the unstorable NUL byte before the insert in the third
maybe_save_dataclip/2clause and returns{:error, changeset}instead ofletting
Repo.insert/1raise. Detecting before the insert (rather than rescuingthe
Postgrex.Error) keeps the surroundingRepo.transact/1transaction frombeing poisoned, and
CompleteStep.call/2's existingwithalready propagatesthe error tuple to the caller. Worker-side handling of the error response lives
in a separate repo and is out of scope.
Two details worth calling out:
Dataclip.new/1strips the unpersisted top-level
configurationkey viaremove_configuration/1, so the guard readsEcto.Changeset.get_field(changeset, :body)after that stripping. A NUL bytethat lives only under
configurationnever reaches PostgreSQL and is allowedthrough.
LightningWeb.ChannelHelpers.reply_with/2sendsinspect(error)to Sentry,so a changeset carrying
changes.bodywould leak the decoded job output intoSentry. The error changeset carries only the
:bodyerror message.Closes #4893
Validation steps
mix test test/lightning/runs_test.exs- three new tests in thedescribe "complete_step/2"block cover: a NUL byte at the top level returns{:error, %Ecto.Changeset{}}(no raise), a NUL byte nested in an arrayelement is also caught, and a NUL byte present only under top-level
configurationstill completes with{:ok, _step}(since that key is notpersisted).
complete_step/2happy-path tests still pass unchanged (normaloutput dataclips insert and complete as before).
Additional notes for the reviewer
contains_null_byte?/1) walks decoded JSON maps, lists,and binaries, checking both keys and values, so it catches NUL bytes at any
depth.
than mutate user data). The worker not yet respecting an error response on
step:completeis tracked separately.AI Usage
Please disclose whether you've used AI anywhere in this PR (it's cool, we just
want to know!):
You can read more details in our
Responsible AI Policy
Pre-submission checklist
/reviewwith Claude Code)
(e.g.,
:owner,:admin,:editor,:viewer) - n/a, this is aninternal data-persistence guard with no authorization surface