gh-aw treats a comment-based review verdict as successful even when the agent only reported tool failures #24756

@samuelkahessay

gh-aw version: v0.66.1
Discovered: 2026-04-05
Category: Agent output integrity / workflow reliability
Severity: High

What happens

In a private same-repo PR review run observed on 2026-04-05, every read path available to the Pipeline Review Agent workflow failed, yet the agent still posted a normal-looking REQUEST_CHANGES verdict comment:

  • GitHub MCP server crashed on the first pull_request_read call
  • gh CLI had no GH_TOKEN / GITHUB_TOKEN
  • direct GitHub API fallback returned 404 against the private repo

The resulting PR comment was not a code review. It was a description of the tool failure state. But because it still matched the workflow's required [PIPELINE-VERDICT] shape, gh-aw treated it as a successful safe output, the run concluded successfully, and the PR showed:

review: pass — Review completed: REQUEST_CHANGES

From the PR UI, that is indistinguishable from a legitimate review objection. A human or automation consuming only the check result sees "review completed" and assumes the agent read the diff. It did not.

This is a mixed failure, with one cause on our side and one in gh-aw:

  1. Our prompt contract requires a verdict comment and does not define a separate framework-failure channel.
  2. gh-aw considers "agent exited 0 and emitted at least one safe output" a success, even when that safe output is just an ordinary comment carrying a tooling-failure narrative.
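
To make the second point concrete, here is a minimal model of the success rule as this issue describes it. It is a sketch, not gh-aw source: `classifyRun`, the object shapes, and the `"comment"` type name are all illustrative assumptions.

```javascript
// Simplified model of the outcome rule described in this issue.
// Not gh-aw source: the function name and object shapes are illustrative.
function classifyRun({ exitCode, safeOutputs }) {
  if (exitCode !== 0) return "failure";           // agent failed outright
  if (safeOutputs.length === 0) return "failure"; // agent produced nothing
  return "success"; // any well-formed safe output counts, whatever it says
}

// A comment whose body only reports tool failures, in the required shape.
const toolingFailureComment = {
  type: "comment", // illustrative type name
  body: [
    "[PIPELINE-VERDICT]",
    "**VERDICT: REQUEST_CHANGES**",
    "The automated review could not complete: ...",
  ].join("\n"),
};

const outcome = classifyRun({ exitCode: 0, safeOutputs: [toolingFailureComment] });
console.log(outcome); // "success"
```

Under this rule the body of the comment never affects the result, which is why a tooling-failure narrative and a real review are classified the same way.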

What should happen

Comment-based workflows need a first-class way to say "the review could not be performed" without overloading the normal verdict channel.

At minimum:

  1. gh-aw should expose a structured diagnostic/failure path that agents can use when required tools or reads are unavailable.
  2. Failure handling should be able to act on that state even when the agent job exited successfully and emitted a comment.
  3. The docs should show comment-driven workflows how to separate review verdicts from infrastructure-failure diagnostics.

Without that distinction, any workflow that machine-parses a marker like [PIPELINE-VERDICT] is one prompt edit away from treating an infrastructure outage as a real code review.

Where in the code

  • Repo-local trigger: aurrin-platform/.github/workflows/pr-review-agent.md requires a [PIPELINE-VERDICT] comment and instructs the agent to read the PR through gh (lines 47-79, 114-167 in the current local file).
  • Upstream actions/setup/js/safe_output_handler_manager.cjs:252-289 collects message metadata, and actions/setup/js/safe_output_handler_manager.cjs:314-318 / 737-744 still return success: true as long as the safe-output handlers themselves succeeded.
  • Upstream actions/setup/js/safe_output_handler_manager.cjs:1126-1138 only fails the run for handler-processing failures or missing handlers, not for semantically bad-but-well-formed comments.
  • Upstream actions/setup/js/handle_agent_failure.cjs:880-929 only escalates when the agent failed, timed out, or produced no safe outputs. If the agent succeeded and emitted any safe output, failure handling is skipped.

Evidence

Observed run

  • Private same-repo pull_request review run on 2026-04-05
  • PR checks row: review: pass — Review completed: REQUEST_CHANGES
  • Public links omitted here because the reproduction repo and run are private; the exact verdict body is reproduced below.

Exact comment body posted as the "verdict"

[PIPELINE-VERDICT]
## Pipeline Review

**VERDICT: REQUEST_CHANGES**

### Summary
The automated review could not complete: the GitHub MCP server experienced a
WASM runtime crash, `gh` CLI is not authenticated in this environment, and the
repository is private (GitHub REST API returns 404 without a token). The diff
and issue details for PR #263 could not be read.

That comment is plainly a tooling failure report, not a review of the code. Nonetheless the workflow completed successfully and the PR received a normal-looking review check result.
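
The sketch below shows why a marker-based consumer cannot tell the difference. `parseVerdict` is a hypothetical downstream parser, not part of gh-aw or of our workflow; it only assumes that consumers key on the `[PIPELINE-VERDICT]` marker and the `**VERDICT: ...**` line, and it is fed the comment body quoted above.

```javascript
// Hypothetical downstream consumer that hands off on the marker.
// parseVerdict is not part of gh-aw or the workflow in this issue.
function parseVerdict(commentBody) {
  if (!commentBody.startsWith("[PIPELINE-VERDICT]")) return null;
  const match = commentBody.match(/\*\*VERDICT: (APPROVE|REQUEST_CHANGES)\*\*/);
  return match ? match[1] : null;
}

// The comment body reproduced above, verbatim.
const observedBody = [
  "[PIPELINE-VERDICT]",
  "## Pipeline Review",
  "",
  "**VERDICT: REQUEST_CHANGES**",
  "",
  "### Summary",
  "The automated review could not complete: the GitHub MCP server experienced a",
  "WASM runtime crash, `gh` CLI is not authenticated in this environment, and the",
  "repository is private (GitHub REST API returns 404 without a token). The diff",
  "and issue details for PR #263 could not be read.",
].join("\n");

console.log(parseVerdict(observedBody)); // "REQUEST_CHANGES"
```

The parser returns a valid verdict because nothing in the comment's structure marks it as a diagnostic. Only the prose does.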

For contrast, neighboring PRs in the same batch produced real verdict comments that referenced concrete file paths, criteria, and code-level findings. The observed verdict was the only one with no code-review substance, yet it remained syntactically valid.

Proposed fix

Minimal upstream path:

  1. Add a first-class "review incomplete / tooling failure" safe output or equivalent structured signal, distinct from ordinary comments.
  2. Thread that signal into handle_agent_failure.cjs so workflows can fail or end in a neutral/skipped state even when the agent exit code is 0.
  3. Document the pattern for comment-based workflows that hand off on a marker like [PIPELINE-VERDICT]: normal verdict comments should be reserved for actual review conclusions, not infrastructure diagnostics.

This does not require gh-aw to parse prose. It only needs a supported way to say "review could not be performed" so comment-shaped diagnostics do not get classified as completed reviews.
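
One possible shape for steps 1 and 2, as a rough sketch only. The `review_incomplete` type, its `reason` and `details` fields, the `onIncomplete` option, and both function names are hypothetical. None of them exist in gh-aw today, and the real integration point in handle_agent_failure.cjs may look quite different.

```javascript
// Sketch of the proposed behavior. The "review_incomplete" type, its fields,
// and both function names are hypothetical, not existing gh-aw APIs.
function findIncompleteSignal(safeOutputs) {
  return safeOutputs.find((o) => o.type === "review_incomplete") ?? null;
}

function resolveConclusion({ exitCode, safeOutputs, onIncomplete = "failure" }) {
  if (exitCode !== 0 || safeOutputs.length === 0) return "failure";
  // New step: a structured signal overrides "exit 0 plus a safe output".
  if (findIncompleteSignal(safeOutputs)) return onIncomplete; // "failure" or "neutral"
  return "success";
}

const outputs = [
  {
    type: "review_incomplete",
    reason: "tooling_failure",
    details: ["MCP server crashed", "gh CLI unauthenticated", "REST API returned 404"],
  },
];

console.log(resolveConclusion({ exitCode: 0, safeOutputs: outputs })); // "failure"
console.log(resolveConclusion({ exitCode: 0, safeOutputs: outputs, onIncomplete: "neutral" })); // "neutral"
```

The check keys on a typed signal rather than on comment text, so the agent stays responsible for declaring that the review could not be performed and gh-aw only has to honor that declaration.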

Impact

High. A false APPROVE from this path would look like a completed review on code the agent never read. The observed case was a false REQUEST_CHANGES, but the classification bug is symmetric: any syntactically valid verdict comment is accepted whether or not a real review happened.

This is dangerous for workflows that machine-parse review comments. Today there is no structured middle state between "agent failed outright" and "agent posted a normal safe output."

Related upstream issues

  • #18992 (open) — asks for a structured metadata channel. Related, but it is about metadata surviving sanitization, not about a false review verdict being accepted as success.
  • #20035 (closed) — handler-level failures were not escalated. Related failure-family, but here the handler succeeded and the content was wrong.
  • #21501 (closed) — targeted dispatch was ignored and a noop was accepted. Closest structural sibling: the workflow succeeded even though the agent never read the intended input, but the output there was noop, not a false review verdict.

The narrow difference here is: the safe output was posted successfully, looked valid, and still should not have counted as a completed review.
