gh-aw treats a comment-based review verdict as successful even when the agent only reported tool failures

**gh-aw version:** v0.66.1
**Discovered:** 2026-04-05
**Category:** Agent output integrity / workflow reliability
**Severity:** High

## What happens

In a private same-repo PR review run observed on 2026-04-05, the Pipeline Review Agent workflow hit a fatal combination of read-path failures and still posted a normal-looking `REQUEST_CHANGES` verdict comment:

- GitHub MCP server crashed on the first `pull_request_read` call
- `gh` CLI had no `GH_TOKEN` / `GITHUB_TOKEN`
- direct GitHub API fallback returned 404 against the private repo

The resulting PR comment was not a code review. It was a description of the tool failure state. But because it still matched the workflow's required `[PIPELINE-VERDICT]` shape, gh-aw treated it as a successful safe output, the run concluded successfully, and the PR showed:

`review: pass — Review completed: REQUEST_CHANGES`

From the PR UI, that is indistinguishable from a legitimate review objection. A human or automation consuming only the check result sees "review completed" and assumes the agent read the diff. It did not.

This is a mixed failure:

1. Our prompt contract requires a verdict comment and does not define a separate framework-failure channel.
2. gh-aw considers "agent exited 0 and emitted at least one safe output" a success, even when that safe output is just an ordinary comment carrying a tooling-failure narrative.

## What should happen

Comment-based workflows need a first-class way to say "the review could not be performed" without overloading the normal verdict channel.

At minimum:

1. gh-aw should expose a structured diagnostic/failure path that agents can use when required tools or reads are unavailable.
2. Failure handling should be able to act on that state even when the agent job exited successfully and emitted a comment.
3. The docs should show comment-driven workflows how to separate review verdicts from infrastructure-failure diagnostics.

Without that distinction, any workflow that machine-parses a marker like `[PIPELINE-VERDICT]` is one prompt edit away from treating an infrastructure outage as a real code review.

## Where in the code

- Repo-local trigger: `aurrin-platform/.github/workflows/pr-review-agent.md` requires a `[PIPELINE-VERDICT]` comment and instructs the agent to read the PR through `gh` (`lines 47-79`, `114-167` in the current local file).
- Upstream `actions/setup/js/safe_output_handler_manager.cjs:252-289` collects message metadata, and `actions/setup/js/safe_output_handler_manager.cjs:314-318` / `737-744` still return `success: true` as long as the safe-output handlers themselves succeeded.
- Upstream `actions/setup/js/safe_output_handler_manager.cjs:1126-1138` only fails the run for handler-processing failures or missing handlers, not for semantically bad-but-well-formed comments.
- Upstream `actions/setup/js/handle_agent_failure.cjs:880-929` only escalates when the agent failed, timed out, or produced no safe outputs. If the agent succeeded and emitted any safe output, failure handling is skipped.

## Evidence

**Observed run**

- Private same-repo `pull_request` review run on 2026-04-05
- PR checks row: `review: pass — Review completed: REQUEST_CHANGES`
- Public links omitted here because the reproduction repo and run are private; the exact verdict body is reproduced below.

**Exact comment body posted as the "verdict"**

```
[PIPELINE-VERDICT]
## Pipeline Review

**VERDICT: REQUEST_CHANGES**

### Summary
The automated review could not complete: the GitHub MCP server experienced a
WASM runtime crash, `gh` CLI is not authenticated in this environment, and the
repository is private (GitHub REST API returns 404 without a token). The diff
and issue details for PR #263 could not be read.
```

That comment is plainly a tooling failure report, not a review of the code. Nonetheless the workflow completed successfully and the PR received a normal-looking review check result.

For contrast, neighboring PRs in the same batch produced real verdict comments that referenced concrete file paths, criteria, and code-level findings. This observed verdict was uniquely devoid of code-review substance while remaining syntactically valid.

## Proposed fix

Minimal upstream path:

1. Add a first-class "review incomplete / tooling failure" safe output or equivalent structured signal, distinct from ordinary comments.
2. Thread that signal into `handle_agent_failure.cjs` so workflows can fail or end in a neutral/skipped state even when the agent exit code is `0`.
3. Document the pattern for comment-based workflows that hand off on a marker like `[PIPELINE-VERDICT]`: normal verdict comments should be reserved for actual review conclusions, not infrastructure diagnostics.

This does not require gh-aw to parse prose. It only needs a supported way to say "review could not be performed" so comment-shaped diagnostics do not get classified as completed reviews.

## Impact

High. A false `APPROVE` from this path would look like a completed review on code the agent never read. The observed case was a false `REQUEST_CHANGES`, but the classification bug is symmetric: any syntactically valid verdict comment is accepted whether or not a real review happened.

This is dangerous for workflows that machine-parse review comments. Today there is no structured middle state between "agent failed outright" and "agent posted a normal safe output."

## Related upstream issues

- [#18992](https://github.com/github/gh-aw/issues/18992) (open) — asks for a structured metadata channel. Related, but it is about metadata surviving sanitization, not about a false review verdict being accepted as success.
- [#20035](https://github.com/github/gh-aw/issues/20035) (closed) — handler-level failures were not escalated. Related failure-family, but here the handler succeeded and the content was wrong.
- [#21501](https://github.com/github/gh-aw/issues/21501) (closed) — targeted dispatch was ignored and a `noop` was accepted. Closest structural sibling: the workflow succeeded even though the agent never read the intended input, but the output there was `noop`, not a false review verdict.

The narrow difference here is: the safe output was posted successfully, looked valid, and still should not have counted as a completed review.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-aw treats a comment-based review verdict as successful even when the agent only reported tool failures #24756

What happens

What should happen

Where in the code

Evidence

Proposed fix

Impact

Related upstream issues

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

gh-aw treats a comment-based review verdict as successful even when the agent only reported tool failures #24756

Description

What happens

What should happen

Where in the code

Evidence

Proposed fix

Impact

Related upstream issues

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions