Skip to content

team mode: multi-agent adversarial PR review#6

Merged
pierrick-fonquerne merged 8 commits into
mainfrom
feature/ai-review-team-mode
Jun 3, 2026
Merged

team mode: multi-agent adversarial PR review#6
pierrick-fonquerne merged 8 commits into
mainfrom
feature/ai-review-team-mode

Conversation

@pierrick-fonquerne

Copy link
Copy Markdown
Contributor

@

Summary

Adds a seventh ai-review mode, team, that orchestrates several Mistral agents to produce a higher-confidence review than any single pass:

  1. Fan-out: four specialist agents (correctness, security, architecture, performance) review the diff in parallel, with graceful degradation if one fails.
  2. Synthesis: a mistral-large step merges and deduplicates their findings, attributes each to its sources, drops noise already enforced by Clippy or the compiler, anchors each message to a concrete code element, tags a category, and surfaces agent disagreement.
  3. Adversarial verification: every retained finding is checked by a three-lens vote (does the code confirm it, is the impact real, is it a known false positive), each lens reading the patch of the file it concerns. A sceptical majority decides whether the finding is contested.
  4. Deterministic verdict: computed in Rust from confirmed vs contested criticals (NEEDS_WORK / DISCUSS / SHIP).
  5. Output: a single upserted comment listing confirmed and contested findings with a transparency footer, plus inline comments for confirmed criticals only.

Design notes

  • Verification is concurrency-bounded (Arc<Semaphore> + JoinSet, acquire_owned permits) and capped at the top 15 findings by severity, with the cap reported.
  • LLM responses are parsed with tolerant serde (default collections, loose boolean coercion, category fallback) so minor formatting drift never fails a whole pass.
  • The reusable workflow already forwards mode, so team needs no workflow change here. Trigger it from a caller workflow with the ai:team label (documented in the README). Rolling the trigger out to the consuming repos is a follow-up.

Verification

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace --all-features (19 tests, including pure unit tests for the vote aggregation, the deterministic verdict, comment rendering, and tolerant deserialization)
  • cargo build --release
    @

Pierrick Fonquerne added 8 commits June 3, 2026 16:04
Add Agent, Lens, SynthFinding, SynthReport, LensVerdict, FindingVerdict
and Verdict types for the multi-agent team review mode. Responses from
language models are parsed with tolerant serde (default collections,
loose boolean coercion) to survive minor formatting drift.
Add DiffContext (full diff plus per-file patch map) and fetch_diff_context;
fetch_diff now delegates to it for zero behavioural change. patch_for selects
the per-file patch for a finding, falling back to a UTF-8-safe truncation of
the full diff for file-level or cross-file findings.
Add call_agent (specialist fan-out), call_synthesis (deduplicating merge of
the agent reports into a SynthReport) and call_lens (adversarial per-finding
verification). Introduce the synthesis and three lens system prompts plus the
team model constants.
Strengthen the synthesis prompt: drop noise already enforced by Clippy or the
compiler, require each message to name the concrete code element, and surface
agent disagreement instead of silently resolving it. Add a Category enum
(bug/security/design/performance/test-gap, with a tolerant Other fallback)
carried on every SynthFinding.
Add the team module: run_team runs the four specialist agents concurrently
(graceful degradation), synthesises and caps findings, then verifies each with
a concurrency-bounded 3-lens adversarial vote, computes the verdict
deterministically, renders the comment and posts confirmed criticals inline.
Add render_team_comment and TeamCommentView. The pure aggregate_lens_votes and
compute_verdict helpers are unit-tested.
Wire Mode::Team (env "team", comment marker, label) and dispatch run_team
from main. Surface the lens name in verification warnings, and drop the
temporary dead_code allowances now that every team-mode item is reachable.
List the seven review modes and their models, and show how a caller workflow
invokes the reusable workflow, including the on-demand team mode gated by the
ai:team label.
@pierrick-fonquerne pierrick-fonquerne merged commit 0c839b4 into main Jun 3, 2026
3 checks passed
@pierrick-fonquerne pierrick-fonquerne deleted the feature/ai-review-team-mode branch June 3, 2026 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant