AI peer review for Claude Code is now a free open-source skill. It lets Claude, Codex, and Gemini review each other's code through CLI handoffs — with a structured verdict, file:line findings, and an honesty contract baked into the response format.
The skill is called `ai-collab-bridge`. MIT licensed. Installs in 30 seconds. I tested it on macOS arm64; the bash scripts are portable to Linux.
TL;DR
Install the AI peer review skill in 30 seconds
```bash
git clone https://github.com/owgit/ai-collab-bridge ~/.claude/skills/ai-collab-bridge
chmod +x ~/.claude/skills/ai-collab-bridge/scripts/*.sh
~/.claude/skills/ai-collab-bridge/scripts/doctor.sh
```

The `doctor.sh` script verifies your skill files, required tooling, and each AI CLI (Claude, Codex, Gemini) in one pass. Exit code equals the number of failed checks. If anything is broken, the output tells you the exact fix command. In my testing on a fresh macOS install, the whole verification ran in under two seconds.
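Because the exit code mirrors the number of failed checks, `doctor.sh` is easy to script around. A minimal sketch, assuming you want to gate a setup script or CI step on it (the gating logic is mine, not part of the repo):

```bash
#!/usr/bin/env bash
# Abort early if doctor.sh reports any failed checks.
# Its exit code equals the number of failures, so any non-zero status means "fix something first".
if ! ~/.claude/skills/ai-collab-bridge/scripts/doctor.sh; then
  echo "ai-collab-bridge is not healthy; run the fix commands doctor.sh printed" >&2
  exit 1
fi
```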
Then start a new Claude Code session — the AI peer review skill is auto-discovered — and trigger it with something like *"use the AI Collab Bridge to review my latest commit."*
How AI peer review actually works

The protocol is small enough to fit in one paragraph. The implementer — whichever AI just finished work — stages a packet. The packet is a single markdown file with a summary, focus questions, the diff, and the file list. The implementer then invokes the reviewer's CLI with that packet wrapped in a review-request template. The reviewer reads the packet and responds.
Stage a packet (implementer side)
```bash
SUMMARY="Brief description of the change" \
QUESTIONS="Anything you want focused on" \
~/.claude/skills/ai-collab-bridge/scripts/stage-packet.sh main > /tmp/packet.md
```

That produces a markdown file with everything the reviewer needs: your summary, your focus questions, the `git diff --stat`, and the full diff against `main`.
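For orientation, the packet looks roughly like this. The section titles are my paraphrase of the contents listed above, not necessarily the exact headings `stage-packet.sh` emits:

```markdown
<!-- Illustrative packet shape; headings paraphrased from the post, not quoted from the script -->
# Review packet

## Summary
Brief description of the change

## Focus questions
Anything you want focused on

## Changed files
(output of `git diff --stat` against main)

## Diff
(full diff against main)
```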
Dispatch the AI peer review
```bash
~/.claude/skills/ai-collab-bridge/scripts/request-review.sh codex /tmp/packet.md
# or claude / gemini — same shape, different target
```

The script wraps your packet in the review-request template, sends it to the target AI's CLI, and prints the structured response.
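Putting the two steps together, a full handoff is one pipeline. The summary and questions below are placeholder values; everything else comes straight from the commands above:

```bash
# Stage a packet against main, then hand it to Codex for review in one step.
SUMMARY="Refactor the sync worker's retry loop" \
QUESTIONS="Is the backoff still bounded?" \
~/.claude/skills/ai-collab-bridge/scripts/stage-packet.sh main > /tmp/packet.md &&
  ~/.claude/skills/ai-collab-bridge/scripts/request-review.sh codex /tmp/packet.md
```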
The response format
Every AI peer review response follows the same five-section template:
The last item is the one that surprised me when I first designed it. In my experience, the most valuable section of any code review is the one where the reviewer is forced to disclose what they did not look at. Without it, you cannot tell whether an `APPROVE` means "I checked everything and it is fine" or "I skimmed the diff and nothing jumped out." With it, the implementer knows exactly where the reviewer's scope ends.
That is the honesty contract underneath every AI peer review. The protocol cannot enforce it. It can only make dishonesty harder to hide.
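To make that concrete, here is the shape of a response with the disclosure section filled in. The wording and the file path are illustrative; only the `APPROVE` verdict and the file:line finding style come from the protocol as described above:

```markdown
<!-- Illustrative response shape; section titles paraphrase the template rather than quote it -->
Verdict: APPROVE

Findings:
- src/worker.ts:42 (hypothetical path): consider clamping the retry delay before logging it

What I did not look at:
- Did not run the test suite; reviewed the diff only
- Did not check call sites outside the diff
```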
The benchmark: 100% vs 31% pass rate
I ran the AI peer review skill against vanilla Claude as baseline. Three test cases — reviewer role, implementer handoff, onboarding walkthrough — same prompts, same model, with-skill versus without-skill.
| Metric | With skill | Without skill | Delta |
| --- | --- | --- | --- |
| Pass rate | 100% | 31% | +69 percentage points |
| Time | 67 s | 68 s | identical |
| Tokens | 60 k | 53 k | +14% |
In my measurements, the AI peer review skill costs about 14% more tokens (the model reads the role playbooks) and runs at the same speed. Cheap insurance for structured, honest output.
The meta moment: the bridge reviewing itself

After I shipped v0.1, I used the bridge to review the bridge's own next commit. Codex caught two bugs I had missed in my doctor script and probe logic. I fixed both, shipped v0.2, and ran the bridge again. Codex caught two more bugs. I fixed those, shipped v0.3.
The four bugs the bridge found in itself
Not stylistic preferences, not fabricated concerns to look useful. Concrete things I tested after each fix:
None of those were obvious to me when I wrote them. All of them were obvious to Codex when it read the diff cold, without my implementation bias.
That is the entire pitch for AI peer review. Different minds, different blind spots, structured handoff. The work gets better and you do not have to think about it.
What I learned about the Codex CLI
Most of the work in shipping this AI peer review skill was not the protocol. It was figuring out how to invoke each AI's CLI without it doing something unexpected.
Codex MCP bootstrap hang
When you run `codex exec`, it loads `~/.codex/config.toml` and tries to bootstrap any MCP servers you have configured. If those servers require their own auth (Cloudflare Workers MCP, GitHub Copilot MCP, and so on) and you have not set up tokens for non-interactive use, codex stalls — silently. In my first review dispatch, the process hung for ten minutes before I traced it back to the MCP bootstrap.
The fix is `--ignore-user-config`. The codex CLI also has a dedicated `codex exec review` subcommand designed for exactly this use case, plus an `-o` flag that writes only the agent's final message to a file. See the OpenAI Codex CLI repo and the Codex documentation for the full reference.
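Before your first dispatch, it is worth checking whether you are exposed to this at all. A quick look at the Codex user config, assuming the usual `[mcp_servers.*]` table layout:

```bash
# List any MCP server entries codex exec would try to bootstrap from your user config.
# No matches means no MCP servers are configured and the hang should not affect you.
grep -n "mcp_servers" ~/.codex/config.toml 2>/dev/null || echo "no MCP servers configured (or no config file)"
```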
The npm vendor binary ENOENT
`@openai/codex` installed via npm sometimes ships without its platform-specific vendor binary. `which codex` succeeds, but invoking it fails with a cryptic Node `ENOENT` stack trace pointing at a missing arm64 binary. I hit this on my own machine and the fix is `npm uninstall -g @openai/codex && npm install -g @openai/codex`. The bridge's `doctor.sh` now detects this exact pattern and surfaces the fix directly.
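The recovery is the reinstall above; a version check afterwards confirms the vendor binary is actually back (assuming your `codex` supports the conventional `--version` flag):

```bash
# Reinstall the Codex CLI so npm re-fetches the platform-specific vendor binary.
npm uninstall -g @openai/codex && npm install -g @openai/codex
codex --version   # should print a version number instead of a Node ENOENT stack trace
```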
Supported AI CLIs
| AI | Default invocation | Override env var |
| --- | --- | --- |
| Claude | `claude -p` | `AI_COLLAB_CLAUDE_CMD` |
| Codex | `codex exec review --ignore-user-config --ephemeral` | `AI_COLLAB_CODEX_CMD` |
| Gemini | `gemini -p` | `AI_COLLAB_GEMINI_CMD` |
Adding another AI is one `case` line in `scripts/request-review.sh` plus an optional role playbook under `references/`. PRs welcome.
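For a sense of scale, the addition looks roughly like the sketch below. The variable names are hypothetical; only the default commands and env var names come from the table above:

```bash
# Hypothetical sketch of the dispatch in scripts/request-review.sh; the script's real variable
# names may differ. Each arm resolves the command to run, honoring the override env var.
case "$target" in
  claude) cmd="${AI_COLLAB_CLAUDE_CMD:-claude -p}" ;;
  codex)  cmd="${AI_COLLAB_CODEX_CMD:-codex exec review --ignore-user-config --ephemeral}" ;;
  gemini) cmd="${AI_COLLAB_GEMINI_CMD:-gemini -p}" ;;
  youraicli) cmd="${AI_COLLAB_YOURAICLI_CMD:-youraicli --prompt}" ;;   # the one new line for a new AI
esac
```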
Why AI peer review matters for multi-AI workflows
If you have been following the move from "AI as autocomplete" to "AI as collaborator," you have probably already noticed that the next bottleneck is not the individual models. The next bottleneck is how the models work with each other and with your tools. That is also what I argued in my agent-ready website checklist→ — the systems around the model matter more than the model itself.
In my own day-to-day, this looks like:
Without AI peer review, those workflows run in parallel but in isolation. Each AI sees only what is in front of it. With the AI peer review skill in place, the diffs flow between them in a structured way, and the work compounds.
The same pattern shows up across angles I have written about before — from the comparison side in Claude Code vs Cursor for 2026→, from the tooling side in how to build an MCP server with TypeScript→, and from the structure side in the cross-repo AI context layer→. The agents in your loop need clean interfaces with each other, not just clean interfaces with humans.
Try the AI peer review skill yourself
The AI peer review skill is open-source and lives at `github.com/owgit/ai-collab-bridge`.
```bash
git clone https://github.com/owgit/ai-collab-bridge ~/.claude/skills/ai-collab-bridge
chmod +x ~/.claude/skills/ai-collab-bridge/scripts/*.sh
~/.claude/skills/ai-collab-bridge/scripts/doctor.sh
```

There are `AGENTS.md` and `CLAUDE.md` files at the repo root, so Codex and Claude Code both have explicit entry points if you point them at the repo cold. The protocol works for any AI with a CLI — same packet format, same response template, same three verdicts. In my workflow, this pairs naturally with the security review you should be running→ on your AI-generated code.
If you want to contribute, the next obvious steps are documented in the repo: enforcing the response format via `--output-schema`, adding support for more CLIs, and building reviewer-as-Claude scenarios beyond the reviewer-as-Codex paths I have already tested.
The point
The boundaries between AI models are conventions, not facts. We get to decide whether they cooperate or compete. If you have two AIs working on the same codebase and they are not reviewing each other, you are leaving real bugs in the code.
The AI peer review skill is one structured way to fix that. It is free, MIT-licensed, and installs in 30 seconds. Go give your AIs a colleague.



