AI Peer Review: Free Skill Bridging Claude, Codex, Gemini
Tech
AI
Claude Code
Codex
Dev Tools

A free open-source Claude Code skill that lets Claude, Codex, and Gemini review each other's code via CLI. Install in 30 seconds.

Uygar Duzgun
May 13, 2026
9 min read

AI peer review for Claude Code is now a free open-source skill. It lets Claude, Codex, and Gemini review each other's code through CLI handoffs — with a structured verdict, file:line findings, and an honesty contract baked into the response format.

The skill is called `ai-collab-bridge`. MIT licensed. Installs in 30 seconds. I tested it on macOS arm64; the bash scripts are portable to Linux.

TL;DR

What it is: A free Claude Code skill that turns Claude, Codex, and Gemini into peer code reviewers for each other.
Install: One `git clone`, one `chmod +x`, one `doctor.sh` run.
How AI peer review works: Implementer stages a packet (summary + focus questions + diff), reviewer responds with a structured verdict (`APPROVE` / `CONCERNS` / `BLOCK`) and file:line findings.
Benchmark: 100% pass rate with the skill vs 31% without — same model, same prompts, three test cases I ran end-to-end.
Proof: The bridge reviewed itself three times and caught four real bugs I had missed.

Install the AI peer review skill in 30 seconds

```bash
git clone https://github.com/owgit/ai-collab-bridge ~/.claude/skills/ai-collab-bridge
chmod +x ~/.claude/skills/ai-collab-bridge/scripts/*.sh
~/.claude/skills/ai-collab-bridge/scripts/doctor.sh
```

The `doctor.sh` script verifies your skill files, required tooling, and each AI CLI (Claude, Codex, Gemini) in one pass. Exit code equals the number of failed checks. If anything is broken, the output tells you the exact fix command. In my testing on a fresh macOS install, the whole verification ran in under two seconds.
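Because the exit code carries the failure count, it is easy to gate a dotfiles script or CI step on it. A minimal sketch, assuming a plain bash step and the install path above:

```bash
# Minimal sketch: gate a setup or CI step on the doctor's exit code.
~/.claude/skills/ai-collab-bridge/scripts/doctor.sh
failed=$?
if [ "$failed" -ne 0 ]; then
  echo "ai-collab-bridge doctor reported $failed failed check(s)" >&2
  exit "$failed"
fi
```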

Then start a new Claude Code session — the AI peer review skill is auto-discovered — and trigger it with something like *"use the AI Collab Bridge to review my latest commit."*

How AI peer review actually works

AI peer review protocol flow between Claude and Codex — an implementer node stages a packet that travels across a glowing bridge to a reviewer node, which sends back a structured verdict and file:line findings

The protocol is small enough to fit in one paragraph. The implementer — whichever AI just finished work — stages a packet. The packet is a single markdown file with a summary, focus questions, the diff, and the file list. The implementer then invokes the reviewer's CLI with that packet wrapped in a review-request template. The reviewer reads the packet and responds.

Stage a packet (implementer side)

```bash
SUMMARY="Brief description of the change" \
QUESTIONS="Anything you want focused on" \
~/.claude/skills/ai-collab-bridge/scripts/stage-packet.sh main > /tmp/packet.md
```

That produces a markdown file with everything the reviewer needs: your summary, your focus questions, the `git diff --stat`, and the full diff against `main`.
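For orientation, this is roughly what a staged packet contains. The exact headings are whatever `stage-packet.sh` writes; the layout below is illustrative, not the script's literal output:

```markdown
# Review packet (illustrative layout, not stage-packet.sh's literal output)

## Summary
Brief description of the change

## Focus questions
Anything you want focused on

## Diff stat
 scripts/doctor.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

## Diff vs main
diff --git a/scripts/doctor.sh b/scripts/doctor.sh
...
```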

Dispatch the AI peer review

```bash
~/.claude/skills/ai-collab-bridge/scripts/request-review.sh codex /tmp/packet.md
# or claude / gemini — same shape, different target
```

The script wraps your packet in the review-request template, sends it to the target AI's CLI, and prints the structured response.

The response format

Every AI peer review response follows the same five-section template:

A single-line verdict: `APPROVE`, `CONCERNS`, or `BLOCK`
Findings grouped by Bugs / Security / Quality / Suggestions
Every finding has a `file:line` reference and a clear reason
A "What I checked" list
A "What I did NOT check" list

The last item is the one that surprised me when I first designed it. In my experience, the most valuable section of any code review is the one where the reviewer is forced to disclose what they did not look at. Without it, you cannot tell whether an `APPROVE` means "I checked everything and it is fine" or "I skimmed the diff and nothing jumped out." With it, the implementer knows exactly where the reviewer's scope ends.

That is the honesty contract underneath every AI peer review. The protocol cannot enforce it. It can only make dishonesty harder to hide.
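To make the shape concrete, here is a hand-written example in that template. The findings and line numbers are invented for illustration; only the structure reflects the skill:

```text
CONCERNS

Bugs
- scripts/doctor.sh:42: warning counter is added to the exit code after it is printed (illustrative finding)

Security
- none found

Quality / Suggestions
- scripts/request-review.sh:18: quote the packet path so paths with spaces survive (illustrative finding)

What I checked
- the full diff against main, exit-code handling in doctor.sh

What I did NOT check
- runtime behaviour on Linux, the Gemini dispatch path
```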

The benchmark: 100% vs 31% pass rate

I ran the AI peer review skill against vanilla Claude as baseline. Three test cases — reviewer role, implementer handoff, onboarding walkthrough — same prompts, same model, with-skill versus without-skill.

| Metric | With skill | Without skill | Delta |
| --- | --- | --- | --- |
| Pass rate | 100% | 31% | +69 percentage points |
| Time | 67 s | 68 s | identical |
| Tokens | 60 k | 53 k | +14% |

In my measurements, the AI peer review skill costs about 14% more tokens (the model reads the role playbooks) and runs at the same speed. Cheap insurance for structured, honest output.

The meta moment: the bridge reviewing itself

The AI peer review bridge reviewing its own code in an iterative feedback loop, infinity-shaped cyan and amber arrows symbolising three rounds of self-review that caught four real bugs

After I shipped v0.1, I used the bridge to review the bridge's own next commit. Codex caught two bugs I had missed in my doctor script and probe logic. I fixed both, shipped v0.2, and ran the bridge again. Codex caught two more bugs. I fixed those, shipped v0.3.

The four bugs the bridge found in itself

Not stylistic preferences, not fabricated concerns to look useful. Concrete things I tested after each fix:

The doctor script exited non-zero on single-CLI setups, contradicting its own README.
The pre-flight probe hardcoded `codex --version` instead of using the user's `AI_COLLAB_CODEX_CMD` override. Users with a working custom codex binary still got rejected.
The same probe naively extracted the first word of the override, which broke env-wrapped commands like `env PATH=... codex exec` (sketched below).
The doctor printed a red `✗` for missing optional CLIs before downgrading the counter to a warning, so the visual contradicted the exit code.
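To make the third one concrete, here is a minimal sketch of the failure mode. The variable handling is illustrative, not the bridge's actual probe code:

```bash
# Hypothetical illustration of the probe bug, not the bridge's real code.
override="env PATH=/opt/codex/bin:$PATH codex exec"

# Naive probe: take the first word of the override and check that.
first_word="${override%% *}"   # -> "env", not the user's codex binary
command -v "$first_word"       # succeeds, but proves nothing about codex

# The fix is to probe the override as a whole command line, not its first word.
```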

None of those were obvious to me when I wrote them. All of them were obvious to Codex when it read the diff cold, without my implementation bias.

That is the entire pitch for AI peer review. Different minds, different blind spots, structured handoff. The work gets better and you do not have to think about it.

What I learned about the Codex CLI

Most of the work in shipping this AI peer review skill was not the protocol. It was figuring out how to invoke each AI's CLI without it doing something unexpected.

Codex MCP bootstrap hang

When you run `codex exec`, it loads `~/.codex/config.toml` and tries to bootstrap any MCP servers you have configured. If those servers require their own auth (Cloudflare Workers MCP, GitHub Copilot MCP, and so on) and you have not set up tokens for non-interactive use, codex stalls — silently. In my first review dispatch, the process hung for ten minutes before I traced it back to the MCP bootstrap.

The fix is `--ignore-user-config`. The codex CLI also has a dedicated `codex exec review` subcommand designed for exactly this use case, plus an `-o` flag that writes only the agent's final message to a file. See the OpenAI Codex CLI repo and the Codex documentation for the full reference.
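If you suspect the same hang, a quick way to confirm it is to time a trivial prompt with and without the user config. This assumes your codex build accepts a prompt argument to `exec`:

```bash
# Rough diagnostic for the MCP bootstrap stall (prompt text is arbitrary).
time codex exec --ignore-user-config "reply with OK"
time codex exec "reply with OK"   # if only this one stalls, the MCP servers in ~/.codex/config.toml are the culprit
```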

The npm vendor binary ENOENT

`@openai/codex` installed via npm sometimes ships without its platform-specific vendor binary. `which codex` succeeds, but invoking it fails with a cryptic Node `ENOENT` stack trace pointing at a missing arm64 binary. I hit this on my own machine and the fix is `npm uninstall -g @openai/codex && npm install -g @openai/codex`. The bridge's `doctor.sh` now detects this exact pattern and surfaces the fix directly.
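If you want to confirm you are hitting that and not something else, the symptom and the fix fit in three commands:

```bash
which codex        # resolves to the npm shim, so PATH looks fine
codex --version    # fails with a Node ENOENT stack trace pointing at a missing arm64 binary
npm uninstall -g @openai/codex && npm install -g @openai/codex   # reinstall pulls the vendor binary
```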

Supported AI CLIs

| AI | Default invocation | Override env var |
| --- | --- | --- |
| Claude | `claude -p` | `AI_COLLAB_CLAUDE_CMD` |
| Codex | `codex exec review --ignore-user-config --ephemeral` | `AI_COLLAB_CODEX_CMD` |
| Gemini | `gemini -p` | `AI_COLLAB_GEMINI_CMD` |

Adding another AI is one `case` line in `scripts/request-review.sh` plus an optional role playbook under `references/`. PRs welcome.
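For contributors, a hypothetical sketch of what that dispatch switch might look like, with defaults taken from the table above; the variable names are mine, not the script's:

```bash
# Hypothetical shape of the dispatch switch in request-review.sh (names illustrative).
case "$target" in
  claude) review_cmd="${AI_COLLAB_CLAUDE_CMD:-claude -p}" ;;
  codex)  review_cmd="${AI_COLLAB_CODEX_CMD:-codex exec review --ignore-user-config --ephemeral}" ;;
  gemini) review_cmd="${AI_COLLAB_GEMINI_CMD:-gemini -p}" ;;
  # a new AI with a CLI would be one more line here, e.g.:
  # qwen) review_cmd="${AI_COLLAB_QWEN_CMD:-qwen -p}" ;;
  *) echo "unsupported reviewer: $target" >&2; exit 1 ;;
esac
```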

Why AI peer review matters for multi-AI workflows

If you have been following the move from "AI as autocomplete" to "AI as collaborator," you have probably already noticed that the next bottleneck is not the individual models. The next bottleneck is how the models work with each other and with your tools. That is also what I argued in my agent-ready website checklist — the systems around the model matter more than the model itself.

In my own day-to-day, this looks like:

Claude Code drives feature work in one terminal
Codex CLI handles ops scripts and CI in another
Gemini does cross-checks on architecture or copy
A handful of MCP servers expose my CRM, blog, and design tools

Without AI peer review, those workflows run in parallel but in isolation. Each AI sees only what is in front of it. With the AI peer review skill in place, the diffs flow between them in a structured way, and the work compounds.

Recommended reading

The same pattern shows up across angles I have written about before — from the comparison side in Claude Code vs Cursor for 2026, from the tooling side in how to build an MCP server with TypeScript, and from the structure side in the cross-repo AI context layer. The agents in your loop need clean interfaces with each other, not just clean interfaces with humans.

Try the AI peer review skill yourself

The AI peer review skill is open-source and lives at `github.com/owgit/ai-collab-bridge`.

```bash
git clone https://github.com/owgit/ai-collab-bridge ~/.claude/skills/ai-collab-bridge
chmod +x ~/.claude/skills/ai-collab-bridge/scripts/*.sh
~/.claude/skills/ai-collab-bridge/scripts/doctor.sh
```

There are `AGENTS.md` and `CLAUDE.md` files at the repo root, so Codex and Claude Code both have explicit entry points if you point them at the repo cold. The protocol works for any AI with a CLI — same packet format, same response template, same three verdicts. In my workflow, this pairs naturally with the security review you should be running on your AI-generated code.

If you want to contribute, the next obvious steps are documented in the repo: enforcing the response format via `--output-schema`, adding support for more CLIs, and building reviewer-as-Claude scenarios beyond the reviewer-as-Codex paths I have already tested.

The point

The boundaries between AI models are conventions, not facts. We get to decide whether they cooperate or compete. If you have two AIs working on the same codebase and they are not reviewing each other, you are leaving real bugs in the code.

The AI peer review skill is one structured way to fix that. It is free, MIT-licensed, and installs in 30 seconds. Go give your AIs a colleague.

FAQ

What is the AI peer review skill?
The AI peer review skill is a free open-source Claude Code skill called ai-collab-bridge that lets Claude, Codex, Gemini, and other AI CLIs review each other's code through a structured packet-and-response protocol. Verdicts are limited to APPROVE, CONCERNS, or BLOCK and every finding requires a file:line reference.
How do I install the AI peer review skill?
Run git clone https://github.com/owgit/ai-collab-bridge into ~/.claude/skills/ai-collab-bridge, chmod +x the scripts directory, then run scripts/doctor.sh to verify your environment. Claude Code auto-discovers the skill on the next session start. The whole install takes about 30 seconds.
Does AI peer review work with Codex CLI?
Yes. The skill ships with built-in support for Codex via the codex exec review subcommand and bypasses the MCP bootstrap hang with the --ignore-user-config flag. Codex CLI is also the AI that has caught the most bugs during the bridge's self-review iterations.
Is the AI peer review skill free and open source?
Yes. The ai-collab-bridge skill is MIT licensed and free to use, fork, and contribute to. The repo is at github.com/owgit/ai-collab-bridge and includes AGENTS.md and CLAUDE.md entry points so other AI agents can discover and install it directly.
Can I use AI peer review with Gemini or other models?
Yes. Gemini works out of the box with the gemini -p invocation. Adding another AI is one case line in scripts/request-review.sh plus an optional role playbook under references/. The protocol itself is open and any AI with a CLI can participate.
