Codex GitHub Connector: How to Set Up Automated PR Reviews That Follow Your Team's Rules

How to set up the Codex GitHub App for automated PR reviews, auto-fix, and AGENTS.md-driven code standards. The setup top engineering teams use in 2026.

PhD in Computational Linguistics. I build the operating systems for responsible AI. Founder of First AI Movers, helping companies move from "experimentation" to "governance and scale." Writing about the intersection of code, policy (EU AI Act), and automation.

The Codex GitHub App turns OpenAI's Codex into a native GitHub teammate that reviews every pull request, posts inline comments on real issues, and can auto-fix problems by pushing directly to the branch. This is not a generic AI linter that flags nitpicks. Configured correctly with an AGENTS.md file that codifies your team's standards, Codex reviews like a senior engineer who never sleeps, never gets distracted, and always enforces the rules you wrote down. OpenAI uses this internally to review 100% of their own PRs before merge.

If your team is drowning in PR review backlog (and with AI agents generating 50+ PRs per day, most teams are), this is the single highest-leverage setup you can do this week.


How the Connector Works

Setup (5 Minutes)

  1. In ChatGPT or Codex, go to Settings → Connectors → GitHub
  2. Authorise the Codex GitHub App for your personal account or organisation
  3. Codex clones your repo into its cloud sandbox (reads diffs, runs commands, analyses context)
  4. In Codex settings, go to the Code review tab:
    • Toggle Code review on for specific repos
    • Toggle Automatic reviews if you want every new PR reviewed without manual triggers

That is it. Every new PR now gets a Codex review.

Three Trigger Modes

| Trigger | How | When to Use |
| --- | --- | --- |
| Automatic | Every PR gets reviewed, no action needed | Default for active repos |
| Manual | Comment @codex review on any PR → it reacts with 👀 and posts a full review | When you want reviews on demand |
| Auto-fix | Reply @codex fix the P1 issue or @codex fix it → Codex pushes the fix directly to the branch | When you want the reviewer to also be the fixer |

Codex posts reviews exactly like a human teammate: inline comments on specific lines plus a summary comment at the top.
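
Because the manual trigger is an ordinary PR comment, it can also be posted from a script. The following is a minimal sketch, not an official Codex integration: it uses GitHub's REST endpoint for issue comments (conversation comments on a pull request go through the issues endpoint). The owner, repo, PR number, and token are placeholders, and whether Codex responds to a comment posted with a token rather than typed by a person is an assumption to verify on your own repo.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_comment_request(owner: str, repo: str, pr_number: int,
                          body: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that posts a comment on a pull request."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{pr_number}/comments"
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


def request_codex_review(owner: str, repo: str, pr_number: int, token: str) -> int:
    """Post the '@codex review' comment and return the HTTP status code."""
    req = build_comment_request(owner, repo, pr_number, "@codex review", token)
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Building the request separately from sending it keeps the comment text and target URL easy to check without touching the network.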


What Makes This Different from Generic AI Linting

High-Signal Only

Codex does not flag every missing semicolon or style preference. It focuses on:

  • P0 issues: regressions, auth bypasses, security holes
  • P1 issues: missing tests, risky behaviour changes, unhandled error paths, docs gaps
  • Context-aware analysis: it sees the full PR diff, repo history, and your guidance files

No noise. No "consider renaming this variable" comments. The issues it raises are the ones a senior engineer would raise.

Your Rules, Not Generic Rules

This is the critical difference. Drop an AGENTS.md file in your repo root with a ## Review guidelines section. Codex loads the closest AGENTS.md to each changed file and follows your rules:

## Review guidelines

- Treat missing tests as P1
- Flag any new external API calls without rate-limit handling
- Verify every route uses the auth middleware
- Never allow PII logging
- Typos in docs = P1 for this repo
- Any new dependency must have a license check
- Database migrations must be reversible
- No hardcoded secrets, API keys, or credentials
- Every public function needs a docstring

Without AGENTS.md, Codex is generic. With it, Codex becomes your team's senior engineer, enforcing the exact standards you care about.
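
To make "closest AGENTS.md" concrete, here is a small sketch of one plausible reading: walk up from the changed file's directory and take the first AGENTS.md found. This illustrates the idea only; the exact lookup Codex performs is an assumption here, and the function name and paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def closest_agents_md(changed_file: str, agents_files: Iterable[str]) -> Optional[str]:
    """Return the AGENTS.md in the nearest ancestor directory of changed_file, or None."""
    known = {PurePosixPath(p) for p in agents_files}
    directory = PurePosixPath(changed_file).parent
    while True:
        candidate = directory / "AGENTS.md"
        if candidate in known:
            return str(candidate)
        if directory == directory.parent:  # reached the repo root
            return None
        directory = directory.parent


agents = ["AGENTS.md", "services/billing/AGENTS.md"]
print(closest_agents_md("services/billing/src/invoice.py", agents))
print(closest_agents_md("services/search/index.py", agents))
```

Under this reading, a service directory can carry stricter rules than the repo root without affecting the rest of the codebase.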

The Auto-Fix Loop

The most powerful pattern: review → fix → re-review.

  1. Codex reviews a PR and flags 3 issues
  2. You reply @codex fix it
  3. Codex spins up a cloud task, writes the fixes, and pushes directly to the PR branch
  4. Codex re-reviews its own fix (or you trigger another @codex review)
  5. Repeat until clean

Some teams script this loop so it runs automatically. The PR self-heals before a human even looks at it.
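
A scripted version of this loop could look roughly like the sketch below. The two callables are hypothetical stand-ins: `run_review` would trigger a review and return the open issues, and `request_fix` would post the fix request. The round limit is an added safeguard, not something the article's workflow prescribes, so the loop cannot spin forever on an issue Codex fails to fix.

```python
from typing import Callable, List


def self_heal(run_review: Callable[[], List[str]],
              request_fix: Callable[[List[str]], None],
              max_rounds: int = 3) -> List[str]:
    """Alternate review and fix until the review is clean or the round budget is spent.

    Returns the issues still open at the end (an empty list means clean).
    """
    issues = run_review()
    rounds = 0
    while issues and rounds < max_rounds:
        request_fix(issues)
        rounds += 1
        issues = run_review()
    return issues


# Example with fake reviewers: the issues shrink on each round.
fake_reviews = iter([["missing test", "unhandled error"], ["missing test"], []])
print(self_heal(lambda: next(fake_reviews), lambda issues: None))
```

Anything still open after the last round is returned, which is the natural point to hand the PR to a human.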


AGENTS.md Template: Ready to Copy

Here is a production-ready AGENTS.md for a typical engineering team. Customise the rules to match your standards.

# AGENTS.md

## Repository context

This is [your project name]. It is a [brief description].
The primary language is [Python/TypeScript/etc].
Tests live in [tests/ or __tests__/].
CI runs on [GitHub Actions / GitLab CI].

## Review guidelines

### Security (always P0)
- No hardcoded secrets, API keys, tokens, or credentials
- No PII in logs, comments, or error messages
- Every route/endpoint must use the auth middleware
- New external API calls require rate-limit handling
- SQL queries must use parameterised queries (no string interpolation)

### Testing (P1)
- Every new function must have at least one test
- Test coverage must not decrease
- Integration tests required for new API endpoints
- Mock external services in tests (no real API calls)

### Code quality (P1)
- No commented-out code in PRs
- Every public function needs a docstring
- No TODO comments without a linked issue
- Database migrations must be reversible
- New dependencies require a license compatibility check

### Documentation (P1 for this repo)
- README updates required when adding new features
- API changes require updated API docs
- Typos in user-facing docs are P1

### Style (informational only, do not block)
- Follow existing naming conventions
- Prefer early returns over deep nesting
- Keep functions under 50 lines where practical

Cost Reality: $20 vs $100 Plan

GitHub PR reviews count against your general Codex usage limits, the same pool as CLI and cloud tasks. There is no separate quota.

| Plan | Price | Reviews per 5-hour window | Real-world feel | Monthly burn |
| --- | --- | --- | --- | --- |
| Plus | $20/month | 20-50 reviews | Burns fast. One complex PR can eat a quarter of your window. | Not enough for active repos; you hit limits within days |
| Pro | $100/month | 5-10× more (promo 10× until May 31, 2026) | Sustainable for teams. Handles heavy daily use without throttling. | Handles heavy daily use across multiple repos |

Recommendation: If you are using Codex for both CLI coding and PR reviews, the $20 Plus plan will run out fast. The $100 Pro plan is what most engineering teams upgrade to; the limit headroom makes the difference between "useful tool" and "indispensable teammate."
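
Because reviews share one pool with CLI and cloud tasks, a rough back-of-the-envelope estimate helps when choosing a plan. The sketch below assumes usage scales linearly and that you can guess what fraction of the pool your other Codex work consumes; both are simplifications, and the 20-50 range is the article's estimate for Plus rather than a published quota.

```python
def reviews_left_per_window(window_capacity: int, share_used_elsewhere: float) -> int:
    """Reviews remaining in one 5-hour window after CLI and cloud tasks take their share."""
    if not 0.0 <= share_used_elsewhere <= 1.0:
        raise ValueError("share_used_elsewhere must be between 0 and 1")
    return int(window_capacity * (1.0 - share_used_elsewhere))


# Plus plan, both ends of the 20-50 range, with half the pool spent on CLI work.
print(reviews_left_per_window(20, 0.5))  # 10
print(reviews_left_per_window(50, 0.5))  # 25
```

If your team opens more PRs per window than the lower figure, automatic reviews on Plus will likely hit the limit.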


How Top Teams Use This in Practice (May 2026)

Based on patterns from power users and engineering teams:

1. Auto-Review Every PR

Enable automatic reviews on all active repos. Every PR gets a Codex review before any human looks at it. This creates a quality floor: the human reviewer only sees PRs that have already passed the AI's checks.

2. Maintain a Living AGENTS.md

Your AGENTS.md is not a one-time file. Update it when:

  • A new type of bug makes it to production (add a rule to catch it)
  • The team agrees on a new standard (codify it)
  • A false positive keeps appearing (refine the rule)

The best teams treat AGENTS.md like a living runbook. Every incident review asks: "Could we have caught this with an AGENTS.md rule?"

3. Use @codex fix Aggressively

Do not manually fix issues that Codex flagged. Reply @codex fix it and let it push the fix. This is faster and creates a verifiable fix-commit that maps directly to the review comment.

4. Let Codex Review Agent-Generated PRs

If you have agents (Codex CLI, Claude Code, Kimi) generating PRs, Codex can review those agent-generated PRs. This creates an agent-reviewing-agent loop. When the generating agent and the reviewing agent are different, the risk of shared blind spots goes down.

5. Combine with GitHub Actions

Add a GitHub Action that triggers on PR creation and posts a summary comment with test results, lint status, and security scan findings. Codex reads these when reviewing, giving it additional context beyond the diff.
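
The summary comment itself can be produced by a short script that the workflow runs after the checks finish. This is a sketch under assumptions: the `CheckResult` type and the comment layout are hypothetical, the step that posts the text to the PR is left out, and whether Codex picks up PR comments as review context is the claim made above rather than something this code verifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckResult:
    name: str          # e.g. "tests", "lint", "security scan"
    passed: bool
    details: str = ""  # optional short note, such as a failure count


def render_summary(results: List[CheckResult]) -> str:
    """Render CI check results as a short plain-text PR comment."""
    lines = ["CI summary", ""]
    for result in results:
        status = "PASS" if result.passed else "FAIL"
        suffix = f" ({result.details})" if result.details else ""
        lines.append(f"- {result.name}: {status}{suffix}")
    return "\n".join(lines)


print(render_summary([
    CheckResult("tests", True),
    CheckResult("lint", False, "3 warnings"),
]))
```

Keeping the comment short and uniform makes it easy for both human and automated reviewers to scan.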


What Codex PR Review Cannot Do

Be honest about the limits:

  • Architecture decisions: Codex can flag patterns but cannot evaluate whether a design decision is right for your business context
  • Business logic correctness: it checks code, not whether the feature solves the right problem
  • Cross-PR dependencies: it reviews one PR at a time, not the interaction between PRs that should merge in order
  • Breaking change detection across services: if the PR changes an API that other services consume, Codex only sees the PR's repo
  • Emotional judgment: it cannot tell if a PR is "too ambitious" or "not worth the complexity"

These remain human decisions. The value of Codex review is not replacing humans; it is freeing them from the 80% of review work that is pattern-matching so they can focus on the 20% that requires judgment.


Codex vs Claude Code /review vs CodeRabbit

| Feature | Codex GitHub App | Claude Code /review | CodeRabbit |
| --- | --- | --- | --- |
| Auto-review on PR creation | Yes | No (manual trigger) | Yes |
| Auto-fix (push to branch) | Yes (@codex fix) | No | No |
| Custom rules | AGENTS.md | CLAUDE.md | .coderabbit.yaml |
| Inline comments | Yes | Yes (local only) | Yes |
| Cost | $20-100/month (shared pool) | Usage-based (Anthropic API) | $15-25/seat/month |
| Cloud sandbox | Yes (clones repo) | No (runs locally) | Yes |
| Fix loop | Yes (review → fix → re-review) | No | No |

Codex wins when you want auto-review + auto-fix as a native GitHub integration. Claude Code /review wins for local-first teams who want review without cloud access to the repo. CodeRabbit wins for teams that want a dedicated review tool with per-seat pricing and no shared usage pool.


Frequently Asked Questions

Does Codex have access to my private code?

Yes. When you authorise the GitHub App, Codex clones your repo into its cloud sandbox. This is the same trust model as any CI/CD service (GitHub Actions, CircleCI). If your organisation prohibits third-party code access, this is a blocker.

Can Codex review PRs on GitHub Enterprise Server?

As of May 2026, the connector works with GitHub.com (cloud). GitHub Enterprise Server (on-premises) support depends on your network configuration: the Codex sandbox needs to be able to reach your GHE instance.

What happens if Codex and a human reviewer disagree?

Nothing automatic. Codex posts comments; the PR author and human reviewer decide what to act on. Codex does not block merges; it is advisory. If you want blocking reviews, add Codex as a required reviewer in branch protection (but most teams keep it advisory).

How do I prevent Codex from reviewing certain repos?

In the Code review settings, you toggle repos individually. Repos not enabled are not reviewed. A repo-level opt-out file (such as a .codex-ignore or an equivalent config) may also be available; check the current Codex docs for the exact mechanism before relying on it.

Is this worth it if I already use GitHub Copilot?

Yes. Copilot helps you write code. Codex reviews the code after it is written. They are complementary: one generates, the other reviews. The highest-performing teams use both.


Set This Up Today

The Codex GitHub connector takes 5 minutes to enable and an hour to configure well (mostly writing your AGENTS.md). Once it is running, every PR gets a review that follows your team's exact standards. The review backlog shrinks. The merge velocity increases. The code quality floor rises.

If your team needs help designing an AGENTS.md that captures your engineering standards, building a review automation pipeline, or evaluating whether Codex, Claude Code, or CodeRabbit is the right fit, start with an AI Readiness Assessment or explore AI Consulting.