Plandex
Terminal-based AI coding agent for large projects with diff review sandbox and 2M token context
Open-source AI dev tooling, honestly reviewed. No marketing fluff, just what you get when you self-host it.
TL;DR
- What it is: Open-source (MIT) terminal-based AI coding agent — think Claude Code or Cursor, but fully self-hosted, model-agnostic, and designed for multi-file tasks that span entire codebases [README][5].
- Who it’s for: Technical founders and software developers who are tired of paying per-seat AI subscriptions, need a coding agent that handles large real-world projects, and want control over which models they use [README].
- Cost savings: Cursor Pro is $20/mo per developer. GitHub Copilot is $19/mo per seat. Plandex self-hosted is the cost of the API calls you’d make anyway — no seat fee on top [README].
- Key strength: A five-level autonomy system and a diff review sandbox that stages AI-generated changes separately from your project files until you approve them — so the agent can’t silently break things [1][2][5].
- Critical caveat: Plandex Cloud is actively winding down [homepage]. If you’re evaluating Plandex, you are evaluating self-hosted Plandex. There is no managed cloud fallback.
- Key weakness: Purely terminal-based. If your team lives in a GUI IDE, the learning curve is real. There are no visual diffs in the editor, no inline suggestions, no sidebar — just a shell and a pager [3].
What is Plandex
Plandex is a CLI-first AI coding agent that runs in your terminal and works across your entire project directory. You give it a task in plain English — “add pagination to the user list endpoint and update the frontend to match” — and it breaks the task into steps, loads the files it needs, writes the changes, and accumulates them in a sandbox until you’re ready to review and apply [README][4].
The project’s actual pitch is less glamorous than most AI coding tools: it’s designed for the tasks that make other agents fall apart. Large codebases with hundreds of files. Changes that require touching a dozen modules in sequence. Debugging loops that run npm test fifteen times before the types stop complaining. The README opens with: “Plandex is designed to be resilient to large projects and files. If you’ve found that other tools struggle once your project gets past a certain size or the changes are too complex, give Plandex a shot.” That’s a specific and honest pitch [README].
The context management backs it up: Plandex supports up to 2M tokens of effective context via chunking and project maps generated with tree-sitter, which parses your code structurally across 30+ languages. If you load a full project that’s 20M+ tokens, Plandex generates a map of the structure rather than stuffing raw files into the context window — so it can reason about where things live without paying for every line of every file in every request [README].
The project sits at 15,092 GitHub stars with 1,000+ forks and is MIT-licensed [README]. The cloud product is winding down, so the active development focus is on the self-hosted path [homepage].
Why people choose it over Cursor, Claude Code, and Aider
The competitive field here is crowded: Cursor, GitHub Copilot, Claude Code, Aider, Cline, Continue.dev. Plandex’s differentiation comes down to three bets.
The diff sandbox bet. Plandex doesn’t touch your working files until you approve. Every AI-generated change accumulates in a version-controlled sandbox — you review the diff, reject individual files if they’re wrong, and then apply [2]. At the Full autonomy level it can auto-apply, but at lower levels you see everything before it lands. Cursor and Copilot apply changes inline in real time, which is fast but means you’re constantly undoing mistakes. Aider applies changes directly to the tracked files. Plandex’s model means you can let it run a 20-step plan and then sit down to review a single consolidated diff [2][5].
The autonomy configuration bet. Plandex offers five named autonomy levels — None, Basic, Plus, Semi, and Full — each toggling different automation behaviors: auto-loading context, auto-continuing multi-step plans, auto-applying changes, running commands, and auto-debugging [5]. You don’t have to pick between “it does everything” and “I approve every token.” A developer who’s cautious uses Plus (smart context, manual apply). A developer who trusts the model uses Full (auto-apply, auto-exec, auto-debug). Both use the same tool [5].
The model-agnostic bet. Plandex isn’t locked to one provider. It supports Anthropic, OpenAI, Google, DeepSeek, and open-source providers. You can mix models per role: a reasoning model for planning, a faster cheap model for implementation, a strong model for debugging. The CLI exposes named model packs (`--daily`, `--strong`, `--cheap`, `--oss`, `--reasoning`) and lets you override individual roles [3]. Cursor is primarily OpenAI + Anthropic. Claude Code is Anthropic. Plandex gives you the picker.
Features: what it actually does
Context management:
- 2M token effective context window using chunk-based loading [README]
- Tree-sitter project maps for 30+ languages — structural understanding of large codebases [README]
- Smart context: automatically loads relevant files per task step, drops what’s no longer needed [5]
- Context caching for OpenAI, Anthropic, and Google models to reduce cost and latency [README]
- `plandex add` / `plandex rm` for manual context control; auto-load in Semi and Full modes [3][5]
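As a sketch of the manual side of context control, using the command names from the CLI reference above (the paths are placeholders, and whether `plandex add` accepts multiple paths in one call is an assumption worth checking against the docs):

```shell
# Load the files and directories relevant to the task
plandex add src/api/users.ts src/routes/

# Drop context that is no longer needed for the current step
plandex rm src/routes/legacy.ts
```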
Diff sandbox:
- All changes staged separately from project files until `plandex apply` [2]
- `plandex diff` shows changes in git-diff format; `plandex diff --ui` opens a browser-based side-by-side viewer [2]
- Reject individual files with `plandex reject file.ts`; changes not applied until you confirm [2]
- Full conversation history with `plandex convo` so you can copy raw output if the apply went sideways [2]
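A typical review pass over the sandbox, built only from the commands documented above (the file path is a placeholder):

```shell
# Inspect everything the agent has staged, in git-diff format
plandex diff

# Or view it side-by-side in the browser
plandex diff --ui

# Discard one file's changes without losing the rest
plandex reject src/api/users.ts

# Apply what's left to your actual project files
plandex apply
```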
Execution and debugging:
- `plandex debug 'npm test'` runs a command and iterates: send failure output to the model, apply fixes, retry, up to N times (default 5, configurable) [1]
- Supports any shell command: `npm run build`, `go test ./...`, `pytest`, `tsc --noEmit` [1]
- Browser debugging: if Chrome is installed, Plandex can catch JS errors and read console logs automatically [1]
- `can-exec` and `auto-exec` config flags control whether commands run at all and whether they run automatically after apply [1][5]
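Putting the debug loop to work looks like this; the command and both examples come from the execution docs, and the loop stops on success or after the retry limit:

```shell
# Run the tests; on failure, the output goes back to the model,
# a fix is applied, and the command re-runs (default: 5 attempts)
plandex debug 'npm test'

# The same loop works for any shell command
plandex debug 'tsc --noEmit'
```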
Autonomy system:
- Five levels from None (fully manual) to Full (auto-load, auto-apply, auto-exec, auto-debug, auto-commit) [5]
- Set at plan creation, per-plan, or as a default for new plans [3][5]
- Full autonomy is powerful but the docs warn explicitly: “Be extremely careful with full auto mode! It can make many changes quickly without any prompting or review, and can run commands that could potentially be destructive to your system.” [5]
Model flexibility:
- Named model packs via CLI flags (`--daily`, `--strong`, `--cheap`, `--oss`, `--reasoning`, `--gemini-planner`, `--o3-planner`, `--r1-planner`, `--opus-planner`) [3]
- Mix providers within a single plan — planner on one model, implementer on another [3]
- Context caching used across supported providers [README]
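Assuming the pack flags attach to the command that starts the REPL or a plan (the docs list the flag names, but confirm which commands accept them in the CLI reference), switching packs is a one-flag affair:

```shell
# Day-to-day work on the default mixed pack
plandex --daily

# Cheap pack for low-stakes tasks, reasoning pack for hard ones
plandex --cheap
plandex --reasoning
```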
Workflow tools:
- REPL mode (`plandex` or `pdx`) with hotkeys, streaming output, background/foreground task management [3][4]
- Chat mode (`plandex chat`) for questions without generating changes [4]
- `plandex tell` for implementation tasks; accepts inline strings, `--file`/`-f`, or piped output [4]
- Git integration: auto-commit on apply at Plus+ autonomy levels [5]
- Plan branching and rollback built into the sandbox model [README]
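The `tell` variants compose the way you’d expect from a pipe-friendly CLI (the task text and filename here are placeholders):

```shell
# Inline task description
plandex tell "add pagination to the user list endpoint"

# Read the prompt from a file
plandex tell -f task.md

# Pipe any command's output straight into a prompt
git diff | plandex tell
```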
Pricing: SaaS vs self-hosted math
Plandex Cloud: Winding down. No new accounts, and existing plans migrate to self-hosted. For new users there is nothing to weigh here: the cloud option simply doesn’t exist anymore [homepage].
Self-hosted (Community Edition):
- Software license: $0 (MIT) [README]
- VPS to run it: $5–10/mo on Hetzner, Contabo, or a machine you already have
- API costs: whatever you’d pay for the models directly — no Plandex markup
For comparison, what you’re replacing:
| Tool | Pricing |
|---|---|
| Cursor Pro | ~$20/mo per seat |
| GitHub Copilot | $19/mo per seat |
| Claude Code (Anthropic) | Usage-based, no seat fee |
| Aider | Free, self-hosted, API costs only |
| Plandex self-hosted | Free, API costs only |
For a two-person dev team, Cursor Pro is $480/year. Plandex self-hosted on a $6 Hetzner VPS is $72/year for the server, plus your API bills. If you’re already paying for Anthropic or OpenAI API access, you’re effectively replacing $480/year in seat fees with $0 in seat fees. The API costs are the same either way.
The honest caveat: at high usage, API costs for an autonomous agent running multi-step plans with 2M-token context can add up. Context caching reduces this significantly for Anthropic and OpenAI, but if you run Plandex in Full auto on a large refactor, check your API dashboard [README].
Deployment reality check
Plandex is a Go binary and an optional server. The local mode quickstart (documented at docs.plandex.ai/hosting/self-hosting/local-mode-quickstart) runs without a server — just the CLI pointing at your API keys. The server mode adds multi-user support, plan persistence across machines, and team features.
What you need for local/single-user mode:
- A machine with Go installed (or a pre-built binary)
- API keys for at least one provider (Anthropic, OpenAI, Google, etc.)
- A terminal you’re comfortable using
What you need for server mode:
- A Linux VPS
- Docker or a Go build environment
- PostgreSQL for plan/context persistence
- An SMTP provider if you want email-based auth
For a solo developer, local mode is a 30-second install per the README (“30-Second Install” link in the header). For a small team sharing plan state, the server adds maybe 30–60 minutes of setup on a fresh VPS if you’re comfortable with Docker.
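For the local/single-user path, the overall shape is roughly the following. The install one-liner is the README’s “30-Second Install”; the environment variable names are the providers’ usual conventions, so verify both against the quickstart page:

```shell
# Install the CLI (README's 30-second install)
curl -sL https://plandex.ai/install.sh | bash

# Export a key for at least one provider
export ANTHROPIC_API_KEY=...   # or OPENAI_API_KEY, etc.

# Start the REPL from your project root
cd my-project && plandex
```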
What can go wrong:
- The autonomy system, especially Full mode, can and will execute shell commands. The docs are explicit about this risk [5]. If you run Full auto with auto-exec on a plan that touches infrastructure scripts, you want a clean git state and an isolated branch first.
- Model packs assume API access to multiple providers. The “daily” pack and others mix Anthropic/OpenAI/Google. If you’re restricted to one provider, you’ll configure a custom pack [3].
- The browser debugging feature requires Chrome installed locally — it’s not available in pure server/headless environments [1].
- Plandex Cloud shutdown means there’s no escape valve if self-hosting breaks. You’re on your own for maintenance. Check the Discord (700+ members) for community support [homepage].
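Before a Full-auto run, the precaution the docs’ warning points at is ordinary git hygiene rather than anything Plandex-specific (branch name is a placeholder):

```shell
# Confirm a clean working tree before the agent touches anything
git status --porcelain

# Isolate the run on a throwaway branch
git switch -c ai/full-auto-run

# If it goes sideways, recovery is one command
git switch - && git branch -D ai/full-auto-run
```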
Pros and cons
Pros
- MIT-licensed with zero seat fees. Self-host, fork, embed — no vendor agreement needed. For teams paying per-seat on Cursor or Copilot, the savings are direct and immediate [README].
- Diff sandbox is genuinely useful. AI-generated changes staged separately until you approve is a real safety mechanism, not just a checkbox [2]. At lower autonomy levels you always get a review step before anything lands.
- Five-level autonomy system. The granularity is real — each level has specific behavioral toggles [5]. You’re not choosing between “babysit every step” and “let it run wild.”
- Model-agnostic. Named packs for DeepSeek, Gemini, Perplexity, Claude, OpenAI o3 — switch planners without reconfiguring your whole setup [3]. Avoid model lock-in.
- Context caching across providers. Cuts cost and latency on long sessions [README].
- Automated debug loop. `plandex debug 'npm test'` is a genuinely useful primitive: run, fail, fix, retry, up to N times, automatically [1]. Most coding agents make you manually re-prompt after failures.
- Tree-sitter project maps. Structural understanding of large codebases without stuffing every file into context [README]. Handles real-world project scale.
- Pipe-friendly. `git diff | plandex tell` works. You can pipe output from any command into a prompt [4]. Integrates cleanly into shell workflows.
Cons
- Terminal-only. No VS Code extension, no Cursor-style inline diffs, no sidebar chat. If your developers are GUI-first, adoption friction is real. There’s no way around this — Plandex is a CLI tool [3].
- Cloud is gone. Plandex Cloud is winding down [homepage]. There’s no managed hosting option. You self-host or you don’t use it.
- Full auto mode is genuinely risky. The docs say it explicitly [5]. Auto-exec running destructive commands on a live repo is possible. The responsibility for safe use is on you.
- API costs at scale are opaque. A 2M-token context window sounds impressive, but running multi-step plans at that scale against premium models can generate large bills. Plandex doesn’t show you cost per plan. Budget awareness requires watching your provider dashboard.
- Small community relative to Cursor/Copilot. 700+ Discord members is not a large support base [homepage]. Official docs are solid but community Q&A is thin compared to established tools.
- No offline/local LLM support out of the box. The open-source model pack uses provider APIs, not locally-running models. If you want Ollama integration you configure it yourself — Plandex doesn’t ship that [README].
Who should use this / who shouldn’t
Use Plandex if:
- You’re a technical founder or developer paying per-seat for Cursor, Copilot, or similar — and you want the same capability without the recurring seat fee.
- You work on large codebases where other AI agents lose the thread — Plandex is specifically designed for this [README].
- You prefer terminal workflows and want a coding agent that integrates cleanly with shell scripts, piped commands, and git.
- You want to mix AI providers without committing to one vendor’s ecosystem.
- You need the review-before-apply model — you want to see consolidated diffs rather than inline edits applied in real time.
Skip it if:
- You want a GUI coding assistant. Cursor, Copilot, or Continue.dev are better fits for IDE-first developers.
- You need a managed cloud option. Plandex Cloud is winding down [homepage] — self-host or don’t use it.
- Your team has no one comfortable maintaining a Go binary and CLI tooling. The setup is simple but it’s still a terminal tool.
- You want local LLM support without configuration work. There’s no Ollama integration out of the box.
Alternatives worth considering
- Aider — the most direct open-source comparison. Also terminal-based, also MIT-licensed, also API-cost-only. Aider applies changes directly to tracked files (no diff sandbox). Smaller feature surface, simpler mental model. Good starting point if Plandex feels like too much.
- Claude Code — Anthropic’s official terminal agent. Powerful, deeply integrated with Claude models, but locked to Anthropic. No model mixing, no self-hosting the agent itself.
- Cursor — the dominant GUI coding agent. Better IDE integration, larger community, harder to self-host the core logic. $20/mo per seat.
- Continue.dev — open-source VS Code/JetBrains plugin. More IDE-native than Plandex, less capable as a multi-step autonomous agent. Free to self-host.
- Cline / RooCode — VS Code extension with similar autonomous coding agent capabilities. GUI-based, model-agnostic. Growing community.
- GitHub Copilot — the incumbent. Largest install base, deepest IDE integration, closed-source, per-seat pricing, no autonomy system to speak of.
For a technical founder trying to cut per-seat AI tooling costs: the realistic shortlist is Plandex vs Aider. Choose Plandex if you want the autonomy configuration system and diff sandbox. Choose Aider if you want maximum simplicity.
Bottom line
Plandex makes a specific and honest bet: that real software development happens in large projects across dozens of files, and that most AI coding tools fall apart at exactly that scale. The 2M-token context window, tree-sitter project maps, diff sandbox, and five-level autonomy system are all aimed at that specific problem. The MIT license and zero seat fees mean you pay for the AI inference you’d pay for anyway — nothing more. The catch is that Plandex Cloud is winding down, which means self-hosting is the only path, and the tool is unambiguously terminal-first. For developers already comfortable in the shell and tired of recurring seat fees, it’s worth an afternoon of setup. For anyone who needs a GUI or a managed cloud option, look elsewhere.
Sources
1. Execution and Debugging | Plandex Docs. https://docs.plandex.ai/core-concepts/execution-and-debugging/
2. Pending Changes | Plandex Docs. https://docs.plandex.ai/core-concepts/reviewing-changes/
3. CLI Reference | Plandex Docs. https://docs.plandex.ai/cli-reference/
4. Prompts | Plandex Docs. https://docs.plandex.ai/core-concepts/prompts/
5. Autonomy | Plandex Docs. https://docs.plandex.ai/core-concepts/autonomy/
Primary sources:
- GitHub repository and README: https://github.com/plandex-ai/plandex (15,092 stars, MIT license)
- Official website: https://plandex.ai
- Local mode self-hosting quickstart: https://docs.plandex.ai/hosting/self-hosting/local-mode-quickstart