Your Coding Agent Isn’t Dumb. It’s Rummaging.
How CodeGraph cuts an AI agent’s tool calls by ~70% by handing it a map of your codebase instead of making it grep blind, and the fine print on when that actually helps.
Point an AI coding agent at a codebase it has never seen and watch what it does. It greps for a keyword. Opens a file. Wrong one. Opens three more. Reads entire files just to locate a single function. Every one of those is a tool call, and every tool call ships tokens to the model and bills you for them.
The model isn’t confused. It’s rummaging. And on a large repo, that rummaging, not the actual thinking, is most of your bill.
That’s the problem CodeGraph sets out to kill. It’s an open-source tool that went from a quiet launch in January to nearly 30,000 GitHub stars, most of them in a single week on the trending page. The pitch behind that climb fits in one sentence: before your agent asks anything, CodeGraph scans the whole repo once and builds a map of every symbol and how they all connect, so the agent can query the map instead of reading files.
How the map gets built
The pipeline comes down to three moves.
Parse it. CodeGraph runs tree-sitter, the same parser your editor uses for syntax highlighting, across more than 20 languages, from TypeScript to Rust to Swift. Tree-sitter turns each file into a syntax tree; CodeGraph walks that tree and pulls out the symbols that matter: every function, class, method, type, and import. No model, no embeddings, just a parser reading structure.
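To make the parse step concrete, here is a minimal sketch of walking a syntax tree and pulling out symbols. It is not CodeGraph’s code: it swaps in Python’s built-in ast module as a stand-in for tree-sitter so it runs with no dependencies, and the sample source and the extract_symbols helper are invented for illustration.

```python
import ast

# Hypothetical sample file; any Python source would do.
SOURCE = '''
import os
from pathlib import Path

class Indexer:
    def scan(self, root):
        return list(Path(root).rglob("*.py"))

def main():
    return Indexer().scan(os.getcwd())
'''

def extract_symbols(source: str) -> list[tuple[str, str, int]]:
    """Return (kind, name, line) for each import, class, function, and method."""
    symbols: list[tuple[str, str, int]] = []

    def visit(node: ast.AST, inside_class: bool = False) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.Import):
                symbols.extend(("import", alias.name, child.lineno) for alias in child.names)
            elif isinstance(child, ast.ImportFrom):
                symbols.append(("import", child.module or ".", child.lineno))
            elif isinstance(child, ast.ClassDef):
                symbols.append(("class", child.name, child.lineno))
                visit(child, inside_class=True)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "method" if inside_class else "function"
                symbols.append((kind, child.name, child.lineno))
                visit(child, inside_class=False)
            else:
                # Keep descending so definitions nested in if/try blocks are found.
                visit(child, inside_class)

    visit(ast.parse(source))
    return symbols

symbols = extract_symbols(SOURCE)
for kind, name, line in symbols:
    print(f"{kind:8} {name:10} line {line}")
```

The point carries over to any parser: the structure is read straight off the tree, with no model in the loop.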
Connect it. Each symbol becomes a node, and CodeGraph tracks 23 kinds of them, from functions and classes all the way to enums and route handlers. Then it draws the edges: this function calls that one, this class extends that one, this file imports from over there. Twelve relationship types in all. Stack thousands of those together and you get a queryable picture of how the codebase actually fits together.
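The node-and-edge idea can be sketched in a few lines. This is an assumed, simplified model rather than CodeGraph’s actual schema: the Node, Edge, and SymbolGraph classes, the three edge kinds shown, and every symbol name below are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    kind: str   # e.g. "function" or "class"
    name: str
    file: str

@dataclass(frozen=True)
class Edge:
    kind: str   # e.g. "calls", "extends", "imports"
    source: str
    target: str

class SymbolGraph:
    """Toy in-memory graph indexed in both directions for fast lookups."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}
        self.outgoing: dict[str, list[Edge]] = defaultdict(list)
        self.incoming: dict[str, list[Edge]] = defaultdict(list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_edge(self, edge: Edge) -> None:
        self.outgoing[edge.source].append(edge)
        self.incoming[edge.target].append(edge)

    def callers(self, name: str) -> list[str]:
        return sorted(e.source for e in self.incoming.get(name, []) if e.kind == "calls")

    def callees(self, name: str) -> list[str]:
        return sorted(e.target for e in self.outgoing.get(name, []) if e.kind == "calls")

graph = SymbolGraph()
for node in [
    Node("function", "parse_config", "config.py"),
    Node("function", "load_settings", "settings.py"),
    Node("function", "start_server", "server.py"),
    Node("function", "run_tests", "tests.py"),
    Node("class", "BaseHandler", "handlers.py"),
    Node("class", "ApiHandler", "handlers.py"),
]:
    graph.add_node(node)

for edge in [
    Edge("calls", "load_settings", "parse_config"),
    Edge("calls", "start_server", "load_settings"),
    Edge("calls", "run_tests", "parse_config"),
    Edge("extends", "ApiHandler", "BaseHandler"),
]:
    graph.add_edge(edge)

print(graph.callers("parse_config"))   # ['load_settings', 'run_tests']
print(graph.callees("start_server"))   # ['load_settings']
```

Indexing edges by both source and target is what makes "who calls this?" a dictionary lookup instead of a scan over every file.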
Store and serve it. The whole graph drops into a plain SQLite database on your machine, with full-text search bolted on so a symbol can be found by name instantly. CodeGraph then exposes about ten tools to your agent over MCP (the Model Context Protocol), covering things like fetching context, tracing callers and callees, and mapping the blast radius of a change. The agent calls those instead of grep: one query rather than fifty file reads. A file watcher re-indexes whatever you touch as you save, using native OS file events, so the map doesn’t rot while you work.
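As a rough sketch of what "one query rather than fifty file reads" can look like, the snippet below stores a tiny graph in SQLite and computes a blast radius with a recursive query. The table layout, the blast_radius function, and the symbol names are assumptions made for illustration, not CodeGraph’s real schema or tool names, and the sketch leaves out full-text search and relies on a plain index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real index would live in a file on disk
conn.executescript("""
CREATE TABLE nodes (name TEXT PRIMARY KEY, kind TEXT NOT NULL, file TEXT NOT NULL);
CREATE TABLE edges (source TEXT NOT NULL, target TEXT NOT NULL, kind TEXT NOT NULL);
CREATE INDEX edges_by_target ON edges (target, kind);
""")

conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    ("parse_config", "function", "config.py"),
    ("load_settings", "function", "settings.py"),
    ("start_server", "function", "server.py"),
    ("run_tests", "function", "tests.py"),
])
conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
    ("load_settings", "parse_config", "calls"),
    ("start_server", "load_settings", "calls"),
    ("run_tests", "parse_config", "calls"),
])

def blast_radius(conn: sqlite3.Connection, symbol: str) -> list[str]:
    """Every symbol that calls `symbol`, directly or through other callers."""
    rows = conn.execute("""
        WITH RECURSIVE affected(name) AS (
            SELECT source FROM edges WHERE target = ? AND kind = 'calls'
            UNION  -- UNION drops duplicates, so call cycles still terminate
            SELECT e.source
            FROM edges AS e JOIN affected AS a ON e.target = a.name
            WHERE e.kind = 'calls'
        )
        SELECT name FROM affected ORDER BY name
    """, (symbol,)).fetchall()
    return [row[0] for row in rows]

impacted = blast_radius(conn, "parse_config")
print(impacted)  # ['load_settings', 'run_tests', 'start_server']
```

The agent gets the full set of affected callers back from a single call, without opening any of the files involved.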
For a project with this many stars, the core is refreshingly readable. The orchestrator is essentially a single file of around a thousand lines that pulls in the extractor, the graph, the database, and the MCP server and binds them into one engine that can build a knowledge graph from any codebase.
Isn’t this just RAG for code?
This is the interesting part, because the answer is no, and the difference is the whole point. RAG guesses what’s relevant by similarity: embed the code, drop it in a vector database, retrieve the nearest matches. A graph doesn’t guess. It parsed the actual relationships, so when you ask who calls this function, it returns an exact answer rather than a hunch. There are no embeddings, no vector database, no API keys, and nothing leaves your machine.
And to be clear about what it does and doesn’t do: none of this makes the model smarter. It just stops it from rummaging.
Does it actually save anything?
According to the project’s own benchmarks (same model, seven real codebases, run with and without it), CodeGraph came out roughly 35% cheaper with about 70% fewer tool calls. On Excalidraw, the agent went from 79 tool calls down to three.
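For scale, the Excalidraw example works out to a much steeper drop than the roughly 70% reported across all seven codebases. The arithmetic uses only the two call counts quoted above:

```python
calls_without = 79  # tool calls on Excalidraw without CodeGraph
calls_with = 3      # tool calls on Excalidraw with CodeGraph

reduction = (calls_without - calls_with) / calls_without
print(f"{reduction:.1%}")  # 96.2%
```

That makes it a best case rather than a typical one, which is worth keeping in mind for the caveats that follow.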
Now read the fine print. Credit to the project here, because its own docs spell most of this out:
- The gains scale with size. On a small project, native grep is already cheap and the savings nearly vanish.
- The numbers are the author’s own, measured on a single model, and the project is still pre-1.0.
- The README and the package file don’t even agree on the exact figures.
- If your agent hands work off to file-reading sub-agents, CodeGraph becomes pure overhead. Those are the project’s own words.
The verdict
So here’s the call. If your agent keeps re-reading the same giant codebase every session, CodeGraph is worth a star and an afternoon. Wiring it up is a single npx @colbymchenry/codegraph. If your repo already fits comfortably inside the model’s context window, you probably don’t need it.
Either way, the underlying lesson holds. AI coding agents aren’t slow because the model is slow. They’re slow because they spend most of their time figuring out where things live. Solve that once, locally, keep the map fresh, and the agent gets to spend its tokens on the work that actually matters.
