Patrick Kelly · technical · mcp · claude-code

How Forge Gives Claude Code Structural Understanding

A technical deep-dive into server instructions, workflow tools, and smart descriptions — the three-layer architecture that lets Forge guide AI agents without supervision.


Without Forge, Claude Code gets a request to “refactor the payment flow” and starts reading files. It reads payments.ts, then billing.ts, then checkout.ts, then it guesses at related files and reads those too. Forty files later, it makes the changes and hands them back. Two call sites got missed. Three tests will fail at CI. The developer finds this out the hard way.

With Forge, the same request looks like this: Claude Code calls forge_trace_dependents("src/utils/payments.ts"), gets back a precise list of 12 dependent files, makes targeted changes to all 12, and calls forge_validate at the end. The refactor is clean on the first pass.

That difference is not about the model being smarter. It is about the model having access to structural facts it would otherwise have to guess at. This post explains exactly how Forge delivers that information and, more importantly, how it teaches the agent to use it.


The Problem With MCP Servers That Just Expose Tools

The MCP protocol lets you register tools with an AI agent. The agent can call those tools. This is powerful in theory. In practice, most MCP servers expose tools and then get out of the way — the agent has to figure out when to call what and why.

An agent with access to a search_codebase tool still has to know: “I should call this before I modify a file.” An agent with access to a check_imports tool still has to know: “I should call this after I make changes.” Nothing in the MCP spec teaches the agent those behaviors. The tool descriptions help, but they are read once and mostly forgotten as the context window fills up.

This is the problem Forge solves at three layers.


Layer 1: Server Instructions

When an MCP client (Claude Code, Cursor, any MCP host) connects to Forge, Forge sends a set of behavioral rules in the initialize response. These rules are injected into the agent’s system prompt automatically — the user does not have to configure anything.

Here is what those instructions look like:

You have access to the Forge codebase intelligence server.

MANDATORY: Before modifying any file, call forge_prepare(file) on that file.
forge_prepare returns a GO/CAUTION/STOP assessment. Do not proceed on STOP.

MANDATORY: After making changes to any file, call forge_validate(paths) to
confirm no broken imports or circular dependencies were introduced.

USE forge_trace_dependents when you need to know what depends on a symbol
or file before renaming, moving, or removing it.

USE forge_understand when you need a comprehensive summary of a module's
purpose, dependencies, and current health status before making architectural
changes.

These are not suggestions. They are operating procedures.

The key design choice: these are rules, not hints. The agent treats them like operating procedures because they are framed as such. Framing matters — an agent that is told “you might want to call forge_prepare” will sometimes skip it. An agent told “before modifying any file, call forge_prepare” will not.
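Mechanically, this works because the MCP spec's initialize response has an optional instructions field that the host injects into the agent's context. A minimal sketch of what Forge's initialize result might look like, with the instruction text abridged from the rules above (the version string and exact wire shape are illustrative, not taken from Forge's source):

```typescript
// Sketch of an MCP initialize response carrying server instructions.
// Field names follow the MCP spec's InitializeResult type.
interface InitializeResult {
  protocolVersion: string;
  capabilities: { tools?: object };
  serverInfo: { name: string; version: string };
  instructions?: string; // the host injects this into the agent's system prompt
}

const initializeResult: InitializeResult = {
  protocolVersion: "2025-03-26",
  capabilities: { tools: {} },
  serverInfo: { name: "forge", version: "0.0.0" }, // version is a placeholder
  instructions: [
    "MANDATORY: Before modifying any file, call forge_prepare(file).",
    "MANDATORY: After making changes, call forge_validate(paths).",
  ].join("\n"),
};

console.log(initializeResult.instructions!.split("\n").length); // prints 2
```

The important property is that the instructions arrive once, at connect time, with no user configuration: the host treats them as part of the server handshake, not as something the developer pastes into a prompt.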


Layer 2: Workflow Tools

The second layer is the three composite workflow tools: forge_prepare, forge_validate, and forge_understand. These are not atomic tools that each do one thing. They run multiple analyses in parallel and return a structured summary designed to give the agent everything it needs to make a decision.

forge_prepare is the pre-modification gate. Here is an example response:

{
  "file": "src/core/license.rs",
  "assessment": "CAUTION",
  "dependents": {
    "count": 47,
    "direct": [
      "src/cli/activate.rs",
      "src/cli/serve.rs",
      "src/mcp/server.rs",
      "forge-mcp/src/tools/workflow.rs"
    ],
    "indirect_sample": ["forge-cli/src/main.rs", "..."]
  },
  "health": {
    "broken_imports": 0,
    "dead_exports": 2,
    "in_cycle": false,
    "cycle_detail": null
  },
  "coverage": {
    "line_coverage_pct": 34,
    "covered_lines": 89,
    "total_lines": 261,
    "assessment": "LOW"
  },
  "git": {
    "last_modified": "2026-04-14",
    "churn_30d": 8,
    "primary_author": "dev@example.com"
  },
  "caution_reasons": [
    "47 dependents — changes have wide blast radius",
    "Low test coverage (34%) — changes may break untested paths",
    "2 dead exports — consider cleaning up before modifying"
  ],
  "go_conditions": [
    "No broken imports in current state",
    "No circular dependency involvement"
  ]
}

The agent reads this and adjusts its plan. A 47-dependent file with 34% coverage warrants more careful changes than a leaf module with zero dependents and 90% coverage. The agent does not have to reason about what “careful” means in the abstract — the CAUTION assessment with explicit reasons gives it something to work with.
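To make the decision logic concrete, here is a hypothetical client-side gate over a forge_prepare response. The field names mirror the JSON above; the thresholds (20 dependents, 50% coverage) are illustrative assumptions, not Forge's actual cutoffs:

```typescript
// Hypothetical gate: turn a forge_prepare response into a plan of action.
type Assessment = "GO" | "CAUTION" | "STOP";

interface PrepareResponse {
  file: string;
  assessment: Assessment;
  dependents: { count: number };
  coverage: { line_coverage_pct: number };
  caution_reasons: string[];
}

function planFromPrepare(resp: PrepareResponse): string {
  switch (resp.assessment) {
    case "STOP":
      return `Do not modify ${resp.file}: ${resp.caution_reasons.join("; ")}`;
    case "CAUTION":
      // Wide blast radius plus thin tests => smaller, individually verified steps.
      return resp.dependents.count > 20 && resp.coverage.line_coverage_pct < 50
        ? `Modify ${resp.file} incrementally; validate after each edit`
        : `Modify ${resp.file} carefully; validate once at the end`;
    case "GO":
      return `Modify ${resp.file}; validate at the end`;
  }
}

const plan = planFromPrepare({
  file: "src/core/license.rs",
  assessment: "CAUTION",
  dependents: { count: 47 },
  coverage: { line_coverage_pct: 34 },
  caution_reasons: ["47 dependents", "Low test coverage (34%)"],
});
console.log(plan); // prints the "incrementally" variant for this input
```

The point is not these particular thresholds; it is that the structured response makes this kind of branching trivial for the agent, where a prose summary would not.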

forge_validate is the post-modification gate. It runs after changes and checks that no broken imports were introduced, no new circular dependencies appeared, and the files that were supposed to be updated were actually updated:

{
  "assessment": "GO",
  "checked_paths": ["src/core/license.rs", "src/cli/activate.rs"],
  "broken_imports": [],
  "new_cycles": [],
  "missed_dependents": [],
  "summary": "Changes are clean. No broken imports. No circular dependencies introduced."
}

If it returns a STOP:

{
  "assessment": "STOP",
  "broken_imports": [
    {
      "file": "src/mcp/server.rs",
      "line": 12,
      "import": "crate::core::license::validate_key",
      "reason": "Symbol was renamed to validate_license_key in the refactor but this import was not updated"
    }
  ],
  "new_cycles": [],
  "missed_dependents": ["forge-mcp/src/tools/workflow.rs"],
  "summary": "1 broken import, 1 missed dependent. Fix before committing."
}

The agent sees this, goes back, fixes the broken import, and calls forge_validate again. This loop is automatic — the agent follows the server instructions it received at connect time.
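That loop can be sketched as a small retry routine. This is an assumption about how a client might wire it up, not Forge's code: callTool stands in for the MCP tool-call mechanism, and applyFixes stands in for the agent's own editing step:

```typescript
// Sketch of the edit-then-validate loop the server instructions induce.
interface ValidateResult {
  assessment: "GO" | "STOP";
  broken_imports: { file: string; import: string }[];
  missed_dependents: string[];
}

async function editWithValidation(
  paths: string[],
  callTool: (name: string, args: object) => Promise<ValidateResult>,
  applyFixes: (result: ValidateResult) => Promise<void>,
  maxRounds = 3,
): Promise<boolean> {
  for (let round = 0; round < maxRounds; round++) {
    const result = await callTool("forge_validate", { paths });
    if (result.assessment === "GO") return true; // clean: done
    await applyFixes(result); // fix broken imports / missed dependents, then re-check
  }
  return false; // out of retries: escalate to the user instead of looping forever
}
```

A bounded retry count matters here: a validator that keeps returning STOP should eventually surface to the human rather than burn the agent's context on an unwinnable loop.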


Layer 3: Smart Tool Descriptions

The third layer is the descriptions on every individual tool. MCP tool descriptions are usually a sentence or two explaining what the tool does. Forge’s descriptions are mini-pedagogies explaining not just what the tool does but when to call it and why it is better than alternatives.

Compare a naive description:

forge_trace_dependents — Finds files that import a given file or symbol.

With Forge’s actual description:

forge_trace_dependents(path, symbol?) — Trace all files that directly or transitively depend on a file or exported symbol. Call this before renaming, moving, or removing any exported symbol or file. Prefer this over grep because it understands TypeScript re-exports, Python from X import *, Rust pub use chains, and Go package aliases that grep cannot reliably trace. Returns a ranked list sorted by dependency depth, with the most directly dependent files first.

The second description teaches the agent three things: when to call it (before rename/move/remove), why it is better than the alternative (re-exports and import aliases), and how to interpret the output (ranked by depth). That is three fewer reasoning steps the agent has to perform from scratch during a task.

This design compounds: 21 tools with smart descriptions means 21 fewer reasoning gaps for the agent to fill in.
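For readers unfamiliar with how tool descriptions travel over MCP, here is the general shape of a tool definition carrying a "smart" description. The structure follows MCP's Tool schema (name, description, inputSchema); the registry type is a stand-in for illustration, not Forge's actual registration code:

```typescript
// Illustrative MCP-style tool definition with a description that teaches
// when to call the tool, why to prefer it, and how to read its output.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, object>;
    required?: string[];
  };
}

const traceDependents: ToolDefinition = {
  name: "forge_trace_dependents",
  description:
    "Trace all files that directly or transitively depend on a file or exported symbol. " +
    "Call this before renaming, moving, or removing any exported symbol or file. " +
    "Prefer this over grep: it understands re-exports and import aliases grep cannot trace. " +
    "Returns a ranked list sorted by dependency depth, most direct dependents first.",
  inputSchema: {
    type: "object",
    properties: {
      path: { type: "string" },
      symbol: { type: "string" },
    },
    required: ["path"],
  },
};

console.log(traceDependents.description.includes("before renaming")); // prints true
```

Nothing in the protocol changes; the same description field every MCP server fills in is simply doing three jobs instead of one.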


A Real Agent Transcript

Here is what an actual Forge-assisted refactor session looks like in Claude Code, condensed:

User: Rename the LicenseError enum to LicenseValidationError throughout the codebase.

Claude: I'll trace dependents first to understand the full scope.

[calls forge_trace_dependents("crates/forge-core/src/license.rs", "LicenseError")]

forge_trace_dependents returned 9 files that reference LicenseError directly:
  - forge-cli/src/commands/activate.rs (line 23, 31, 45)
  - forge-cli/src/commands/license.rs (line 12, 18)
  - forge-mcp/src/tools/workflow.rs (line 67, 89, 112)
  - forge-mcp/src/tools/health.rs (line 44)
  - forge-core/src/license.rs (definition, lines 8-14)
  + 4 test files

[calls forge_prepare("crates/forge-core/src/license.rs")]

forge_prepare returned CAUTION: 47 dependents total, low coverage (34%).
Proceeding with the rename across all 9 direct references.

[makes changes to all 9 files]

[calls forge_validate(["crates/forge-core/src/license.rs", ...])]

forge_validate returned GO. No broken imports. No missed dependents.
Rename is complete and clean across all 9 files.

Without Forge, the same task would likely miss some of the 9 files — agents reading the codebase via file listing and search typically hit 5-6 of them and miss the rest. With Forge, the list is complete because forge_trace_dependents is doing AST-level symbol tracking, not text search.


Why Self-Hosted Matters for This Architecture

The server instructions and workflow tools only work well if the tool responses are fast. forge_prepare runs dependency tracing, health checks, coverage lookup, and git history in parallel. On a 600-file TypeScript codebase, it returns in under 400ms.

That latency is only achievable because Forge runs locally. The index is SQLite on your machine. There is no round-trip to a cloud API. No authentication overhead. No rate limiting. No token cost for the analysis itself.

If Forge were a cloud service, every forge_prepare call would add 1-3 seconds of API latency. A refactor that calls forge_prepare on 8 files before modifying them would spend 8-24 seconds on network overhead alone — enough to meaningfully degrade the user experience and make developers start skipping the call.

Self-hosted is not just a privacy story. It is a performance story.


Try It

If you use Claude Code, Cursor, or any MCP-compatible agent, you can try Forge today. Community Mode lets you index one repo and search it from the CLI. To integrate Forge with your AI agent and unlock all 21 tools (including the workflow composites that make the three-layer architecture work), start a 14-day free trial — no charge until day 15.

Download at forge.ironpinelabs.com. Run forge setup, then forge index . on your repo. Add it to your MCP config. Run one refactor. Measure the delta in your own workflow — that is more useful than any benchmark I could publish here.
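Under the assumption that your agent is Claude Code, the setup might look like the following. The forge setup and forge index commands are from this post; the subcommand that starts the MCP server is an assumption — check forge --help on your install for the exact name:

```shell
# Index the repo (commands from the post)
forge setup
forge index .

# Register Forge with Claude Code via its MCP config CLI.
# "forge serve" is a guess at the server subcommand; verify with `forge --help`.
claude mcp add forge -- forge serve
```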

Questions and technical feedback welcome at support@forge.ironpinelabs.com and GitHub Discussions.