Agentic Use (Core & MCP)

CURD operates fundamentally as a deterministic backend for AI code agents. It is built with zero-trust principles, allowing you to secure your codebase exactly how you prefer.

In CURD v0.7.1, there is a Unified Control Plane. The CLI, REPL, Python/Node bindings, and the MCP server all sit on the exact same validation and routing layer. This means MCP does not get a “privileged lane” to bypass organizational constraints, and the CLI cannot circumvent the policy engine.

There are two primary paradigms for integrating agents, depending on your desired level of autonomy and security:

Paradigm 1: Compiled Planning (The Safest Route)

For the highest level of human oversight, you can configure AI agents to purely generate .curd execution plans rather than modifying your code directly.

In this workflow, the agent outputs an intent script that must be compiled into a strictly governed plan artifact.


# Read what the agent wants to do
curd run check agent_proposal.curd
 
# Compile into a locked execution artifact
curd run compile agent_proposal.curd
 
# Edit the compiled defaults to enforce stricter boundaries (e.g. lowering retry limits)
curd plan edit <plan_id>

This guarantees the AI cannot accidentally break your repository without an explicit, verifiable intent graph.

Paradigm 2: Live Autonomy (The MCP Sandbox)

The simplest and most standardized way to grant an AI model active, autonomous access to your repository is via the Model Context Protocol (MCP).

Initialization

Ensure CURD is initialized in the workspace, and run:


curd init-agent --name my_agent
curd mcp .

This boots a persistent structured JSON-RPC server listening via stdio.

The `tools/list` Metadata

CURD’s MCP tools/list implementation provides rich metadata to connected agents, including:

Capability Atoms: Maps the tool to an explicit right (e.g., search maps to lookup).
Runtime Ceilings: Whether a tool is available under the restrictive lite ceiling.
Session/Approval Hints: Indicates whether a tool requires an active session or a human review prompt.

Agents respecting this metadata adapt safely to the host’s constraints without relying on brittle prompt engineering.

The Policy Engine & Workspace Sessions

CURD v0.7.1 enforces a strict security hierarchy on every transaction:

Hard Security: Config tampering, .curd/ internal access, and Sandbox jailbreaks are blocked.
Runtime Ceilings: The lite profile outright disables mutation tools, restricting the agent to purely read and search behaviors.
Blocklist & Allowlist: Explicit glob paths defined in settings.toml.
Mode Fallback: strict (Default Deny), audit, or permissive.

Mandatory Workspace Sessions

Mutating tools (like edit), executing mutating DSLs, or executing compiled plans require an active workspace session.

An agent cannot silently patch code in the background. It must wrap its logic:


curd workspace begin
# ... execute edits ...
curd workspace commit

Mandatory Plan Gating

For organizations requiring total transparency, CURD can enforce Mandatory Planning.


[policy]
require_plan_for_mutations = true

If enabled, agents are physically blocked from calling any mutation tool unless it is part of a registered and validated plan artifact.

Binary Firewall (Shell Control)

Teams can restrict the exact binaries an agent is allowed to execute via the shell tool.


[policy]
allowed_binaries = ["cargo", "npm", "pytest"] # Agents cannot run 'rm', 'sed', etc.
block_shell_metachars = true # Blocks ;, &, |, `, $( )

Semantic Backtrace Propagation & Graph Poisoning

When an agent executes a build or test command via the shell tool using curd build or curd test, CURD intercepts the process stderr.

Instead of dumping raw terminal text back to the model, CURD performs deterministic parsing of compiler diagnostics. It maps failures directly back to the AST and poisons the specific node in the Semantic Graph.

The agent can then query the graph or run curd test (the new cohesion audit flow) to instantly identify the specific function that failed, seeing its exact blast radius without ever reading a log file.