Warp Engine

Event-sourced context management for AI coding agents. Instead of growing conversation history, Warp Engine records every action as an event and assembles fresh, deterministic context for every model call.

What is Warp Engine?

Every AI coding tool has the same hidden problem: memory is a lie.

When you ask an AI to "fix the auth bug," it reads files, makes changes, runs tests. Then you say "now add logging." The model receives the entire conversation so far — your first message, its response, file contents from ten minutes ago, tool outputs from three iterations back — and tries to work with it.

By turn five, 20,000 tokens of accumulated history. By turn ten, 50,000. Most of it stale. The model pays attention to all of it equally, because it has no way to know what's current and what's garbage.

Warp Engine takes a completely different approach. It doesn't keep conversation history. It keeps an event log. Every action — every file read, edit, command, test, decision — is recorded as a structured event in an append-only SQLite database. When the model needs to act, Warp Engine assembles fresh context from the event log. Right now. At this moment. The conversation history? Thrown away. Every turn.

Events are memory. Context is computed.

The model gets a clean, deterministic picture of reality computed from verified facts — not a growing narrative it has to parse and hope it interprets correctly. Constant context size. Perfect auditability. Every turn grounded in what's actually true right now.

How It Works

User Intent ("fix the auth bug")
        ↓
Event Log (SQLite)
  UserIntent: "fix the auth bug" · FileRead: auth.py L45-67 sha:abc1…
  FileChanged: auth.py +3/-1 lines · CommandRun: pytest (exit 1) · Decision: "use bcrypt"
        ↓
Tool Execution
  read · edit · bash · grep
  policy checks before exec · stale guards · bounded output
        ↓
Snapshots
  every 500 events · fast warm startup · replay only new events
        ↓
Deterministic Projector
  working set (top 8 files, scored) · open failures (fingerprinted)
  active decisions · progress timeline · tool latency (P95)
  stale detection via SHA256 hash + stat comparison
        ↓
Context Assembler
  token-budgeted · priority-packed · slice-rehydrated from disk
        ↓
Model Turn (fresh context, no history)
  tool calls → events → next turn

Traditional AI Coding vs. Warp Engine

TRADITIONAL (growing history):              WARP ENGINE (computed context):
────────────────────────────                ────────────────────────────
Turn 1:  200 tokens                         Turn 1:  ~8K tokens (assembled)
Turn 2:  600 tokens                         Turn 2:  ~8K tokens (assembled)
Turn 5:  5,000 tokens                       Turn 5:  ~8K tokens (assembled)
Turn 10: 20,000 tokens                      Turn 10: ~8K tokens (assembled)
Turn 20: 50,000+ tokens (stale, noisy)      Turn 20: ~8K tokens (fresh, verified)

Every turn, the model receives a clean projection — which files are active, what changes were made, what's broken, what decisions were taken — rehydrated from disk with SHA256 verification. Not "what we talked about." What's actually true.

Event Types

Eleven structured event types cover every action:

Event            What it captures
───────────────  ──────────────────────────────────────────────────────────
SessionStarted   Working directory, repo root, active tools
UserIntent       Normalised user request
ToolCalled       Tool name, typed input (before execution)
ToolResult       Status, bounded output (after execution)
FileRead         Path, line range, SHA256 hash of content slice
FileChanged      Path, full unified diff, summary (+N/-M lines)
CommandRun       Command, exit code, output excerpt
TestStatus       Pass/fail, error excerpts
Decision         Strategic choice with rationale (supports supersession)
RepoMap          Git file inventory (HEAD, file count, top dirs/extensions)
Custom           Extension-defined events

All events are append-only, bounded to 1MB max payload, and stored in SQLite with WAL mode for concurrent reads/writes.
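
In TypeScript (the project's language), the append-only contract and the payload bound can be sketched as follows. This is a simplified in-memory stand-in for the SQLite table, not the actual implementation; all names are illustrative:

```typescript
// Minimal sketch of an append-only, bounded-payload event log.
const MAX_PAYLOAD_BYTES = 1_000_000; // ~1MB payload bound per event

interface WarpEvent {
  seq: number;     // monotonically increasing position in the log
  type: string;    // e.g. "FileRead", "CommandRun"
  ts: number;      // epoch milliseconds
  payload: string; // JSON-serialised, bounded body
}

class EventLog {
  private events: WarpEvent[] = [];

  append(type: string, body: unknown): WarpEvent {
    let payload = JSON.stringify(body);
    // Enforce the payload bound by truncating oversized bodies.
    if (new TextEncoder().encode(payload).length > MAX_PAYLOAD_BYTES) {
      payload = payload.slice(0, MAX_PAYLOAD_BYTES);
    }
    const ev: WarpEvent = { seq: this.events.length + 1, type, ts: Date.now(), payload };
    this.events.push(ev); // append-only: no update or delete path
    return ev;
  }

  // Replay in insertion order, which is exactly what the projector consumes.
  all(): readonly WarpEvent[] {
    return this.events;
  }
}
```

In the real system the same contract is enforced at the SQLite layer; the point here is that nothing ever mutates or removes a recorded event.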

The Stale Context Guard

This prevents the most common class of AI coding errors: editing a file based on what the model thinks is there rather than what's actually there.

Has the file been read?

If there's no FileRead event for this file, the edit is blocked. The model must read first.

Has the file changed since it was read?

The system compares the file's current mtime and size against the stats recorded at read time. If they differ, the edit is blocked.

Is the read too old?

Default: 2 hours. If the model read the file 3 hours ago and hasn't re-read it, the edit is blocked.

Blocked with rationale

If any check fails, the tool call is rejected: "File changed since last read. Re-read before editing." The block is logged as an event and surfaced in the projection.

Every edit is grounded in a verified, hash-checked read. No hallucinated line numbers. No patches applied to code that moved.
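
The three checks above can be sketched as a single guard function. This is a simplified model; the field and function names are assumptions, not the actual API:

```typescript
// Stale context guard: three checks before any edit is allowed.
interface ReadRecord {
  path: string;
  mtimeMs: number; // file mtime observed at read time
  size: number;    // file size observed at read time
  readAt: number;  // when the read happened (epoch milliseconds)
}

interface FileStat { mtimeMs: number; size: number }

const MAX_READ_AGE_MS = 2 * 60 * 60 * 1000; // default: 2 hours

function checkEditAllowed(
  path: string,
  reads: Map<string, ReadRecord>,
  statNow: FileStat,
  now: number = Date.now(),
): { allowed: boolean; reason?: string } {
  const read = reads.get(path);
  // Check 1: has the file been read at all?
  if (!read) {
    return { allowed: false, reason: "No prior FileRead. Read before editing." };
  }
  // Check 2: has the file changed on disk since it was read?
  if (read.mtimeMs !== statNow.mtimeMs || read.size !== statNow.size) {
    return { allowed: false, reason: "File changed since last read. Re-read before editing." };
  }
  // Check 3: is the read too old?
  if (now - read.readAt > MAX_READ_AGE_MS) {
    return { allowed: false, reason: "Read is too old. Re-read before editing." };
  }
  return { allowed: true };
}
```

Each rejection reason becomes the rationale string logged with the blocking event.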

Failure Fingerprinting

Most agents treat test failures as a blob of text. Warp Engine treats them as structured, trackable entities.

When pytest tests/test_auth.py fails, Warp Engine creates a fingerprint from the command identity and a hash of the error output:

bash:pytest tests/test_auth.py:deadbeef  →  AssertionError (attempt 1)
bash:pytest tests/test_auth.py:cafebabe  →  ImportError (attempt 1)

Different errors get different fingerprints. If the model fixes the AssertionError but introduces an ImportError, both are tracked independently. The attempt counter increments on each failure.

When the command succeeds (exit 0), every fingerprint for that command is resolved at once. The model sees them in "recently resolved" and knows it just fixed two distinct issues.

The model gets something it's never had before: a precise understanding of what's broken, how many times it's tried, and whether it's making progress. Structured state derived from events, not vibes from reading old conversation history.
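
A sketch of the fingerprint lifecycle, modelled on the examples above. The eight-character digest and all names here are assumptions, not the actual implementation:

```typescript
import { createHash } from "node:crypto";

// Fingerprint identity: tool, command, and a short hash of the error output.
function fingerprint(tool: string, command: string, errorOutput: string): string {
  const digest = createHash("sha256").update(errorOutput).digest("hex").slice(0, 8);
  return `${tool}:${command}:${digest}`;
}

interface Failure { fingerprint: string; attempts: number; resolved: boolean }

class FailureTracker {
  private failures = new Map<string, Failure>();

  // Distinct errors for a command get distinct fingerprints; a repeat of the
  // same error increments the attempt counter.
  recordFailure(tool: string, command: string, errorOutput: string): Failure {
    const fp = fingerprint(tool, command, errorOutput);
    const prev = this.failures.get(fp);
    const next: Failure = prev
      ? { ...prev, attempts: prev.attempts + 1 }
      : { fingerprint: fp, attempts: 1, resolved: false };
    this.failures.set(fp, next);
    return next;
  }

  // Exit 0 resolves every open fingerprint for that command at once.
  resolveCommand(tool: string, command: string): string[] {
    const prefix = `${tool}:${command}:`;
    const resolved: string[] = [];
    for (const [fp, f] of this.failures) {
      if (fp.startsWith(prefix) && !f.resolved) {
        this.failures.set(fp, { ...f, resolved: true });
        resolved.push(fp);
      }
    }
    return resolved;
  }
}
```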

Tool Policy Enforcement

Warp Engine doesn't just observe — it enforces discipline. In enforce mode, violations block tool calls. In report mode, they're logged but allowed.

  • Discovery order: must run exact search (grep/rg/ast-grep) before semantic search (morph_grep). Prevents expensive fuzzy searches when a keyword match suffices.
  • Edit order: must try patch-based edit before fast-apply (morph_edit). Fast-apply is a fallback, not the default.
  • Read before write: every edit requires a verified prior read. Prevents edits on unseen files.
  • Read size limits: max 500 lines per read (configurable). Prevents context flooding.
  • Bounded outputs: tool results truncated to 2K lines / 50K chars. The model gets what it needs without drowning in noise.

All policy actions are themselves logged as events, creating an explainable audit trail for every blocked action.
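
The enforce/report split reduces to a small decision. This is an illustrative sketch, not the actual API:

```typescript
// Policy dispatch: every violation is logged as an event in both modes;
// only the blocking behaviour differs.
type PolicyMode = "enforce" | "report";

interface Violation { rule: string; detail: string }

function applyPolicy(
  mode: PolicyMode,
  violation: Violation | null,
  logEvent: (v: Violation) => void,
): { proceed: boolean } {
  if (!violation) return { proceed: true };
  logEvent(violation); // the audit trail records the violation either way
  return { proceed: mode === "report" }; // report: allow; enforce: block
}
```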

The Projection: What the Model Actually Sees

Instead of 50 messages of conversation history, the model receives:

[WARP ENGINE CONTEXT]

Repository: kahunas2 (abc123, 247 files)

Intent: "Add logging to the auth module"

Working Set (3 files):
  src/auth.py (read 2m ago, changed 1m ago)
    Diff: +3/-1 lines (added bcrypt import + hash call)
    Lines 45-67:
      def hash_password(pwd: str) -> str:
          return bcrypt.hashpw(pwd.encode(), bcrypt.gensalt())

  tests/test_auth.py (read 4m ago)
    Lines 12-30: [fresh excerpt from disk]

  src/config.py (read 6m ago, STALE — changed on disk)

Open Failures (1):
  pytest tests/test_auth.py — exit 1 (attempt 2)
    AssertionError: Expected hashed output, got plaintext

Decisions:
  - Use bcrypt for password hashing (industry standard)

Recently Resolved:
  - pytest tests/test_auth.py:cafebabe — ImportError (resolved)

Clean. Current. Grounded in verified facts. The working set contains fresh code excerpts rehydrated from disk — the event log stores metadata and a SHA256 hash, the assembler reads the actual file content at assembly time and verifies it matches. If the hash doesn't match, the file is marked stale. The event log stays small. The assembled context contains real code.
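
The rehydrate-and-verify step can be sketched like this. `readSlice` stands in for the actual disk read, and all names are illustrative:

```typescript
import { createHash } from "node:crypto";

// Recompute the SHA256 of a content slice, as stored in FileRead events.
function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

interface SliceRecord {
  path: string;
  startLine: number;
  endLine: number;
  hash: string; // SHA256 recorded when the slice was read
}

// At assembly time the slice is re-read from disk and checked against the
// recorded hash; a mismatch marks the file stale instead of injecting it.
function rehydrate(
  record: SliceRecord,
  readSlice: (path: string, start: number, end: number) => string,
): { content: string | null; stale: boolean } {
  const current = readSlice(record.path, record.startLine, record.endLine);
  if (sha256(current) !== record.hash) {
    return { content: null, stale: true };
  }
  return { content: current, stale: false };
}
```

This is why the event log stays small: it holds hashes and metadata, never the code itself.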

Token Budgeting

The context assembler operates within a strict token budget computed from model metadata:

Context window (e.g., 200K)
  − reserved output tokens (min(8192, maxTokens × 0.8))
  − reserved system/tool schemas (~12K)
  = available budget (~180K)
    → 60% allocated to dynamic content
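
That arithmetic is small enough to sketch directly. The constants mirror the example above; the actual implementation may differ:

```typescript
// Token budget derivation from model metadata (constants are illustrative).
function computeBudget(contextWindow: number, maxOutputTokens: number): number {
  const reservedOutput = Math.min(8192, maxOutputTokens * 0.8);
  const reservedSchemas = 12_000; // system prompt + tool schemas, approximate
  const available = contextWindow - reservedOutput - reservedSchemas;
  return Math.floor(available * 0.6); // 60% of the budget is dynamic content
}

// For a 200K-token window: 200000 - 8192 - 12000 = 179808 available,
// of which 60% (~108K) is allocated to dynamic content.
```
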

Sections are packed in priority order with dedicated budget shares:

Section                        Budget share       Priority
─────────────────────────────  ─────────────────  ───────────────────────
Header (repo, intent)          Always included    Highest
Open failures                  35% of remaining   Forced — always present
Decisions                      20% of remaining   High
Working set                    75% of remaining   High
Resolved / progress / latency  Remainder          Normal
Footer                         Always included    Highest

Critical items (failures, header, footer) always appear. The greedy packing algorithm is fast and deterministic — not optimal knapsack, but predictable and provider-agnostic.
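
A sketch of greedy priority packing under these rules. It is simplified and assumes sections arrive pre-sorted by priority:

```typescript
// Greedy packing: sections claim tokens in priority order. Forced sections
// (failures, header, footer) are always included; everything else is dropped
// once it no longer fits the remaining budget.
interface Section {
  name: string;
  tokens: number;  // estimated size of the rendered section
  forced: boolean; // always present regardless of budget
}

function pack(sections: Section[], budget: number): string[] {
  const included: string[] = [];
  let remaining = budget;
  for (const s of sections) {
    if (s.forced || s.tokens <= remaining) {
      included.push(s.name);
      remaining -= s.tokens; // forced sections may overdraw; that is accepted
    }
  }
  return included;
}
```

A single pass, no backtracking: that is what makes the result deterministic and cheap to compute on every turn.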

Snapshots and Retention

Snapshots for fast startup

Every 500 events, the full projection state is serialised to disk. On startup, Warp Engine loads the latest snapshot and replays only new events — no need to reprocess thousands of historical events.

Automatic pruning

On session shutdown, events older than 7 days are pruned (keeping a minimum of 2,000 events). The event log stays manageable without manual intervention.
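
The retention rule can be sketched as a filter with a floor. This is illustrative and assumes events are sorted oldest-first:

```typescript
// Drop events older than `days`, but never let the log shrink below `minKeep`.
function pruneEvents<T extends { ts: number }>(
  events: T[],
  days: number,
  minKeep: number,
  now: number = Date.now(),
): T[] {
  const cutoff = now - days * 24 * 60 * 60 * 1000;
  const fresh = events.filter((e) => e.ts >= cutoff);
  // Honour the floor by keeping the newest `minKeep` events instead.
  return fresh.length >= minKeep ? fresh : events.slice(-minKeep);
}
```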

Manual controls

/warp-prune-events supports dry-run, archive to JSONL (optionally gzipped), vacuum, and configurable minimum event floors. Archive before you prune — the full history is preserved if you need it.

Commands

Command             Purpose
──────────────────  ────────────────────────────────────────────────────────────
/warp               Show working set, open failures, decisions, DB path
/warp-dump          Print full projection state as JSON
/warp-context       Show assembled context for the last user message
/warp-snapshot      Force write a snapshot
/warp-prune         Prune snapshot cache (keep N newest)
/warp-prune-events  Prune old events (days, dry-run, archive, vacuum, min-events)
/warp-repo          Rebuild git-backed repo map

What Makes This Different

Constant context size

Turn 3 and turn 30 cost the same. Context is computed fresh, not accumulated. No growing token bills.

Verified state, not remembered state

SHA256 hashes prove the model's knowledge is current. Stale edits are blocked before they happen.

Perfect auditability

Every action is in a queryable SQLite log. Deterministic replay from any point. Full decision trail.

Enforced discipline

Read before write. Search before semantic search. Bounded outputs. The model follows engineering discipline, not suggestions.

Architecture

Warp Engine is built on three properties that reinforce each other:

  • Causality — what changed is auditable. Every event is immutable, timestamped, and traceable.
  • Boundedness — context remains constrained per turn. Token budgets are strict. Payloads are bounded.
  • Recoverability — state can be replayed from the log. Snapshots accelerate startup. Pruning controls growth.

The separation between raw interactions (the event log) and model-visible context (the assembled projection) is architecturally load-bearing. The log is the source of truth. The projection is a derived view. The context is a rendered document. Each layer has a single responsibility and can be evolved independently.

Integration with Pi

Warp Engine hooks into pi's extension lifecycle at six points:

Hook                What Warp Engine does
──────────────────  ────────────────────────────────────────────────────────────
session_start       Initialise session, build repo map from git ls-files
before_agent_start  Inject hidden control message with budget metadata
tool_call           Policy enforcement (discovery order, stale guard, read
                    limits); capture pre-images for diff computation
tool_result         Record events (ToolResult, FileRead, FileChanged,
                    CommandRun); bound tool outputs
context             Replace conversation history with assembled warp context
                    + current turn tail
session_shutdown    Write snapshot, prune old events

The host agent remains agnostic. Enabling or disabling Warp Engine changes behaviour without touching core agent code.

Tech Stack

Component         Technology
────────────────  ──────────────────────────────────────────────────────────────
Language          TypeScript (~1,600 lines across 4 core modules)
Storage           SQLite (node:sqlite) with WAL mode
Hashing           SHA256 for content verification
Diffing           Unix diff -u for change tracking
Token estimation  Deterministic heuristic (bytes / 3.2) — no provider dependency
Integration       Pi extension hooks (lifecycle events)
Config            .pi/warp-engine/config.json
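
The bytes / 3.2 token heuristic is simple enough to sketch in full (the rounding behaviour here is an assumption):

```typescript
// Deterministic token estimate: UTF-8 byte count divided by 3.2, rounded up.
// No tokenizer or provider dependency required.
function estimateTokens(text: string): number {
  return Math.ceil(new TextEncoder().encode(text).length / 3.2);
}
```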

Configuration

{
  "enabled": true,
  "policyMode": "enforce",
  "maxReadLines": 500,
  "requireReadBeforeWrite": true,
  "maxReadAgeMs": 7200000,
  "enforceDiscoveryOrder": true,
  "enforceEditOrder": true,
  "minEventsToKeep": 2000
}

Set policyMode to "report" to log violations without blocking. Set "enabled": false to disable per-project.

The Core Insight

Chat history is an implementation detail, not the memory. Every AI coding tool treats the growing conversation buffer as the model's memory because it's the cheapest approximation of one. But it's lossy, grows without bound, goes stale, and can't be queried, replayed, or verified.

Warp Engine replaces the approximation with the real thing: an append-only event log that captures what actually happened, a deterministic projector that computes what's true right now, and a context assembler that renders exactly what the model needs to see — fresh, bounded, and grounded in hash-verified facts.

The conversation was never the memory. The events are.
