Agentmemory — Persistent memory for AI coding agents

Introduction: A New Kind of Coding Memory

agentmemory is a persistent memory layer for AI coding agents: it lets an agent carry what it learned in one session into the next. It removes the re-explanation loop and keeps your preferences, bugs, and architectural decisions available across sessions. Built on the iii engine, agentmemory provides a single memory server that can be wired to Claude Code, GitHub Copilot CLI, Cursor, Gemini CLI, Codex CLI, Hermes, OpenClaw, pi, OpenHuman, OpenCode, and many other MCP clients. In short: one memory server, many agents.

Banner and quick navigation

This project is powered by iii and ships with a full suite of integration guides, quick starts, and real-time viewing tools.
The banner image above sets the stage for a memory-driven approach to coding assistants.

Overview: What agentmemory Does

Persistent memory across sessions: agentmemory quietly captures what your agent does, compresses it into searchable memory, and injects the right context when the next session starts. No re-teaching needed.
Cross-agent memory sharing: one memory server serves every agent that supports MCP or REST API. Memories are shared across all wired tools and clients.
Twelve hooks, 53 MCP tools, and eight skills: agentmemory comes with a rich toolkit that lets agents remember, recall, and reason about past work without manual intervention.
Real-time visibility: a built-in viewer and iii console provide live insight into what the agent remembers and how it processes memory operations.

Images that illustrate the vision

Banner: agentmemory banner image (banner.png) reinforces the idea of persistent memory.
Demo and statistics: a quick glance at the demo, retrieval metrics, and tooling counts shows the practical benefits.
Real-time view: the live viewer and console images illustrate how memory is captured, indexed, and accessed.

Works with Every Agent: A Universal Memory Layer

agentmemory is designed to work with any agent that supports hooks, MCP, or a REST API. It presents a single memory server that all compatible agents can share.

Key compatible agents and integrations

Claude Code: native plugin with 12 hooks and MCP support for seamless memory integration.
Codex CLI: native plugin that exposes a comprehensive set of memory tools via MCP and plugin hooks.
GitHub Copilot CLI: MCP-based integration with a plugin that wires Copilot to the memory server.
OpenClaw: native plugin with MCP support and memory tools wired in.
Hermes: native plugin with memory tooling integrated.
pi: native plugin with a memory-backed workflow.
OpenHuman: a Memory trait backend that connects to the memory server.
Cursor: MCP server that connects to agentmemory for live memory and recall.
Gemini CLI: memory-capable MCP server connection for rapid recall.
OpenCode: 22 hooks plus MCP and plugin support to tap into memory.
Cline: MCP server integration for memory-backed sessions.
Goose, Kilo Code, Aider, Windsurf, Roo Code, Warp, and more: covered through standard MCP blocks and plugin-based wiring.

What changes with agentmemory

Example: you set up JWT authentication in one session. In later sessions, the agent already has that session's context, test results, and your preferences, so you do not explain them again.
Agent isolation modes are set via AGENTMEMORY_AGENT_SCOPE, which enables multi-agent setups where memory is scoped by agent identity.
Memory is persistent, structured, and searchable across sessions, so the agent can recall history, test results, and preferences without re-teaching.

Quick Start: Try It in 30 Seconds

Install the memory server globally:
npm install -g @agentmemory/agentmemory
Start the memory server and seed data:
npx @agentmemory/agentmemory
npx @agentmemory/agentmemory demo
Wire a target agent (for example Claude Code):
agentmemory connect claude-code
Optional: install native skills to empower your agent with tool-use awareness:
npx skills add rohitg00/agentmemory -y
Alternative: use npx for a quick run without global install:
npx @agentmemory/agentmemory
Visit the real-time viewer at http://localhost:3113 to watch memory indexing in action.

Important setup notes

If you used an older release with npx, you may need to force the latest version:
npx -y @agentmemory/agentmemory@latest
If you run into caching issues on macOS/Linux, clearing the npx cache can help:
rm -rf ~/.npm/_npx

Session Replay and Memory Lifecycle

Every session you record is replayable in the viewer. The Replay tab lets you scrub through prompts, tool calls, and tool results, with play/pause, speed control, and keyboard shortcuts.
Import existing Claude Code JSONL transcripts to the viewer to enrich memory with prior work.
The memory lifecycle features four tiers of consolidation: Working, Episodic, Semantic, and Procedural memory. This mirrors human memory: raw observations, session summaries, extracted patterns, and repeatable workflows.

Benchmarks and Performance Highlights

Retrieval accuracy and recall
In-house experiments show a strong retrieval signal for historical coding work, with high precision in top matches.
The agentmemory hybrid approach achieves high recall and precise top-K matching, improving the usefulness of retrieved memories for subsequent sessions.
Token efficiency and cost
Agentmemory demonstrates significant token savings by compressing and indexing memories rather than pushing raw context into every prompt.
Local embeddings and BM25-based retrieval complement dense embeddings to provide robust performance with lower operational costs.
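To make the idea of combining keyword and embedding retrieval concrete, here is a minimal sketch of one common fusion technique, reciprocal rank fusion (RRF). The source does not say which fusion method agentmemory uses, so treat RRF, the constant `k=60`, and the memory IDs below as assumptions chosen for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of memory IDs into one ranking.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))

bm25_hits = ["m3", "m1", "m7"]    # hypothetical keyword (BM25) ranking
vector_hits = ["m1", "m9", "m3"]  # hypothetical embedding ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

In this toy input, `m1` and `m3` appear in both lists and therefore outrank the IDs that only one retriever found.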
Integrations and test coverage
The system ships with more than 50 native agents via the skills framework, and 53 MCP tools when connected to a live memory server.
A comprehensive battery of tests (950+ tests) ensures stability across MCP interactions and memory operations.

vs Competitors: Why agentmemory Stands Out

agentmemory combines a memory engine with an MCP server, offering a unified memory layer that works across all agents and adapters.
It delivers higher R@5 retrieval rates and more complete cross-agent memory sharing, compared with alternatives that offer only partial memory or rely on manual memory editing.
The four-tier memory consolidation model (Working, Episodic, Semantic, Procedural) provides a richer and more adaptable memory structure than simple sticky notes or single-context memories.
Real-time viewer and iii console provide comprehensive observability into memory operations, traces, and workflows.

How It Works: The Memory Pipeline

Memory pipeline overview
The PostToolUse hook fires after each tool call: observations are deduplicated with SHA-256, privacy-filtered to strip secrets, and stored raw.
Compression and extraction of structured facts, concepts, and narrative.
Embedding creation using multiple providers (local and external) and indexing via BM25 and vector search.
Session-level summarization and knowledge-graph extraction when enabled.
The SessionStart hook loads project profiles and runs a hybrid search (BM25 + vector + graph) to recall the most relevant context.
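The first pipeline step (deduplicate, filter, store) can be sketched as follows. This is an illustration under stated assumptions, not agentmemory's implementation: the class name, the two regex patterns, and the `[REDACTED]` marker are all hypothetical, and a real privacy filter would need to cover many more secret formats.

```python
import hashlib
import re

# Hypothetical patterns for illustration only.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),  # API-key-like tokens
    re.compile(r"(password|secret|token)\s*=\s*\S+", re.IGNORECASE),
]

def strip_secrets(text):
    """Replace anything matching a secret pattern with a marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

class ObservationStore:
    """In-memory stand-in for raw observation storage."""

    def __init__(self):
        self._seen = set()
        self.observations = []

    def add(self, raw):
        """Store an observation once; return False for duplicates."""
        # 1. Deduplicate on the SHA-256 digest of the raw observation.
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        # 2. Strip secrets before anything is kept.
        # 3. Store the filtered observation.
        self.observations.append(strip_secrets(raw))
        return True

store = ObservationStore()
store.add("ran tests, password=hunter2")
store.add("ran tests, password=hunter2")  # duplicate, ignored
print(store.observations)
```

Only the digest of the raw text is kept for deduplication; the stored observation is the filtered one.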
The 4-tier consolidation model
Working: Real-time raw observations from tool usage.
Episodic: Compressed session summaries.
Semantic: Extracted facts and patterns.
Procedural: Workflows and decision patterns.
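One way to picture the four tiers is as a tag on each memory record, with consolidation moving information from one tier to the next. The sketch below is hypothetical: the `Memory` record, its fields, and the `summarize_session` helper are invented for illustration, and joining texts merely stands in for the real compression step, which the source does not specify in detail.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    WORKING = "working"        # raw observations from tool usage
    EPISODIC = "episodic"      # compressed session summaries
    SEMANTIC = "semantic"      # extracted facts and patterns
    PROCEDURAL = "procedural"  # workflows and decision patterns

@dataclass
class Memory:
    tier: Tier
    text: str
    session_id: str

def summarize_session(memories, session_id):
    """Collapse one session's working memories into one episodic memory."""
    texts = [
        m.text
        for m in memories
        if m.tier is Tier.WORKING and m.session_id == session_id
    ]
    # Placeholder for real compression: just join the raw texts.
    return Memory(Tier.EPISODIC, " | ".join(texts), session_id)

working = [
    Memory(Tier.WORKING, "edited auth.ts", "s1"),
    Memory(Tier.WORKING, "ran jwt tests", "s1"),
    Memory(Tier.WORKING, "unrelated work", "s2"),
]
episode = summarize_session(working, "s1")
print(episode)
```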
Privacy-first approach
Secrets and keys are stripped before storage; sensitive data is protected while retaining useful memory signals.

Key Capture Points: What gets stored

SessionStart: project path, session identifier.
UserPromptSubmit: user prompts (privacy-filtered).
PreToolUse: file access patterns and enriched context.
PostToolUse: tool name, inputs, outputs.
PostToolUseFailure: error context.
PreCompact: memory re-injection before compaction.
SubagentStart/Stop: sub-agent lifecycles.
Stop: end-of-session summary.
SessionEnd: overall session completion signal.
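The capture points above amount to a per-hook allowlist of what gets recorded. A minimal sketch of that idea follows; the hook names come from the list above, but the payload field names (`project_path`, `tool_input`, and so on) are assumptions, not agentmemory's actual schema.

```python
# Hook name -> fields kept from the hook payload (field names are hypothetical).
CAPTURED_FIELDS = {
    "SessionStart": ("project_path", "session_id"),
    "UserPromptSubmit": ("prompt",),
    "PostToolUse": ("tool_name", "tool_input", "tool_output"),
    "PostToolUseFailure": ("error",),
}

def capture(hook, payload):
    """Return only the fields recorded for this hook, or None if unknown."""
    fields = CAPTURED_FIELDS.get(hook)
    if fields is None:
        return None
    record = {"hook": hook}
    for field in fields:
        record[field] = payload.get(field)
    return record

event = {
    "tool_name": "Bash",
    "tool_input": "ls",
    "tool_output": "a.txt",
    "cwd": "/tmp",  # not in the allowlist, so it is dropped
}
record = capture("PostToolUse", event)
print(record)
```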

MCP Server: The 53-Tool Memory Toolkit

53 tools, 6 resources, 3 prompts, and 8 skills
Core tools (always available): memory_recall, memory_save, memory_smart_search, memory_sessions, memory_export, memory_audit, memory_governance_delete, and more
Extended tools (when connected to a live memory server): graph_query, memory_consolidate, memory_claude_bridge_sync, memory_team_share, memory_team_feed, memory_snapshot_create, memory_action_create, memory_action_update, and many more
Standalone MCP: a lightweight shim for agents that cannot reach the host, enabling a consistent MCP interface with a local fallback set of tools
Cross-agent sharing: MCP enables a shared memory layer so that multiple agents can benefit from a single, persistent memory store
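For readers new to MCP, a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for the recall tool listed above, written here as `memory_recall` on the assumption that the tool names use underscores. The `query` argument is hypothetical; the real argument schema is whatever the server advertises for that tool.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# "query" is an assumed argument name, used only for illustration.
request = build_tool_call(1, "memory_recall", {"query": "JWT auth setup"})
print(request)
```

An MCP client library would normally build and send this message for you; the point here is only to show what travels over the wire.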

Real-Time Viewer and iii Console: Observability in Action

Real-Time Viewer
Auto-starts on port 3113; live memory stream, session explorer, memory browser, and knowledge graph visualization
The viewer's REST endpoint requires a Bearer token when AGENTMEMORY_SECRET is set
iii Console
The console provides a live, architectural view of memory operations, including traces, flows, and function invocations
Two windows: one for memory state and one for the iii engine, making it easy to debug memory operations
The console traces a memory_smart_search call through BM25 scans, embedding lookups, and re-ranking in a waterfall visualization
The Workers page shows connected workers, including agentmemory, with metadata such as PID and runtime
The Traces page provides a waterfall or flame view for per-span durations
Visual cues
The console and viewer work together to provide a complete picture of memory operations and their impact on agent behavior

Powered by iii

agentmemory is already a running iii instance
Three primitives define the runtime: worker, function, trigger
iii’s ecosystem replaces conventional stacks (Express/Fastify, Redis, Postgres, etc.) with unified primitives
Observability and state management come from iii-state, iii-stream, and iii-observability workers
Extending agentmemory is as simple as adding an iii worker, for example:
iii worker add iii-pubsub
iii worker add iii-cron
iii worker add iii-queue
iii worker add iii-observability
iii worker add iii-sandbox
iii worker add iii-database
iii worker add mcp
This approach means you don’t need separate monitoring, queues, or database layers—the iii components handle it all

Configuration and Local Models: Flexible, Local, and Cost-Aware

LLM providers
agentmemory auto-detects providers and supports a local no-LLM mode by default
Local models (Ollama, LM Studio, vLLM, or llama.cpp) can be used to run compression and reasoning on-device
Local models and selection
The guide includes recommended models for memory tasks: qwen2.5-coder:7b, llama3.2:3b, mistral:7b-instruct, and others
Cost-aware model selection helps balance performance and price for memory compression and reasoning
Embeddings
Local embeddings are shipped with the system via @xenova/transformers for offline operation
Options include Gemini embeddings, OpenAI embeddings, and other providers depending on configuration
AGENTMEMORY settings
AGENTMEMORY_URL and AGENTMEMORY_SECRET control remote access and security
AGENTMEMORY_TOOLS controls which tools are exposed (the core set, or all 53)
Graph extraction and consolidation toggles allow you to tune memory behavior
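A client that talks to the memory server would typically read these variables once at startup. The following is a sketch under assumptions: the variable names match the settings above, but the `Settings` class, the default URL (built from the REST port mentioned in this post), and the `"core"` default for tools are illustrative guesses rather than documented defaults.

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Settings:
    url: str
    secret: Optional[str]
    tools: str

def load_settings(env=None):
    """Read agentmemory settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        # Assumed default: local server on the REST port.
        url=env.get("AGENTMEMORY_URL", "http://localhost:3111"),
        # An empty string is treated the same as "not set".
        secret=env.get("AGENTMEMORY_SECRET") or None,
        # Assumed default and value names ("core" vs "all").
        tools=env.get("AGENTMEMORY_TOOLS", "core"),
    )

settings = load_settings({"AGENTMEMORY_SECRET": "s3cret"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test.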
Multi-agent memory with AGENT_ID
AGENT_ID and AGENTMEMORY_AGENT_SCOPE provide two modes: shared (default) or isolated for strict separation
Isolated mode filters recall results by agent tag to prevent cross-agent leakage
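The difference between the two modes can be shown with a small filter over recall results. This is a sketch, not the server's code: the `agent` key on each result and the function name are hypothetical, while the mode names `shared` and `isolated` come from the description above.

```python
def filter_recall(results, agent_id, scope="shared"):
    """Apply agent scoping to a list of recall results.

    Each result is a dict carrying a hypothetical "agent" tag.
    """
    if scope == "isolated":
        # Keep only memories written by this agent.
        return [r for r in results if r.get("agent") == agent_id]
    # Shared mode: every agent sees every memory.
    return list(results)

results = [
    {"text": "uses pnpm", "agent": "claude-code"},
    {"text": "prefers tabs", "agent": "cursor"},
]
shared = filter_recall(results, "cursor")
isolated = filter_recall(results, "cursor", scope="isolated")
print(len(shared), len(isolated))
```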

API and Endpoints: A Rich REST Surface

125 endpoints on port 3111
Key endpoints include health checks, session management, memory capture, hybrid search, context generation, recall, remember, forget, enrich, and graph queries
Security: endpoints can be protected with Authorization: Bearer tokens when AGENTMEMORY_SECRET is set
The REST API exposes a comprehensive suite for programmatic access to memory operations, session management, and graph queries
Documentation references point to the codebase for the full endpoint list (src/triggers/api.ts)
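As a sketch of what an authenticated call looks like from a client, the snippet below builds (but does not send) an HTTP request with the Bearer header. Port 3111 comes from this section; the `/health` path and the secret value are placeholders, since the actual routes live in the endpoint list referenced above.

```python
import urllib.request

def build_request(base_url, path, secret=None):
    """Build a GET request, adding a Bearer token when a secret is given."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    if secret:
        request.add_header("Authorization", f"Bearer {secret}")
    return request

# "/health" is a hypothetical path used only for illustration.
req = build_request("http://localhost:3111", "/health", secret="s3cret")
print(req.full_url, req.get_header("Authorization"))
# To send it against a running server: urllib.request.urlopen(req)
```

When no secret is configured, the same helper produces an unauthenticated request, which mirrors the "protected only when AGENTMEMORY_SECRET is set" behavior described above.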

Deployment: One-Click Templates and Cloud Workflows

One-click deployment templates for managed hosts
Each template ships a self-contained Dockerfile that pulls agentmemory from npm and includes the iii engine
Persistent storage is mounted at /data; the startup script configures a deploy-tuned memory server
Numerous deployment options:
Fly.io: one-click deploy
Railway: deploy template
Render: blueprint flow with automatic disk snapshots
Coolify: self-hosted on a VPS with the same Docker Compose stack
The deployment templates ensure that only port 3113 (viewer) is exposed to the host, while internal endpoints remain secure

Why agentmemory Matters: A Practical Perspective

The fundamental problem: most coding agents forget everything at session end
The memory layer flips this paradigm by providing a robust, searchable, and versioned memory graph
The memory pipeline captures, compresses, and indexes information so the agent can recall context, test results, and design decisions
Built-in memory (CLAUDE.md, notepads, etc.) is replaced by a scalable memory graph that grows with use
The result: sessions flow more smoothly, you spend less time re-explaining context, and the agent's recall improves as the memory graph grows

What You Get with agentmemory

A persistent memory layer for every supported agent
A unified, scalable, and auditable memory graph
Real-time visibility into memory activity via a viewer and console
A robust MCP toolkit with 53 tools and 8 skills
A path to broader automation and multi-agent collaboration through agent-scoped memory
Local and cost-aware model options to reduce reliance on cloud services

Licensing and Credits

Agentmemory is released under the Apache-2.0 license
It is designed to be used with iii and a wide range of MCP-compliant or REST-based agents
The project emphasizes openness, reproducibility, and collaboration across the AI coding agent ecosystem

Sections and Visual Aids: Quick Reference to the Figures

Banner: The banner image (banner.png) sets the tone for persistent memory across sessions.
Quick Start and Section Headers
The section icons (section-quickstart.svg, section-benchmarks.svg, section-competitors.svg, etc.) help readers navigate the document quickly.
Real-Time Viewer and Traces
Real-Time Viewer image (assets/tags/section-viewer.svg) introduces the viewer’s capabilities.
iii console traces (assets/iii-console/traces-waterfall.png) illustrate the memory operation flow and performance visualization.
Workers and Live Stats
Agents and workers are visible in the “Workers page” image (assets/iii-console/workers.png).
Architectural Perspective
The “Powered by iii” section showcases the architecture, with a dedicated design diagram (assets/tags/section-architecture.svg).

Conclusion: A Future-Proof Memory Layer for AI Coding Agents

agentmemory transforms the way developers and AI assistants collaborate. By providing a persistent, searchable, and multi-agent memory system, it eliminates the repetitive, error-prone cycle of re-explaining context and preferences. It enables memory-driven workflows, where past decisions, tests, and patterns inform future actions. All agents share the same memory, so improvements in one workflow flow through the entire ecosystem. With transparent real-time viewing, robust MCP tooling, and flexible local-model options, agentmemory is well positioned to become the standard memory layer for AI coding agents in production environments.

Appendix: Quick Reference Commands

Install and run the memory server
npm install -g @agentmemory/agentmemory
npx @agentmemory/agentmemory
Seed and demo sessions
npx @agentmemory/agentmemory demo
Connect an agent (example: Claude Code)
agentmemory connect claude-code
Global install and maintenance
npm install -g @agentmemory/agentmemory
agentmemory stop
agentmemory remove
agentmemory upgrade
Local environment and security
Create the env file at ~/.agentmemory/.env and customize keys
AGENTMEMORY_URL to point to a running server when using a local or remote host

Images throughout the post

Banner: agentmemory banner image
Demo and stats: [assets/demo.gif], [assets/tags/stat-recall.svg], [assets/tags/stat-tokens.svg], [assets/tags/stat-tools.svg], [assets/tags/stat-hooks.svg], [assets/tags/stat-deps.svg], [assets/tags/stat-tests.svg]
Real-time viewer: [assets/tags/section-viewer.svg], [assets/iii-console/workers.png], [assets/iii-console/traces-waterfall.png]
Architecture and design: [assets/tags/section-architecture.svg], [assets/tags/section-how.svg], [assets/tags/section-config.svg]


GitHub - rohitg00/agentmemory: agentmemory — Persistent memory for AI coding agents
