agentmemory — Persistent memory for AI coding agents
Introduction: A New Kind of Coding Memory
In the fast-evolving world of AI coding agents, memory matters more than ever. agentmemory is a persistent memory layer designed for AI coding agents, so your agent remembers everything between sessions. It eliminates the perennial re-explanation loop and keeps your preferences, bugs, and architectural decisions alive across sessions. Built on the iii engine, agentmemory provides a unified memory server that can be wired to Claude Code, GitHub Copilot CLI, Cursor, Gemini CLI, Codex CLI, Hermes, OpenClaw, pi, OpenHuman, OpenCode, and many MCP clients. In short: one memory server, many agents, limitless continuity.
Banner and quick navigation
- This project is powered by iii and ships with a full suite of integration guides, quick starts, and real-time viewing tools.
- The banner image above sets the stage for a memory-driven approach to coding assistants.
Overview: What agentmemory Does
- Persistent memory across sessions: agentmemory quietly captures what your agent does, compresses it into searchable memory, and injects the right context when the next session starts. No re-teaching needed.
- Cross-agent memory sharing: one memory server serves every agent that supports MCP or REST API. Memories are shared across all wired tools and clients.
- Twelve hooks, 53 MCP tools, and eight skills: agentmemory comes with a rich toolkit that lets agents remember, recall, and reason about past work without manual intervention.
- Real-time visibility: a built-in viewer and iii console provide live insight into what the agent remembers and how it processes memory operations.
Images that illustrate the vision
- Banner: agentmemory banner image (banner.png) reinforces the idea of persistent memory.
- Demo and statistics: a quick glance at the demo, retrieval metrics, and tooling counts show practical benefits.
- Real-time view: the live viewer and console images illustrate how memory is captured, indexed, and accessed.
Works with Every Agent: A Universal Memory Layer
agentmemory is designed to work with any agent that supports hooks, MCP, or REST API. It presents a single memory server that all compatible agents can share.
Key compatible agents and integrations
- Claude Code: native plugin with 12 hooks and MCP support for seamless memory integration.
- Codex CLI: native plugin that exposes a comprehensive set of memory tools via MCP and plugin hooks.
- GitHub Copilot CLI: MCP-based integration with a plugin that wires Copilot to the memory server.
- OpenClaw: native plugin with MCP support and memory tools wired in.
- Hermes: native plugin with memory tooling integrated.
- pi: native plugin with a memory-backed workflow.
- OpenHuman: a Memory trait backend that connects to the memory server.
- Cursor: MCP server that connects to agentmemory for live memory and recall.
- Gemini CLI: memory-capable MCP server connection for rapid recall.
- OpenCode: 22 hooks plus MCP and plugin support to tap into memory.
- Cline: MCP server integration for memory-backed sessions.
- Goose, Kilo Code, Aider, Windsurf, Roo Code, Warp, and more: broad coverage across many agents through standard MCP blocks and plugin-based extension.
What changes with agentmemory
- If you set up JWT authentication in one session, later sessions start with that context, the related tests, and your preferences already in memory.
- The system supports agent isolation modes via AGENTMEMORY_AGENT_SCOPE, enabling multi-agent setups where memory is scoped by agent identity.
- Memory is persistent, structured, and searchable across sessions, so the agent can recall history, test results, and preferences without re-teaching.
Quick Start: Try It in 30 Seconds
- Install the memory server globally:
  - npm install -g @agentmemory/agentmemory
- Start the memory server and seed data:
  - npx @agentmemory/agentmemory
  - npx @agentmemory/agentmemory demo
- Wire a target agent (for example Claude Code):
  - agentmemory connect claude-code
- Optional: install native skills to empower your agent with tool-use awareness:
  - npx skills add rohitg00/agentmemory -y
- Alternative: use npx for a quick run without global install:
  - npx @agentmemory/agentmemory
- Visit the real-time viewer at http://localhost:3113 to watch memory indexing in action.
Important setup notes
- If you used an older release with npx, you may need to force the latest version:
  - npx -y @agentmemory/agentmemory@latest
- If you run into caching issues on macOS/Linux, clearing the npx cache can help:
  - rm -rf ~/.npm/_npx
Session Replay and Memory Lifecycle
- Every session you record is replayable in the viewer. The Replay tab lets you scrub through prompts, tool calls, and tool results, with play/pause, speed control, and keyboard shortcuts.
- Import existing Claude Code JSONL transcripts to the viewer to enrich memory with prior work.
- The memory lifecycle features four tiers of consolidation: Working, Episodic, Semantic, and Procedural memory. This mirrors human memory: raw observations, session summaries, extracted patterns, and repeatable workflows.
Benchmarks and Performance Highlights
- Retrieval accuracy and recall
  - In-house experiments show a strong retrieval signal for historical coding work, with high precision in top matches.
  - The agentmemory hybrid approach achieves high recall and precise top-K matching, improving the usefulness of retrieved memories for subsequent sessions.
- Token efficiency and cost
  - agentmemory demonstrates significant token savings by compressing and indexing memories rather than pushing raw context into every prompt.
  - Local embeddings and BM25-based retrieval complement dense embeddings to provide robust performance with lower operational costs.
- Integrations and test coverage
  - The system ships with more than 50 native agents via the skills framework, and 53 MCP tools when connected to a live memory server.
  - A battery of 950+ tests covers MCP interactions and memory operations.
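The post says BM25 and embedding-based retrieval are combined but does not say how the two result lists are merged. One common technique is reciprocal rank fusion (RRF); the sketch below is an illustration of that general idea under the assumption that each retriever returns an ordered list of memory IDs. The function name, the constant k=60, and the sample IDs are all hypothetical and not taken from agentmemory.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of memory IDs into a single ranking.

    Each ID earns 1 / (k + rank) for every list it appears in, where rank
    starts at 1. IDs that rank well in several lists rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break alphabetically for determinism.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


# Hypothetical result lists from a keyword retriever and a vector retriever.
bm25_hits = ["jwt-setup", "auth-tests", "db-migration"]
vector_hits = ["auth-tests", "jwt-setup", "api-routes"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Items found by both retrievers ("jwt-setup" and "auth-tests" here) outrank items found by only one, which is the usual motivation for combining keyword and dense search.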
vs Competitors: Why agentmemory Stands Out
- agentmemory combines a memory engine with an MCP server, offering a unified memory layer that works across all agents and adapters.
- It delivers higher R@5 retrieval rates and more complete cross-agent memory sharing, compared with alternatives that offer only partial memory or rely on manual memory editing.
- The four-tier memory consolidation model (Working, Episodic, Semantic, Procedural) provides a richer and more adaptable memory structure than simple sticky notes or single-context memories.
- Real-time viewer and iii console provide comprehensive observability into memory operations, traces, and workflows.
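For readers unfamiliar with the R@5 metric mentioned above: recall at k is the fraction of the relevant items that appear in the top k retrieved results. The following is a minimal sketch of the standard definition, not agentmemory's benchmark harness; the function name and the sample IDs are made up for illustration.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Return the share of relevant items found in the top-k retrieved list."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant must contain at least one item")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical query: three memories are relevant, two show up in the top 5.
retrieved = ["m3", "m7", "m1", "m9", "m4", "m2"]
relevant = {"m1", "m2", "m3"}
score = recall_at_k(retrieved, relevant, k=5)
```

In this example "m2" is retrieved only at position 6, so it does not count toward R@5.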
How It Works: The Memory Pipeline
- Memory pipeline overview
  - The PostToolUse hook triggers deduplication with SHA-256, privacy filtering to strip secrets, and storage of raw observations.
  - Observations are compressed, and structured facts, concepts, and narrative are extracted.
  - Embeddings are created using multiple providers (local and external) and indexed via BM25 and vector search.
  - Session-level summarization and knowledge-graph extraction run when enabled.
  - The SessionStart hook loads project profiles and runs a hybrid search (BM25 + vector + graph) to recall the most relevant context.
- The 4-tier consolidation model
  - Working: Real-time raw observations from tool usage.
  - Episodic: Compressed session summaries.
  - Semantic: Extracted facts and patterns.
  - Procedural: Workflows and decision patterns.
- Privacy-first approach
  - Secrets and keys are stripped before storage; sensitive data is protected while retaining useful memory signals.
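The first pipeline step (SHA-256 deduplication, then secret stripping, then storage) can be sketched roughly as follows. This is an illustration of the idea only: the class, the regular expressions, and the record layout are assumptions made for the example, and a real privacy filter would need to cover far more secret formats than these two patterns.

```python
import hashlib
import re

# Hypothetical patterns, not agentmemory's actual filter rules.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    re.compile(r"Bearer\s+[A-Za-z0-9._\-]+"),
]


def strip_secrets(text):
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


class ObservationStore:
    """In-memory stand-in for the raw observation store."""

    def __init__(self):
        self._seen = set()
        self.observations = []

    def capture(self, tool_name, output):
        """Store one tool result; return False if it is an exact duplicate."""
        raw = f"{tool_name}\n{output}".encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()
        if digest in self._seen:
            return False  # identical observation already captured
        self._seen.add(digest)
        self.observations.append(
            {"tool": tool_name, "output": strip_secrets(output), "sha256": digest}
        )
        return True


store = ObservationStore()
first = store.capture("Bash", "export API_KEY=abc123")
second = store.capture("Bash", "export API_KEY=abc123")
```

Hashing before filtering follows the order given in the list above; only the filtered text is kept in the stored record.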
Key Capture Points: What gets stored
- SessionStart: project path, session identifier.
- UserPromptSubmit: user prompts (privacy-filtered).
- PreToolUse: file access patterns and enriched context.
- PostToolUse: tool name, inputs, outputs.
- PostToolUseFailure: error context.
- PreCompact: memory re-injection before compaction.
- SubagentStart/Stop: sub-agent lifecycles.
- Stop: end-of-session summary.
- SessionEnd: overall session completion signal.
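One way to picture these capture points is as a small event record keyed by hook name. The sketch below simply restates the list above as a lookup table and rejects unknown hook names; the HookEvent class and its fields are hypothetical and do not describe agentmemory's actual storage schema.

```python
from dataclasses import dataclass, field
from typing import Any

# Hook names and what each one captures, as listed in this section.
CAPTURE_POINTS = {
    "SessionStart": "project path, session identifier",
    "UserPromptSubmit": "user prompts (privacy-filtered)",
    "PreToolUse": "file access patterns and enriched context",
    "PostToolUse": "tool name, inputs, outputs",
    "PostToolUseFailure": "error context",
    "PreCompact": "memory re-injection before compaction",
    "SubagentStart": "sub-agent lifecycle (start)",
    "SubagentStop": "sub-agent lifecycle (stop)",
    "Stop": "end-of-session summary",
    "SessionEnd": "overall session completion signal",
}


@dataclass
class HookEvent:
    """A single captured event (illustrative shape only)."""

    hook: str
    session_id: str
    payload: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        if self.hook not in CAPTURE_POINTS:
            raise ValueError(f"unknown hook: {self.hook}")


event = HookEvent("PostToolUse", "session-1", {"tool_name": "Read"})
```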
MCP Server: The 53-Tool Memory Toolkit
- 53 tools, 6 resources, 3 prompts, and 8 skills
- Core tools (always available): a core set including memory_recall, memory_save, memory_smart_search, memory_sessions, memory_export, memory_audit, memory_governance_delete, and more
- Extended tools (when connected to a live memory server): graph_query, memory_consolidate, memory_claude_bridge_sync, memory_team_share, memory_team_feed, memory_snapshot_create, memory_action_create, memory_action_update, and many more
- Standalone MCP: a lightweight shim for agents that cannot reach the host, enabling a consistent MCP interface with a local fallback set of tools
- Cross-agent sharing: MCP enables a shared memory layer so that multiple agents can benefit from a single, persistent memory store
Real-Time Viewer and iii Console: Observability in Action
- Real-Time Viewer
  - Auto-starts on port 3113 and provides a live memory stream, session explorer, memory browser, and knowledge graph visualization
  - Its REST-served endpoint requires a Bearer token when an agentmemory secret is set
- iii Console
  - The console provides a live, architectural view of memory operations, including traces, flows, and function invocations
  - Two windows: one for memory state and one for the iii engine, making it easy to debug memory operations
  - The console shows a memory_smart_search call flowing through BM25 scans, embedding lookups, and re-ranker steps in a waterfall visualization
  - The Workers page shows connected workers, including agentmemory, with metadata such as PID and runtime
  - The Traces page provides a waterfall or flame view for per-span durations
- Visual cues
  - The console and viewer work together to provide a complete picture of memory operations and their impact on agent behavior
Powered by iii: Architecture That Replaces Traditional Stack Components
- agentmemory is already a running iii instance
- Three primitives define the runtime: worker, function, trigger
- iii’s ecosystem replaces conventional stacks (Express/Fastify, Redis, Postgres, etc.) with unified primitives
- Observability and state management come from iii-state, iii-stream, and iii-observability workers
- Extending agentmemory is as simple as adding an iii worker, for example:
  - iii worker add iii-pubsub
  - iii worker add iii-cron
  - iii worker add iii-queue
  - iii worker add iii-observability
  - iii worker add iii-sandbox
  - iii worker add iii-database
  - iii worker add mcp
- This approach means you don’t need separate monitoring, queues, or database layers—the iii components handle it all
Configuration and Local Models: Flexible, Local, and Cost-Aware
- LLM providers
  - agentmemory auto-detects providers and supports a local no-LLM mode by default
  - Local models (Ollama, LM Studio, vLLM, or llama.cpp) can be used to run compression and reasoning on-device
- Local models and selection
  - The guide includes recommended models for memory tasks: qwen2.5-coder:7b, llama3.2:3b, mistral:7b-instruct, and others
  - Cost-aware model selection helps balance performance and price for memory compression and reasoning
- Embeddings
  - Local embeddings are shipped with the system via @xenova/transformers for offline operation
  - Options include Gemini embeddings, OpenAI embeddings, and other providers depending on configuration
- AGENTMEMORY settings
  - AGENTMEMORY_URL and AGENTMEMORY_SECRET control remote access and security
  - AGENTMEMORY_TOOLS controls the exposed tool surface (core vs all 53 tools)
  - Graph extraction and consolidation toggles allow you to tune memory behavior
- Multi-agent memory with AGENT_ID
  - AGENT_ID and AGENTMEMORY_AGENT_SCOPE provide two modes: shared (default) or isolated for strict separation
  - Isolated mode filters recall results by agent tag to prevent cross-agent leakage
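The difference between the shared and isolated modes can be illustrated with a small filter. This is a sketch of the behavior described above, not agentmemory's implementation: the function name, the "agent" tag field, and the substring matching are assumptions chosen to keep the example short.

```python
def recall(memories, query, agent_id, scope="shared"):
    """Return memories matching the query, honoring the agent scope.

    scope="shared": every agent sees every matching memory (the default).
    scope="isolated": only memories tagged with the calling agent are returned.
    """
    if scope not in ("shared", "isolated"):
        raise ValueError(f"unknown scope: {scope}")
    matches = [m for m in memories if query.lower() in m["text"].lower()]
    if scope == "isolated":
        matches = [m for m in matches if m.get("agent") == agent_id]
    return matches


# Hypothetical memories written by two different agents.
memories = [
    {"agent": "claude-code", "text": "JWT auth uses RS256 keys"},
    {"agent": "cursor", "text": "JWT refresh tokens expire after 7 days"},
    {"agent": "cursor", "text": "Prefer pnpm over npm"},
]
shared_hits = recall(memories, "jwt", agent_id="cursor")
isolated_hits = recall(memories, "jwt", agent_id="cursor", scope="isolated")
```

In shared mode the Cursor agent sees both JWT memories; in isolated mode it sees only the one it wrote itself.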
API and Endpoints: A Rich REST Surface
- 125 endpoints on port 3111
- Key endpoints include health checks, session management, memory capture, hybrid search, context generation, recall, remember, forget, enrich, and graph queries
- Security: endpoints can be protected with Authorization: Bearer tokens when AGENTMEMORY_SECRET is set
- The REST API exposes a comprehensive suite for programmatic access to memory operations, session management, and graph queries
- Documentation references point to the codebase for the full endpoint list (src/triggers/api.ts)
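A client for the REST API might attach the Bearer token along these lines. The post does not list endpoint paths, so "/health" below is only a placeholder for whatever the health-check route is actually called; the helper name and the secret value are likewise hypothetical. The request is built but not sent, so the snippet runs without a live server.

```python
import urllib.request


def build_request(base_url, path, secret=None):
    """Build a GET request, adding a Bearer token only when a secret is set."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    request.add_header("Accept", "application/json")
    if secret:
        request.add_header("Authorization", f"Bearer {secret}")
    return request


# Placeholder path and secret; port 3111 is the REST port named above.
req = build_request("http://localhost:3111", "/health", secret="example-secret")
open_req = build_request("http://localhost:3111/", "/health")
```

Passing the result to urllib.request.urlopen would send it to a running server.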
Deployment: One-Click Templates and Cloud Workflows
- One-click deployment templates for managed hosts
  - Each template ships a self-contained Dockerfile that pulls agentmemory from npm and includes the iii engine
  - Persistent storage is mounted at /data; the startup script configures a deploy-tuned memory server
- Deployment options:
  - Fly.io: one-click deploy
  - Railway: deploy template
  - Render: blueprint flow with automatic disk snapshots
  - Coolify: self-hosted on a VPS with the same Docker Compose stack
- The deployment templates ensure that only port 3113 (viewer) is exposed to the host, while internal endpoints remain secure
Why agentmemory Matters: A Practical Perspective
- The fundamental problem: most coding agents forget everything at session end
- The memory layer flips this paradigm by providing a robust, searchable, and versioned memory graph
- The memory pipeline captures, compresses, and indexes information so the agent can recall context, test results, and design decisions
- Built-in memory (CLAUDE.md, notepads, etc.) is replaced by a scalable memory graph that grows with use
- The result: sessions flow more smoothly, fewer re-explain events occur, and the agent demonstrates higher recall and improved performance over time
What You Get with agentmemory
- A persistent memory layer for every supported agent
- A unified, scalable, and auditable memory graph
- Real-time visibility into memory activity via a viewer and console
- A robust MCP toolkit with 53 tools and 8 skills
- A path to broader automation and multi-agent collaboration through agent-scoped memory
- Local and cost-aware model options to reduce reliance on cloud services
Licensing and Credits
- Agentmemory is released under the Apache-2.0 license
- It is designed to be used with iii and a wide range of MCP-compliant or REST-based agents
- The project emphasizes openness, reproducibility, and collaboration across the AI coding agent ecosystem
Sections and Visual Aids: Quick Reference to the Figures
- Banner: The banner image (banner.png) sets the tone for persistent memory across sessions.
- Quick Start and Section Headers
- The section icons (section-quickstart.svg, section-benchmarks.svg, section-competitors.svg, etc.) help readers navigate the document quickly.
- Real-Time Viewer and Traces
- Real-Time Viewer image (assets/tags/section-viewer.svg) introduces the viewer’s capabilities.
- iii console traces (assets/iii-console/traces-waterfall.png) illustrate the memory operation flow and performance visualization.
- Workers and Live Stats
- Agents and workers are visible in the “Workers page” image (assets/iii-console/workers.png).
- Architectural Perspective
- The “Powered by iii” section showcases the architecture, with a dedicated design diagram (assets/tags/section-architecture.svg).
Conclusion: A Future-Proof Memory Layer for AI Coding Agents
agentmemory transforms the way developers and AI assistants collaborate. By providing a persistent, searchable, and multi-agent memory system, it eliminates the repetitive, error-prone cycle of re-explaining context and preferences. It enables memory-driven workflows, where past decisions, tests, and patterns inform future actions. All agents share the same memory, so improvements in one workflow flow through the entire ecosystem. With transparent real-time viewing, robust MCP tooling, and flexible local-model options, agentmemory is well positioned to become the standard memory layer for AI coding agents in production environments.
Appendix: Quick Reference Commands
- Install and run the memory server
  - npm install -g @agentmemory/agentmemory
  - npx @agentmemory/agentmemory
- Seed and demo sessions
  - npx @agentmemory/agentmemory demo
- Connect an agent (example: Claude Code)
  - agentmemory connect claude-code
- Global install and maintenance
  - npm install -g @agentmemory/agentmemory
  - agentmemory stop
  - agentmemory remove
  - agentmemory upgrade
- Local environment and security
  - Create the env file at ~/.agentmemory/.env and customize keys
  - Set AGENTMEMORY_URL to point to a running server, whether local or remote
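To inspect the settings in that env file from a script, a small parser is enough. The sketch below assumes the file uses the conventional KEY=VALUE layout with optional quotes and "#" comments, which the post does not spell out; the sample values are placeholders, and this is not how agentmemory itself loads the file.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def load_env_file(path=None):
    """Read the env file if it exists; return an empty dict otherwise."""
    path = path or Path.home() / ".agentmemory" / ".env"
    return parse_env(path.read_text(encoding="utf-8")) if path.exists() else {}


# Placeholder contents for illustration.
sample = """
# agentmemory settings
AGENTMEMORY_URL=http://localhost:3111
AGENTMEMORY_SECRET="change-me"
"""
settings = parse_env(sample)
```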
Repository: https://github.com/rohitg00/agentmemory
