PilotDeck

PilotDeck: A WorkSpace-Centric Open-Source Agent Operating System for the AI-Driven Work Era
Introduction
In a world where AI agents increasingly take on long-running, multi-project workloads, PilotDeck is a purpose-built operating system designed around the core unit of work: the WorkSpace. Developed jointly by Tsinghua University’s THUNLP, ModelBest, OpenBMB, and AI9Stars, PilotDeck rethinks how memory, tasks, and models interact in an open-source, production-grade environment. It moves beyond one-shot Q&A and single-project tooling, offering an integrated platform that supports continuous work, traceable memory, and intelligent task routing across models and services.
What PilotDeck Is and Why It Matters
PilotDeck is an open-source agent platform focused on productivity for the Agent era. It treats the WorkSpace as the fundamental isolation and growth boundary: each project has its own files, memory store, and skill set, which prevents cross-project interference and gives precise control over what an AI remembers, uses, and produces. The platform’s design addresses several persistent questions in multi-project AI work:
- Memory traceability: If the AI errs, can you locate the exact memory entry responsible and edit it in place without restarting the entire conversation?
- Cost awareness: Can you track token usage per task and keep background agents economically viable?
- Model matching: Can tasks of varying difficulty automatically be assigned to different models to optimize quality and cost?
- Continuity: When you’re away, can the agent continue to progress and deliver results as local files with progress reports?
PilotDeck’s answer is delivered through three core pillars: White-box Memory, Smart Routing, and Always-on, all built around the WorkSpace concept and MCP (Model Context Protocol) compatibility. The system is designed to work consistently across front-ends—Web, CLI, and IM integrations—so you can switch interfaces without losing capabilities.
Key Highlights
- WorkSpace-Level Isolation & Accretion: Every project has its own sandboxed file system, memory store, and skill set. Parallel work stays isolated; retrieval remains bounded to the relevant WorkSpace; skills accumulate naturally as tasks grow.
- Image: WorkSpace isolation demo

- Traceable White-box Memory: Memory generation, storage, and retrieval are transparent end-to-end. When the agent misremembers something, you can identify the offending entry and correct it. Dream Mode consolidates idle memory and supports one-click rollback.
- Image: White-box memory demo

- Smart Routing & Cost Optimization: The system auto-detects task difficulty and routes each task to an appropriate model: flagship models for complex tasks, lighter models for simpler ones. This on-device/cloud orchestration reduces token spend while preserving quality, with roughly 70% cost savings reported below for a social-media workload.
- Image: Smart routing demo

- Always-on Background Execution: PilotDeck breaks the “you ask, it answers” loop. After you sign off, the agent keeps discovering tasks, monitoring long-horizon goals, and delivering final results as local files with summaries ready for you.
- Image: Always-on execution demo

Core Concepts: How the Three Pillars Work Together
- White-box Memory
  - Visibility: Every memory entry and its provenance can be viewed, making memory behavior auditable and improvable.
  - Control: Entries can be edited, removed, or pinned to prevent drift.
  - Traceability: End-to-end traceability from memory generation to retrieval.
  - Isolation: Memory lives and stays within its WorkSpace, avoiding cross-project bleed.
  - Reversibility: Dream Mode enables one-click rollback to prior states.
  - Visual aid: Memory management visuals help users understand what the AI remembers and why.
- Smart Routing
  - Auto-detection of task difficulty allows dynamic model assignment.
  - Complex tasks leverage flagship models (e.g., Claude 3.5 Sonnet, GPT-4o) while simpler subtasks use lighter sub-agents.
  - On-device and cloud coordination ensures efficient use of resources without sacrificing performance.
  - Demonstrations show substantial cost savings and improved outcomes across varied domains.
- Always-on
  - Ongoing task discovery even when the user is not actively interacting.
  - Long-horizon monitoring and maintenance of work progress.
  - Deliverables are landed as local files with concise summaries, enabling a smooth handoff to humans or downstream systems.
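The routing idea described above can be illustrated with a small sketch. This is not PilotDeck's router: the model names, the `Task` fields, the difficulty heuristic, and the 0.5 threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical model tiers; placeholder names, not PilotDeck identifiers.
FLAGSHIP_MODEL = "flagship-model"
LIGHT_MODEL = "light-model"


@dataclass(frozen=True)
class Task:
    description: str
    needs_planning: bool = False  # e.g. a planning checkpoint
    input_tokens: int = 0         # rough size of the context the task needs


def estimate_difficulty(task: Task) -> float:
    """Toy heuristic in [0, 1]; a real router would use a richer signal."""
    score = 0.6 if task.needs_planning else 0.0
    # Larger contexts count as harder; this term is capped at 0.4.
    score += min(task.input_tokens / 50_000, 1.0) * 0.4
    return score


def route(task: Task, threshold: float = 0.5) -> str:
    """Send hard tasks to the flagship tier, everything else to the light tier."""
    if estimate_difficulty(task) >= threshold:
        return FLAGSHIP_MODEL
    return LIGHT_MODEL


tasks = [
    Task("Plan the content calendar", needs_planning=True, input_tokens=8_000),
    Task("Fix punctuation in a caption", input_tokens=300),
]
assignments = {t.description: route(t) for t in tasks}
```

In this toy setup the planning task lands on the flagship tier and the punctuation fix on the light tier. The threshold is the single knob that trades quality against cost.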
Real-World Numbers: What the Pillars Deliver
PilotDeck’s three pillars translate into measurable improvements in production workflows, particularly in cost efficiency and performance for multi-task, multi-model workloads.
1) Smart Routing — ~70% cost savings on social-media workloads
- On Xiaohongshu-style social-media workloads, Smart Routing delegates simple polishing and formatting tasks to sub-agents (e.g., Sonnet 4.5) and reserves Opus 4.5 for planning checkpoints.
- Cost comparison (illustrative):
  - Smart Routing ON: Opus 4.5 main + Sonnet 4.5 sub, about $2.83 (1.1x multiplier).
  - Smart Routing OFF: all Opus 4.5 (main + sub), about $12.58 (5.0x multiplier).
  - Monolithic (single model, long-running): about $12.20 (4.8x multiplier).
- At these illustrative prices, the routed setup costs less than a quarter of either alternative while remaining practical for real-world social-media workflows.
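As a quick check of the comparison above, the savings can be recomputed from the three dollar figures. The sketch uses only the numbers quoted in this post; the `savings` helper and the dictionary keys are illustrative names.

```python
# Costs in USD, copied from the illustrative comparison above.
costs = {
    "routing_on": 2.83,
    "routing_off": 12.58,
    "monolithic": 12.20,
}


def savings(cheaper: float, baseline: float) -> float:
    """Fraction of the baseline cost that is avoided."""
    return 1 - cheaper / baseline


vs_off = savings(costs["routing_on"], costs["routing_off"])
vs_mono = savings(costs["routing_on"], costs["monolithic"])

print(f"vs routing off: {vs_off:.1%}")
print(f"vs monolithic:  {vs_mono:.1%}")
```

With these figures the savings come out at roughly 77% against both alternatives, slightly above the rounded "~70%" in the heading.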
2) Smart Routing — 1/6 the cost while beating frontier models on hard tasks
- Benchmarked across seven complex tasks (multilingual podcasts, multi-source data reports, literature reviews, codebase docs, etc.).
- A two-model setup (main + light sub) matches or beats frontier single-model configurations at a fraction of the cost:
  - MiniMax-M2.7 single-agent: score 37.1, cost $1.90
  - Claude Sonnet 4.6 single-agent: score 69.1, cost $18.36
  - Sonnet 4.6 (main) + MiniMax-M2.7 (sub): score 70.6, cost $3.15
- The two-model routing configuration scores 1.5 points higher than Sonnet 4.6 alone at roughly one-sixth of its cost.
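The "1/6 the cost" claim can be checked directly against the three benchmark rows. The sketch below uses only the scores and costs listed above; the points-per-dollar figure is a derived metric added here for illustration, not one reported by the project.

```python
# (score, cost in USD) for each configuration, copied from the list above.
results = {
    "MiniMax-M2.7 single-agent": (37.1, 1.90),
    "Claude Sonnet 4.6 single-agent": (69.1, 18.36),
    "Sonnet 4.6 (main) + MiniMax-M2.7 (sub)": (70.6, 3.15),
}

baseline_score, baseline_cost = results["Claude Sonnet 4.6 single-agent"]
routed_score, routed_cost = results["Sonnet 4.6 (main) + MiniMax-M2.7 (sub)"]

cost_ratio = routed_cost / baseline_cost     # share of the single-model cost
score_delta = routed_score - baseline_score  # benchmark points gained

# Derived metric for illustration only: benchmark points bought per dollar.
points_per_dollar = {
    name: score / cost for name, (score, cost) in results.items()
}
```

Here `cost_ratio` is about 0.17, which is close to one-sixth, and `score_delta` is about 1.5 points in favor of the routed setup.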
3) White-box Memory — layout & tone never bleed across projects
- Black-box agents with shared memory pools risk cross-project leakage and drift.
- PilotDeck’s white-box approach ensures:
  - Visibility: Every memory item is visible, with context about when and where it was stored.
  - Control: Ability to edit or delete memory entries and pin crucial decisions.
  - Traceability: Generation → extraction → storage → retrieval is auditable.
  - Isolation: Memory is scoped per WorkSpace, ensuring A’s memory never leaks into B.
  - Reversibility: Dream Mode supports rollback to a prior known-good state.
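To make the five properties above concrete, here is a minimal in-memory sketch. It is not PilotDeck's memory system: the class names, fields, and methods are hypothetical, and `consolidate` is only a stand-in for an idle-time pass, since the post does not describe what Dream Mode actually does internally.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MemoryEntry:
    entry_id: int
    text: str
    source: str           # provenance: where the entry came from
    pinned: bool = False


@dataclass
class WorkspaceMemory:
    """Memory scoped to one WorkSpace; nothing is shared between instances."""
    name: str
    entries: Dict[int, MemoryEntry] = field(default_factory=dict)
    snapshots: List[Dict[int, MemoryEntry]] = field(default_factory=list)
    next_id: int = 1

    def add(self, text: str, source: str) -> int:
        entry_id = self.next_id
        self.entries[entry_id] = MemoryEntry(entry_id, text, source)
        self.next_id += 1
        return entry_id

    def edit(self, entry_id: int, new_text: str) -> None:
        self.entries[entry_id].text = new_text

    def pin(self, entry_id: int) -> None:
        self.entries[entry_id].pinned = True

    def consolidate(self) -> None:
        """Snapshot the current state, then keep only pinned entries."""
        self.snapshots.append(copy.deepcopy(self.entries))
        self.entries = {i: e for i, e in self.entries.items() if e.pinned}

    def rollback(self) -> None:
        """Restore the most recent snapshot, if there is one."""
        if self.snapshots:
            self.entries = self.snapshots.pop()

    def search(self, keyword: str) -> List[MemoryEntry]:
        needle = keyword.lower()
        return [e for e in self.entries.values() if needle in e.text.lower()]


project_a = WorkspaceMemory("project-a")
project_b = WorkspaceMemory("project-b")

tone = project_a.add("Use a formal tone", source="chat 1")
draft = project_a.add("Headings are blue", source="chat 2")
project_a.edit(draft, "Headings are dark gray")  # fix the entry in place
project_a.pin(tone)

project_a.consolidate()  # only the pinned entry remains
after_consolidate = len(project_a.entries)
project_a.rollback()     # both entries are back
```

Because each `WorkspaceMemory` owns its own dictionary, a search in `project_b` never sees entries from `project_a`. That is the isolation property in its simplest form.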
UI, Demos, and Use Cases: What You Can Build with PilotDeck
PilotDeck ships with a web-based UI that provides out-of-the-box WorkSpace management, memory editing, and visualization of multi-agent collaboration. Demos illustrate end-to-end results generated entirely by edge-side models via Smart Routing, without requiring cloud-side frontier models.
Use Case Highlights
- Work Document Generation
  - Prompt: "Survey the Chinese LLM application market and turn it into a formal HTML white paper."
  - Process and result are demonstrated through dynamic visuals.
- Mini-Game Development
  - Prompt: "Walk me through building an iOS AR mini-game Ball Finder in Vibe Coding mode."
  - Process and result showcased with visuals.
- AI Engineering Platform Development
  - Prompt: "Build a low-code embedding fine-tuning platform from scratch."
  - Visual narrative with process and final result.
- Audio-Video Editing & Social Media Operations
  - Prompt: "Push this English podcast to a global audience in multiple languages."
  - In addition to visuals, the output includes an audio-enabled result.
  - Result: a hosted asset (example link) https://github.com/user-attachments/assets/a7245467-ee3c-4939-a055-c56576ac56d1
Installation & Quick Start: Getting PilotDeck Running
PilotDeck offers three ways to get started: a one-line install, a source-based development workflow, and Docker-based deployment.
Option A: One-line install (recommended for macOS / Linux)
- Command:
  - curl -fsSL https://raw.githubusercontent.com/OpenBMB/PilotDeck/main/install.sh | bash
- What it does:
  - Installs Node.js 22, clones the repository, installs dependencies, and builds the frontend.
- Start and check status:
  - pilotdeck
  - pilotdeck status
Option B: From source (for developers)
- Prerequisites: Git LFS for large media assets.
- Steps:
  - git clone https://github.com/OpenBMB/PilotDeck.git
  - cd PilotDeck
  - npm install
  - cd ui && npm install
  - cd ..
- Notes:
  - If you don’t need the demo videos, you can skip downloading large assets by setting GIT_LFS_SKIP_SMUDGE=1 before cloning.
Option C: Docker Compose
- If you have Docker, you can run PilotDeck with:
  - docker compose up -d
Options B and C come with corresponding configuration guidance and examples that you can tailor to your environment.
From Source to Runtime: Configuring Providers and Running Services
- Model providers configuration
  - PilotDeck reads a YAML file at ~/.pilotdeck/pilotdeck.yaml.
  - You can configure providers and endpoints through either a prepared bootstrap script or the Web UI settings panel.
  - The example YAML lists model providers and protocol endpoints. OpenAI-compatible endpoints are supported, by default or via DeepSeek, MiniMax, Qwen, Kimi, and others.
- Running the UI and backend
  - In development:
    - cd ui && npm run dev (Hot Module Replacement; visit http://localhost:5173)
  - For production:
    - cd ui && npm run start (visit http://localhost:3001)
Docker-based deployment is also supported to streamline setup in containerized environments.
Extension Protocol: How to Extend PilotDeck
PilotDeck supports an open plugin architecture that cleanly separates the core open-source engine from plugin customizations. This design enables robust extensibility through a plugin.json mechanism.
Key extension points
- MCP Servers: First-class integration with any Model Context Protocol server.
- Tools & Skills: Register custom tools, or leverage community skills via ClawHub.
- Lifecycle Hooks: Intercept critical events such as PreToolUse and UserPromptSubmit.
- Custom Memory: Plug in your own memory store provider, enabling bespoke memory strategies.
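The lifecycle-hook idea can be sketched as a small event registry. Only the event names `PreToolUse` and `UserPromptSubmit` come from the list above. The registry class, the payload shape, and the "return None to block" contract are assumptions for illustration and do not reflect PilotDeck's actual plugin API.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# Assumed contract: a hook receives the event payload and returns either a
# (possibly modified) payload or None to block the action.
Hook = Callable[[dict], Optional[dict]]


class HookRegistry:
    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = defaultdict(list)

    def on(self, event: str, hook: Hook) -> None:
        self._hooks[event].append(hook)

    def dispatch(self, event: str, payload: dict) -> Optional[dict]:
        """Run hooks in registration order; stop as soon as one blocks."""
        current: Optional[dict] = payload
        for hook in self._hooks[event]:
            current = hook(current)
            if current is None:
                return None
        return current


def block_recursive_delete(payload: dict) -> Optional[dict]:
    # Hypothetical policy: refuse tool calls whose command contains "rm -rf".
    if "rm -rf" in payload.get("command", ""):
        return None
    return payload


def trim_prompt(payload: dict) -> Optional[dict]:
    # Return a new payload with surrounding whitespace removed from the prompt.
    return {**payload, "prompt": payload["prompt"].strip()}


registry = HookRegistry()
registry.on("PreToolUse", block_recursive_delete)
registry.on("UserPromptSubmit", trim_prompt)

blocked = registry.dispatch("PreToolUse", {"tool": "shell", "command": "rm -rf /tmp/x"})
allowed = registry.dispatch("PreToolUse", {"tool": "shell", "command": "ls"})
cleaned = registry.dispatch("UserPromptSubmit", {"prompt": "  hello  "})
```

In this sketch a blocked call returns `None`, so the caller can skip the tool invocation. A real plugin system would likely report why the action was blocked as well.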
Community and Contributions: Join the PilotDeck Collective
- Bugs and feature requests: GitHub issues page.
- Community channels:
  - WeChat Community
  - Feishu Community
  - Discord Community
QR codes for quick access
- WeChat: QR code image

- Feishu: QR code image

- Discord: QR code image

Contributing and Roadmap
PilotDeck invites contributors to participate in advancing the next generation of agent-oriented tooling. The workflow is straightforward:
- Fork the repository
- Create a feature branch
- Submit a pull request
Acknowledgements: Honoring the Pioneers and Open-Source Foundations
PilotDeck acknowledges the influence and inspiration from a broad ecosystem of agent OS pioneers and open-source contributors. It builds upon and integrates with several foundational projects:
- ClawXRouter: Intelligent model routing
- ClawXMemory: Agent memory system
- Claude Code UI: Web UI reference
- Claude Code Router: Model routing reference
- UltraRAG: RAG framework
- Anthropics Skills: Agent skill framework
- Vercel Labs Skills: find-skills capability
- MiniMax-AI Skills: minimax-based skills
- Frontend tooling and UI libraries like Vite, React, Tailwind CSS, and shadcn/ui
- Community-driven skill ecosystems and vivid examples of agent-driven automations
Joint Development
PilotDeck is a collaborative effort among leading institutions and organizations:
- Tsinghua University (THUNLP)
- ModelBest
- OpenBMB
- AI9Stars
Licensing and Usage Rights
This project is licensed under the GNU Affero General Public License v3.0. The license emphasizes open access and community-driven improvements while ensuring that networked deployments remain open and auditable.
How to Get Started Today
- Explore the official website and live demo to see PilotDeck in action:
  - Website: pilotdeck.openbmb.cn
  - Live Demo: pilotdeck.openbmb.cn/pilotdeck.github.io/demo/p/pilotdeck-demo
- Check the documentation for introductions, tutorials, and quick starts:
  - Documentation: pilotdeck.openbmb.cn/pilotdeck.github.io/docs/en/introduction
- Join the community channels and start experimenting with WorkSpaces, memory management, and smart routing
  - WeChat and Feishu QR codes are available in the community section
  - Discord channel for real-time discussions and collaboration
A Word on the Vision
PilotDeck aspires to redefine productivity in the Agent era by normalizing long-running, multi-project AI workflows that are auditable, cost-aware, and resilient. By isolating WorkSpaces, making memory traceable, routing intelligently between model capabilities, and enabling always-on operation, PilotDeck seeks to transform how teams orchestrate AI-assisted work across domains.
Images and Visual Narrative
Across the blog post, images serve as visual anchors for the concepts:
- The banner introduction captures the brand and mission
- WorkSpace isolation demonstrates how projects stay distinct
- White-box memory visuals illustrate memory traceability and debugging
- Smart routing animations embody dynamic model assignment
- Always-on execution visuals show ongoing background activity
- Case-focused demos showcase end-to-end results for Work Document Generation, Mini-Game Development, AI Engineering Platform Development, and Podcast & Audio/Video Editing
- Community QR codes (WeChat, Feishu, Discord) for one-tap access
Closing Reflection
PilotDeck is more than a collection of features; it represents a shift toward a more disciplined, transparent, and scalable way to manage AI-driven work. By giving teams a robust framework to manage memory, route tasks intelligently, and keep work progressing in the background, PilotDeck aims to make AI-assisted productivity practical, economical, and auditable in real-world settings. If you’re building multi-project AI workflows, PilotDeck offers a compelling path forward: an open-source platform that invites collaboration, experimentation, and shared progress toward a more capable agent-enabled future.
Repository: https://github.com/OpenBMB/PilotDeck