OpenCLI: Deterministic Interfaces for Humans and AI Agents
- Overview
- OpenCLI is a unifying platform that turns websites, browser sessions, Electron apps, and local tools into deterministic interfaces for both humans and AI agents. It reuses your logged-in browser state to automate live workflows and crystallize repeated actions into reusable CLI commands.
- The system provides a single surface capable of three different kinds of automation:
- Built-in adapters for popular sites such as Bilibili, Zhihu, Xiaohongshu, Reddit, Hacker News, Twitter/X, and many more.
- Live browser driving via opencli browser, allowing AI agents to click, type, extract, and inspect pages in real time.
- Adapter generation from actual browser behavior using explore, synthesize, generate, and cascade. OpenCLI also serves as a CLI hub for local tools (gh, docker, and other binaries you register) and provides desktop app adapters for Electron apps.
- Visual highlights include a desktop-app control surface, direct browser automation, deterministic CLI conversion of websites, account-safe session reuse, agent-ready capabilities, and a hub for external CLIs. A key promise is zero runtime token cost and deterministic outputs for CI-friendly pipelines.
- The OpenCLI ecosystem is designed to be extensible, enabling beyond-website automation: you can integrate local binaries, command-line tools, and Electron-based desktop apps, each with its own documented adapters and control surfaces.
- Key Features and Capabilities
- Built-in adapters for 87+ sites and commands
- These adapters provide stable, deterministic commands for well-known sites, with the option to explore and extend coverage for new sites using a generation workflow.
- Example sites include major social platforms, content aggregators, forums, shopping portals, and media services.
- Browser-driven automation (live control)
- The browser surface lets AI agents interact with pages as a human would: clicking, typing, selecting, waiting for elements, capturing data, and rendering screenshots.
- Commands exposed for browser actions include open, state, click, type, select, keys, wait, get, screenshot, scroll, back, eval, network, init, verify, and close.
- Website-to-CLI conversion
- OpenCLI can turn any website into a deterministic CLI command, offering a path from exploratory browser behavior to reusable, scriptable actions.
- The platform ships with many adapters out of the box and supports generating new adapters from real browser activity via dedicated workflows.
- Account-safe operation
- OpenCLI reuses your existing Chrome/Chromium login state, ensuring credentials do not leave the browser session and reducing the risk of credential leakage.
- AI agent readiness
- The agent ecosystem uses a tiered approach: explore discovers APIs and capabilities, synthesize turns exploration artifacts into adapters, cascade helps discover authentication strategies, and browser commands offer direct control.
- CLI hub for local tools
- OpenCLI discovers, auto-installs, and passes through commands to external CLIs such as gh, docker, obsidian, and others, creating a single discovery surface.
- Desktop app adapters
- Control Electron apps (Cursor, Codex, Antigravity, ChatGPT, Notion, and more) from the terminal via a CDP-backed bridge.
- Zero LLM cost
- Runtime usage does not incur token costs; heavy automation can run thousands of times without token-based billing.
- Deterministic outputs
- The same command yields the same output schema every time, making pipelines, scripts, and CI integrations reliable and predictable.
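- The "same output schema every time" promise can be checked mechanically in a pipeline. The sketch below is a minimal illustration, assuming that the JSON output of a command is a list of objects; the field names `rank` and `title` are hypothetical sample data, not a documented OpenCLI schema.

```python
import json


def schema_of(rows):
    """Return the sorted key tuple shared by every row; raise if rows disagree."""
    if not rows:
        return ()
    first = tuple(sorted(rows[0].keys()))
    for index, row in enumerate(rows[1:], start=1):
        keys = tuple(sorted(row.keys()))
        if keys != first:
            raise ValueError(f"row {index} has keys {keys}, expected {first}")
    return first


def same_schema(run_a_json, run_b_json):
    """Compare the schema of two captured JSON outputs (assumed: list of objects)."""
    return schema_of(json.loads(run_a_json)) == schema_of(json.loads(run_b_json))


# Hypothetical captured outputs from two runs of the same command.
run_1 = '[{"rank": 1, "title": "A"}, {"rank": 2, "title": "B"}]'
run_2 = '[{"rank": 1, "title": "C"}]'
print(same_schema(run_1, run_2))  # True
```

- A check like this compares keys only, so the values may differ between runs while the schema stays fixed.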
- Quick Start Guide
- 1. Install OpenCLI
- Command: npm install -g @jackwener/opencli
- 2. Install the Browser Bridge extension
- OpenCLI connects to Chrome/Chromium through a lightweight Browser Bridge extension plus a small local daemon.
- Steps:
- Download the latest opencli-extension-v{version}.zip from the GitHub Releases page.
- Unzip, go to chrome://extensions, enable Developer mode.
- Click Load unpacked and select the unzipped folder.
- 3. Verify the setup
- Command: opencli doctor
- 4. Run your first commands
- Examples:
- opencli list
- opencli hackernews top --limit 5
- opencli bilibili hot --limit 5
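- The verification and first-run commands above can be chained in a script that stops at the first failure. This is a minimal sketch, not part of OpenCLI: `run_sequence` and the injectable `runner` are hypothetical helpers, and the demo uses a stub runner so that it works without OpenCLI installed.

```python
import subprocess

# Commands taken from the quick start steps above.
QUICK_START = [
    ["opencli", "doctor"],
    ["opencli", "list"],
    ["opencli", "hackernews", "top", "--limit", "5"],
    ["opencli", "bilibili", "hot", "--limit", "5"],
]


def run_sequence(commands, runner=None):
    """Run commands in order; return (index, exit_code) of the first failure, else None."""
    if runner is None:
        # Default runner executes the real command and returns its exit code.
        runner = lambda argv: subprocess.run(argv).returncode
    for index, argv in enumerate(commands):
        code = runner(argv)
        if code != 0:
            return index, code
    return None


# Stub runner for illustration: pretend the bilibili command needs a login.
def fake_runner(argv):
    return 77 if argv[1] == "bilibili" else 0


print(run_sequence(QUICK_START, runner=fake_runner))  # (3, 77)
```

- Calling `run_sequence(QUICK_START)` without a runner executes the real commands, which requires OpenCLI on the PATH.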
- For Humans: How to Use OpenCLI Directly
- The primary commands surface a stable and predictable experience:
- opencli list: reveals every registered command and adapter available.
- opencli …: runs a built-in or generated adapter.
- opencli register mycli: exposes a local CLI through the same discovery surface.
- opencli doctor: diagnostics for browser connectivity and setup health.
- The human workflow emphasizes reliability and repeatability, enabling operators to script actions for testing, data gathering, and automation without re-learning each site’s quirks.
- The design intent is to let humans build consistent routines quickly, while AI agents gain a unified interface to operate across websites, apps, and tools.
- For AI Agents: How to Leverage OpenCLI
- Entry points vary by task and capability:
- The explorer skill (skills/opencli-explorer) covers the creation of new adapters and supports fully automated generation and manual exploration workflows.
- The browser skill (skills/opencli-browser) provides a low-level surface for live browsing, debugging, and intervention.
- Quick-start automation for agents:
- Start with opencli-explorer when you need a reusable command for a site.
- Use opencli-browser when the agent must inspect or steer a page directly.
- Packaged skills to install
- Install all needed components:
- npx skills add jackwener/opencli
- Or install selectively:
- npx skills add jackwener/opencli --skill opencli-usage
- npx skills add jackwener/opencli --skill opencli-browser
- npx skills add jackwener/opencli --skill opencli-explorer
- npx skills add jackwener/opencli --skill opencli-oneshot
- Example usage patterns
- opencli explore https://example.com --site mysite
- opencli synthesize mysite
- opencli generate https://example.com --goal "hot"
- opencli cascade https://api.example.com/data
- Core browser command capabilities
- open, state, click, type, select, keys, wait, get, screenshot, scroll, back, eval, network, init, verify, close
- Practical guidance
- Start with opencli-explorer for reusable site commands.
- Switch to opencli-browser when page inspection or direct manipulation is needed.
- Core Concepts and Architecture
- browser: live control
- Use when the task is inherently interactive and the agent must operate a page directly.
- Built-in adapters: stable commands
- For sites where deterministic capabilities exist, use site-specific commands like hackernews top or reddit hot.
- explore / synthesize / generate: create new CLIs
- explore: inspects a page, network activity, and capability surface.
- synthesize: converts exploration artifacts into evaluate-based JavaScript adapters.
- generate: completes the path with a verified adapter or a clear human-review explanation.
- cascade: auth strategy discovery
- Probes fallback authentication paths such as public endpoints, cookies, and custom headers before finalizing an adapter design.
- CLI Hub and desktop adapters
- OpenCLI serves as a universal hub for local CLIs (gh, docker, obsidian, etc.) and provides CDP-backed control of Electron-based desktop apps.
- Prerequisites
- Node.js version >= 21.0.0 (or Bun >= 1.0).
- Chrome/Chromium running and logged into the target site for browser-backed commands.
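- The Node.js requirement can be checked before installation. The sketch below parses the text printed by `node --version` (for example `v21.0.0`) and compares it with the minimum stated above; the helper names are hypothetical and the Bun alternative is not covered.

```python
import re

MIN_NODE = (21, 0, 0)  # minimum version stated in the prerequisites


def parse_node_version(text):
    """Parse output such as 'v21.0.0' into a (major, minor, patch) tuple."""
    match = re.match(r"^v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if not match:
        raise ValueError(f"unrecognized Node.js version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def node_is_supported(text, minimum=MIN_NODE):
    """Return True when the reported version meets the minimum."""
    return parse_node_version(text) >= minimum


print(node_is_supported("v21.0.0"))   # True
print(node_is_supported("v20.11.1"))  # False
```

- In practice the input string would come from something like `subprocess.run(["node", "--version"], capture_output=True, text=True).stdout`.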
- Configuration and Environment
- The configuration layer uses environment variables to tune performance, debugging, and behavior:
- OPENCLI_DAEMON_PORT: 19825 by default; HTTP port for the daemon-extension bridge.
- OPENCLI_WINDOW_FOCUSED: false by default; set to 1 to force automation windows to the foreground for debugging.
- OPENCLI_BROWSER_CONNECT_TIMEOUT: 30 by default; seconds to wait for the browser connection.
- OPENCLI_BROWSER_COMMAND_TIMEOUT: 60 by default; seconds allowed for a single browser command.
- OPENCLI_BROWSER_EXPLORE_TIMEOUT: 120 by default; seconds allowed for explore/record operations.
- OPENCLI_CDP_ENDPOINT: the Chrome DevTools Protocol endpoint for remote browsers or Electron apps.
- OPENCLI_CDP_TARGET: optional URL substring used to filter CDP targets.
- OPENCLI_VERBOSE: false by default; enables verbose logs (the -v flag is an alternative).
- OPENCLI_DIAGNOSTIC: false by default; capture structured diagnostic context on failures.
- DEBUG_SNAPSHOT: 0 by default; enable DOM snapshot debug output when set.
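- A wrapper script may want to read the same settings with the same defaults. The sketch below is an illustration only: it assumes the underscore-separated variable spellings (such as OPENCLI_DAEMON_PORT), and the `BridgeSettings` class and the truthy-value parsing in `_flag` are hypothetical, not OpenCLI's own parsing rules.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeSettings:
    daemon_port: int
    window_focused: bool
    connect_timeout_s: int
    command_timeout_s: int
    explore_timeout_s: int
    verbose: bool


def _flag(value):
    """Assumed truthy spellings; OpenCLI itself may accept a different set."""
    return value.strip().lower() in {"1", "true", "yes"}


def load_settings(env=None):
    """Read settings from a mapping (default: os.environ) using the documented defaults."""
    env = os.environ if env is None else env
    return BridgeSettings(
        daemon_port=int(env.get("OPENCLI_DAEMON_PORT", "19825")),
        window_focused=_flag(env.get("OPENCLI_WINDOW_FOCUSED", "false")),
        connect_timeout_s=int(env.get("OPENCLI_BROWSER_CONNECT_TIMEOUT", "30")),
        command_timeout_s=int(env.get("OPENCLI_BROWSER_COMMAND_TIMEOUT", "60")),
        explore_timeout_s=int(env.get("OPENCLI_BROWSER_EXPLORE_TIMEOUT", "120")),
        verbose=_flag(env.get("OPENCLI_VERBOSE", "false")),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
print(load_settings({}).daemon_port)                                # 19825
print(load_settings({"OPENCLI_DAEMON_PORT": "20000"}).daemon_port)  # 20000
```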
- Update and installation guidance
- For a fresh install or upgrade:
- npm install -g @jackwener/opencli@latest
- npx skills add jackwener/opencli
- You can refresh only the skills you actually use:
- npx skills add jackwener/opencli --skill opencli-usage
- npx skills add jackwener/opencli --skill opencli-browser
- npx skills add jackwener/opencli --skill opencli-explorer
- npx skills add jackwener/opencli --skill opencli-oneshot
- Built-in Commands and Adapters
- The system ships with a broad catalog of built-in commands mapped to specific sites:
- xiaohongshu: search, note, comments, feed, user, download, publish, notifications, creator-notes, creator-notes-summary, creator-note-detail, creator-profile, creator-stats
- bilibili: hot, search, history, feed, ranking, download, comments, dynamic, favorite, following, me, subtitle, user-videos
- tieba: hot, posts, search, read
- hupu: hot, search, detail, mentions, reply, like, unlike
- twitter: trending, search, timeline, lists, bookmarks, post, download, profile, article, like, likes, notifications, reply, reply-dm, thread, follow, unfollow, followers, following, block, unblock, bookmark, unbookmark, delete, hide-reply, accept
- reddit: hot, frontpage, popular, search, subreddit, read, user, user-posts, user-comments, upvote, upvoted, save, saved, comment, subscribe
- zhihu: hot, search, question, download, follow, like, favorite, comment, answer
- amazon: bestsellers, search, product, offer, discussion, movers-shakers, new-releases
- 1688: search, item, assets, download, store
- gitee: trending, search, user
- gemini: new, ask, image, deep-research, deep-research-result
- yuanbao: new, ask
- notebooklm: status, list, open, current, get, history, summary, note-list, notes-get, source-list, source-get, source-fulltext, source-guide
- spotify: auth, status, play, pause, next, prev, volume, search, queue, shuffle, repeat
- xianyu: search, item, chat
- xiaoe: courses, detail, catalog, play-url, content
- quark: ls, mkdir, mv, rename, rm, save, share-tree
- uiverse: code, preview
- nowcoder: hot, trending, topics, recommend, creators, companies, jobs, search, suggest, experience, referral, salary, papers, practice, notifications, detail
- xiaoyuzhou: podcast, podcast-episodes, episode, download, transcript
- The collection totals 87+ adapters, illustrating broad coverage. See the adapters index for the complete list: ./docs/adapters/index.md
- CLI Hub: external CLIs such as gh, obsidian, docker, lark-cli, dingtalk, wecom, vercel, etc., can be discovered and invoked through OpenCLI. If a tool isn’t installed, OpenCLI can auto-install it before re-running the command.
- Local CLI registration: add new CLIs by registering them with opencli register mycli, so AI agents can discover and execute them via the same surface.
- Desktop App Adapters
- OpenCLI does more than websites; it provides adapters to control Electron desktop apps via CDP:
- Cursor: control Cursor IDE features like Composer, chat, and code extraction. Doc: ./docs/adapters/desktop/cursor.md
- Codex: drive OpenAI Codex CLI agent headlessly. Doc: ./docs/adapters/desktop/codex.md
- Antigravity: control Antigravity Ultra from the terminal. Doc: ./docs/adapters/desktop/antigravity.md
- ChatGPT App: automate the macOS ChatGPT desktop app. Doc: ./docs/adapters/desktop/chatgpt-app.md
- ChatWise: multi-LLM client (GPT-4, Claude, Gemini). Doc: ./docs/adapters/desktop/chatwise.md
- Notion: search, read, write Notion pages. Doc: ./docs/adapters/desktop/notion.md
- Discord: Discord Desktop—messages, channels, servers. Doc: ./docs/adapters/desktop/discord.md
- Doubao: control Doubao AI desktop app via CDP. Doc: ./docs/adapters/desktop/doubao-app.md
- To add a new Electron app adapter, you should start with the electron-app-cli guide available at docs/guide/electron-app-cli.md.
- Download Support (Media and Content)
- OpenCLI supports downloading images, videos, and articles from supported platforms. Content types and notes:
- Xiaohongshu: images and videos; downloads all media from a note.
- bilibili: videos; requires yt-dlp installed.
- Twitter: images and videos; from user media tab or a single tweet.
- Douban: images; poster/stills lists.
- Pixiv: images; original-quality illustrations, multi-page.
- 1688: images and videos; downloads page-visible product media.
- Xiaoyuzhou: audio and transcript; downloads episode audio and transcript JSON/text with local credentials.
- Zhihu: articles (Markdown); exports with optional image download.
- Weixin: articles (Markdown); WeChat Official Account content.
- For video downloads on some sites, ensure yt-dlp is installed (e.g., using brew install yt-dlp).
- Usage examples:
- opencli xiaohongshu download "…" --output ./xhs
- opencli bilibili download BV1xxx --output ./bilibili
- opencli twitter download elonmusk --limit 20 --output ./twitter
- opencli 1688 download 841141931191 --output ./1688-downloads
- opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou
- opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts
- Note: Xiaoyuzhou download and transcript require local Xiaoyuzhou credentials stored at ~/.opencli/xiaoyuzhou.json.
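- The two prerequisites mentioned in this section (yt-dlp for bilibili video downloads, a local credentials file for Xiaoyuzhou) can be checked before a download starts. This is a minimal sketch; `missing_prerequisites` is a hypothetical helper, and it covers only the requirements stated above.

```python
import shutil
from pathlib import Path


def missing_prerequisites(site, home=None, which=shutil.which):
    """Return human-readable problems that would block a download for the given site."""
    home = Path.home() if home is None else Path(home)
    missing = []
    if site == "bilibili" and which("yt-dlp") is None:
        missing.append("yt-dlp not found on PATH")
    if site == "xiaoyuzhou":
        credentials = home / ".opencli" / "xiaoyuzhou.json"
        if not credentials.is_file():
            missing.append(f"missing credentials file: {credentials}")
    return missing


# Stubbed lookup for illustration: pretend yt-dlp is not installed.
print(missing_prerequisites("bilibili", which=lambda name: None))
# ['yt-dlp not found on PATH']
```

- The `which` and `home` parameters exist only so the check can be exercised without touching the real machine state.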
- Output Formats and Pipelines
- All built-in commands support format options: --format or -f
- Supported formats: table (default), json, yaml, md, csv
- Examples:
- opencli bilibili hot -f json
- opencli bilibili hot -f csv
- opencli bilibili hot -v
- These formats enable easy piping into jq, LLMs, or downstream automation.
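- For downstream automation, a script can build the command line with an explicit format and then parse the JSON it captures. The sketch below uses only the flags shown in this document (-f and --limit); the helper names are hypothetical, and the JSON shape (a list of objects with a `title` field) is an assumption for illustration.

```python
import json

FORMATS = ("table", "json", "yaml", "md", "csv")  # formats listed above


def build_argv(site, command, fmt="table", limit=None):
    """Build an argument list such as: opencli bilibili hot --limit 5 -f json."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    argv = ["opencli", site, command]
    if limit is not None:
        argv += ["--limit", str(limit)]
    argv += ["-f", fmt]
    return argv


def extract_field(json_text, key="title"):
    """Collect one field from JSON output, assuming a list of objects."""
    return [row[key] for row in json.loads(json_text) if key in row]


print(build_argv("bilibili", "hot", fmt="json", limit=5))
# ['opencli', 'bilibili', 'hot', '--limit', '5', '-f', 'json']

# Hypothetical captured output, not real OpenCLI data.
sample = '[{"title": "First"}, {"title": "Second"}]'
print(extract_field(sample))  # ['First', 'Second']
```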
- Exit codes
- OpenCLI follows Unix sysexits.h conventions to integrate with shell pipelines and CI:
- 0: Success
- 1: Generic error
- 2: Usage error
- 66: Empty result (EX_NOINPUT)
- 69: Service unavailable (EX_UNAVAILABLE)
- 75: Temporary failure (EX_TEMPFAIL)
- 77: Auth required (EX_NOPERM)
- 78: Config error (EX_CONFIG)
- 130: Interrupted (SIGINT)
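- A calling script can turn the exit codes above into a decision. The codes in the sketch below come from the list above; the policy in `next_action` (which codes to retry, skip, or treat as fatal) is an assumption chosen for illustration, not behavior defined by OpenCLI.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    GENERIC_ERROR = 1
    USAGE_ERROR = 2
    EMPTY_RESULT = 66         # EX_NOINPUT
    SERVICE_UNAVAILABLE = 69  # EX_UNAVAILABLE
    TEMPORARY_FAILURE = 75    # EX_TEMPFAIL
    AUTH_REQUIRED = 77        # EX_NOPERM
    CONFIG_ERROR = 78         # EX_CONFIG
    INTERRUPTED = 130         # SIGINT


# Assumed policy: only transient conditions are worth retrying.
RETRYABLE = {ExitCode.SERVICE_UNAVAILABLE, ExitCode.TEMPORARY_FAILURE}


def next_action(code):
    """Map an exit code to one of: continue, skip, retry, reauthenticate, fail."""
    try:
        known = ExitCode(code)
    except ValueError:
        return "fail"  # unknown codes are treated as fatal
    if known is ExitCode.SUCCESS:
        return "continue"
    if known is ExitCode.EMPTY_RESULT:
        return "skip"
    if known in RETRYABLE:
        return "retry"
    if known is ExitCode.AUTH_REQUIRED:
        return "reauthenticate"
    return "fail"


print(next_action(75))  # retry
print(next_action(77))  # reauthenticate
```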
- Troubleshooting
- “Extension not connected”: ensure the Browser Bridge extension is installed and enabled.
- “attach failed: Cannot access a chrome-extension:// URL”: another extension might be interfering; try disabling others.
- “Empty data or Unauthorized”: login session may have expired; re-authenticate on target site.
- Node API errors: ensure Node.js is >= 21; some features require node:util styleText.
- Daemon issues: check status (curl localhost:19825/status) and logs (curl localhost:19825/logs).
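- The daemon check can also be scripted. The sketch below builds the status and logs URLs from the default port mentioned above and fetches them with the Python standard library; the helper names are hypothetical, and the response body is returned as raw text because its format is not described here.

```python
import urllib.request

DEFAULT_PORT = 19825  # default daemon port from the configuration section


def daemon_url(endpoint, port=DEFAULT_PORT, host="localhost"):
    """Build the URL for one of the two endpoints named in the troubleshooting list."""
    if endpoint not in ("status", "logs"):
        raise ValueError(f"unknown endpoint: {endpoint}")
    return f"http://{host}:{port}/{endpoint}"


def fetch(endpoint, port=DEFAULT_PORT, timeout=5.0):
    """Fetch an endpoint and return the body as text; raises URLError if the daemon is down."""
    with urllib.request.urlopen(daemon_url(endpoint, port), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


print(daemon_url("status"))  # http://localhost:19825/status
```

- Calling `fetch("status")` requires the daemon to be running; a connection error at that point is itself a useful diagnostic.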
- Plugins and Community Extensions
- OpenCLI can be extended with community-contributed adapters (plugins):
- opencli-plugin-github-trending: GitHub Trending repositories
- opencli-plugin-hot-digest: Multi-platform trending aggregator
- opencli-plugin-juejin: Juejin hot articles
- opencli-plugin-vk: VK wall, feed, and search
- Plugins are managed through commands like:
- opencli plugin install github:user/opencli-plugin-my-tool
- opencli plugin list
- opencli plugin update --all
- opencli plugin uninstall my-tool
- Plugins Guide
- See the Plugins Guide at docs/guide/plugins.md for creating your own plugin ecosystem.
- For AI Agents: Developer Guide and Modes
- Quick mode
- Generate a single command for a specific page URL (opencli-oneshot skill):
- Provide a URL and a one-line goal; the skill completes the process in four steps.
- Full mode
- Before writing adapter code, consult the opencli-explorer skill for a complete browser exploration workflow, authentication strategy decision tree, and debugging guide.
- Practical commands for agent workflows
- opencli explore https://example.com --site mysite
- opencli synthesize mysite
- opencli generate https://example.com --goal "hot"
- opencli cascade https://api.example.com/data
- Reusable workflow pattern
- Exploration to synthesis to generation creates a robust path from discovery to deployment, enabling agents to evolve from ad-hoc interactions to stable adapters.
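- The command sequence above can be assembled programmatically for any site. This is a minimal sketch that only builds argument lists using the subcommands and flags shown in this section; `adapter_pipeline` is a hypothetical helper, and it keeps the order in which the commands are listed here rather than prescribing when cascade should run.

```python
import shlex


def adapter_pipeline(page_url, site, goal, api_url=None):
    """Return the explore, synthesize, generate (and optional cascade) argument lists."""
    steps = [
        ["opencli", "explore", page_url, "--site", site],
        ["opencli", "synthesize", site],
        ["opencli", "generate", page_url, "--goal", goal],
    ]
    if api_url is not None:
        steps.append(["opencli", "cascade", api_url])
    return steps


# Print shell-safe command lines for the example values used above.
for argv in adapter_pipeline(
    "https://example.com", "mysite", "hot", api_url="https://api.example.com/data"
):
    print(shlex.join(argv))
```

- Building lists rather than one shell string avoids quoting problems when a goal contains spaces.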
- Testing and Troubleshooting
- Testing guidance is provided in TESTING.md, with instructions on how to run tests and how to write tests for adapters and workflows.
- Troubleshooting quick references:
- Ensure the Browser Bridge extension is installed and enabled.
- If attach failures persist, temporarily disable other extensions.
- Re-authenticate on the target site if data is empty or access is denied.
- Ensure Node.js is up-to-date (>= 21) for compatibility with modern APIs.
- Check the daemon and browser bridge connectivity using status and logs endpoints.
- Visual History and Community Momentum
- OpenCLI is an actively evolving project with community contributions and ongoing enhancements.
- Star History visualization
- The project’s momentum can be observed via a Star History chart, illustrating the growth of OpenCLI over time.
- Licensing and Ownership
- OpenCLI is licensed under the Apache-2.0 license.
- The project repository and its assets include a mix of OpenCLI core, adapters, and extension components licensed to foster collaboration and reuse.
- Summary of What OpenCLI Delivers
- A single, deterministic, and scriptable surface for automation that spans websites, live browsers, Electron apps, and local tools.
- A rich catalog of 87+ adapters plus the ability to generate new adapters via exploration and synthesis.
- A robust CLI hub that makes it easy to reuse and orchestrate external CLIs and desktop adapters.
- A strong emphasis on security with login-state reuse and credential isolation within the browser.
- An AI-ready architecture where agents can explore, synthesize, and cascade capabilities to build reliable automation pipelines.
- Flexible output formats and CI-friendly exit codes, enabling confident integration into larger automation stacks.
- Visuals and Resources
- The header badges illustrate the project’s accessibility and status:
- Documentation in Chinese
- NPM package availability
- Node.js version compatibility
- OpenCLI licensing
- A star history visualization showcases community engagement and adoption momentum.
- Final Notes
- OpenCLI presents a unified path to automate human workflows and AI agent tasks across heterogeneous surfaces—websites, browser sessions, Electron apps, and local binaries—without sacrificing reliability or reproducibility.
- By blending built-in adapters, live browser control, and generation-powered expansions, it offers a scalable approach to deterministic command execution across the entire software stack.
- For developers, users, and AI agents alike, the system provides a coherent, instrumented way to turn complex, interactive tasks into repeatable, testable CLI commands.
Repository: https://github.com/jackwener/OpenCLI