OpenRouter CLI Tools Top 10
June 2026 Leaderboard
If your team still picks coding agents from GitHub star counts instead of OpenRouter's 7-day CLI client throughput, you are optimizing for marketing, not production. This article is for developers and platform leads evaluating terminal-first AI workflows. We analyze the week of June 2–8, 2026—where Kilo Code leads the CLI category at 1.22 trillion tokens, Claude Code holds second at 606 billion, and Hermes Agent drives 4.94 trillion as the platform's overall #1 client—and deliver a feature matrix, Mac rental hardware tiers, scenario routing table, and a five-step rented-Mac trial protocol.
Table of Contents
01 · Why billing data beats CLI hype
GitHub READMEs promise "autonomous coding." OpenRouter's CLI client rankings show what developers actually route through paid API keys when nobody is watching. A terminal agent that burns tokens on failed tool loops still counts toward the weekly leaderboard—and your invoice. That makes the CLI filter on openrouter.ai/rankings a sharper signal than any Product Hunt launch.
OpenRouter aggregates 300+ models from 60+ providers, processing on the order of 100 trillion tokens per month across more than 8 million users. When Kilo Code climbs to the CLI #1 slot, it means thousands of independent teams chose that client as their default gateway—not that a single vendor ran a promotion. This complements our earlier analysis of weekly model token rankings, which focused on which models absorb traffic; here we focus on which CLI clients generate it.
The distinction matters because model routing and client choice are separate decisions. You might route through DeepSeek V4 Flash for cost while using Kilo Code as the orchestration layer—or run Claude Code against Anthropic's native API with zero OpenRouter involvement. The CLI leaderboard captures the first pattern: clients that explicitly proxy through OpenRouter's unified endpoint and therefore appear in platform telemetry.
02 · Data source and methodology
OpenRouter publishes rolling 7-day statistics filterable by client application. We anchor this article to the window of June 2 through June 8, 2026 (UTC). The CLI category counts input plus output tokens attributed to each registered client ID in OpenRouter's application registry.
Key figures cited throughout:
- Hermes Agent: approximately 4.94 trillion tokens in the same week—ranked #1 across all OpenRouter clients, not only CLI tools. This includes Telegram bot sessions, persistent memory loops, and multi-channel gateway traffic documented in our Hermes Agent deployment guide.
- Kilo Code: approximately 1.22 trillion tokens—the highest pure terminal-coding client in the CLI-specific filter for the week.
- Claude Code: approximately 606 billion tokens—solid #2 in CLI category despite Anthropic's parallel first-party API path.
Rankings below #3 fall off quickly in absolute volume but remain strategically important because they represent different architectural bets: patch-only editors (Aider), IDE extensions (Cline), enterprise sandboxes (Goose), and vendor-native CLIs (Codex CLI, Qwen Code). Treat the board as a directional signal refreshed weekly, not a permanent hierarchy.
03 · Three adoption pain points
1. Conflating client popularity with model quality. Kilo Code's 1.22T weekly volume reflects both its UX and the default models its users select—often cost-optimized MoE routes from the model leaderboard we covered in 2026 LLM trends from OpenRouter rankings. Switching to Kilo because it tops the CLI chart without auditing your model ID is how teams accidentally route bulk cron jobs through Sonnet-tier pricing.
2. Installing three CLIs on a production Mac. Each terminal agent wants global npm permissions, shell profile hooks, and sometimes conflicting Node versions. Kilo, Cline, and Goose side-by-side on your daily-driver MacBook pollute Keychain entries, leave orphaned API keys in ~/.config, and make forensic cleanup painful when a trial goes wrong. The safer pattern—mirroring our Agent Skill isolation workflow—is to trial on a disposable macOS node, then promote one winner.
3. Ignoring hardware constraints for sandboxed agents. Goose and OpenCode spin Docker containers for tool execution. Codex CLI may invoke local sandboxes on Apple Silicon. A MacBook Air with 8 GB RAM will thrash under parallel agent loops while a Mac Studio with 64 GB runs Ollama fallback and Docker sidecars without swap pressure. Matching CLI ambition to silicon tier prevents false negatives during evaluation.
04 · CLI Top 10 for the week of June 2–8, 2026
The table below reflects OpenRouter's CLI client filter for the anchor week. Volumes are rounded; visit the live rankings page for current numbers.
| Rank | CLI Client | Weekly Tokens | Category | Notable trait |
|---|---|---|---|---|
| 1 | Kilo Code | ~1.22T | Terminal IDE | Multi-model routing; VS Code–like TUI; OpenRouter-native |
| 2 | Claude Code | ~606B | Anthropic CLI | Deep tool-use loops; strong on long refactors |
| 3 | Hermes Agent | ~4.94T* | Multi-channel agent | Platform #1 overall; Telegram, memory, skills |
| 4 | Aider | ~380B (est.) | Git-native patcher | Minimal UI; repo-map context |
| 5 | Cline | ~290B (est.) | VS Code extension + CLI | Human-in-the-loop approvals |
| 6 | Goose | ~210B (est.) | Block / enterprise | Recipe system; Docker sandbox |
| 7 | OpenCode | ~175B (est.) | Open-source terminal | Provider-agnostic; rapid iteration |
| 8 | Codex CLI | ~140B (est.) | OpenAI native | Tight GPT-5.x integration |
| 9 | Roo Code | ~95B (est.) | VS Code fork | Mode switching (architect/code/debug) |
| 10 | Qwen Code | ~80B (est.) | Alibaba native | Optimized for Qwen3 coder routes |
*Hermes Agent's 4.94T figure spans all client channels on OpenRouter, not only terminal sessions. It appears in the CLI top ten because a significant share of its traffic routes through CLI and gateway interfaces rather than browser chat alone.
The gap between #1 Kilo (1.22T) and #2 Claude Code (606B) is roughly 2×—material, but not insurmountable if Anthropic ships a pricing change or bundles Claude Code with enterprise seats. The gap between Kilo and #4 Aider (~380B estimated) is wider, suggesting the market prefers full terminal IDEs over patch-only workflows for OpenRouter-routed workloads.
05 · Feature comparison matrix
Token volume tells you who is winning today. Features tell you who fits your workflow tomorrow.
| Client | OpenRouter native | Multi-model switch | Tool / sandbox | IDE integration | Best fit |
|---|---|---|---|---|---|
| Kilo Code | Yes | Yes (hot-swap) | Shell, browser, MCP | Standalone TUI | Polyglot teams routing by task cost |
| Claude Code | Partial | Anthropic only* | Advanced tool use | Terminal + IDE plugins | Long autonomous refactors |
| Hermes Agent | Yes | Yes | Skills, cron, Telegram | Headless / gateway | Always-on ops bots |
| Aider | Yes | Yes | Git diff only | Editor agnostic | Minimalists; CI hooks |
| Cline | Yes | Yes | Browser, terminal | VS Code core | Teams wanting approval gates |
| Goose | Yes | Yes | Docker recipes | CLI + desktop | Regulated sandbox needs |
| OpenCode | Yes | Yes | Configurable | Terminal | Hackers customizing providers |
| Codex CLI | No (OpenAI direct) | GPT family | Cloud sandbox | Terminal | OpenAI-all-in shops |
| Roo Code | Yes | Yes | Multi-mode agents | VS Code fork | Structured role separation |
| Qwen Code | Partial | Qwen primary | Standard tools | Terminal | APAC Qwen deployments |
*Claude Code can reach OpenRouter via manual base-URL configuration, but most weekly volume flows through Anthropic's first-party endpoint and appears in CLI rankings only when explicitly proxied.
Two patterns emerge from the matrix. First, OpenRouter-native polyglot clients (Kilo, Hermes, Cline) dominate token share because teams mix models weekly as the model leaderboard shifts. Second, vendor-locked CLIs (Claude Code, Codex CLI, Qwen Code) trade flexibility for tighter integration—and still place top ten because enterprise procurement often mandates a single vendor relationship.
06 · Kilo Code vs Claude Code vs Hermes Agent
Kilo Code — volume king of terminal coding
At 1.22 trillion tokens for the week of June 2–8, Kilo Code is the clearest signal that developers want a terminal-native IDE with instant model switching. Its architecture treats OpenRouter as a dial: point at DeepSeek V4 Flash for inner loops, promote to Claude Sonnet 4.6 for merge-ready diffs, drop to free-tier models for spikes. That aligns with the tiered routing philosophy from our billing-truth article—except the client itself becomes the routing layer.
Kilo's weakness is operational surface area. More models means more failure modes: context overflow on one provider, tool-schema mismatch on another, rate-limit cascades when a cron job hot-swaps mid-run. Teams without budget caps have reported invoice surprises when a default model ID changed between Kilo releases.
Claude Code — quality floor at 606B tokens
606 billion weekly tokens is not #1, but it is remarkable for a client tightly coupled to Anthropic's stack. Claude Code earns its rank on long-horizon tasks: multi-file refactors, test generation across modules, and tool chains that tolerate higher latency in exchange for fewer rollback events. Platform leads often keep Claude Code as the escalation path while routing bulk work through Kilo plus cheap MoE models.
The trade-off is cost. Even when proxied through OpenRouter, Claude-tier output pricing dwarfs DeepSeek routes. A team running Claude Code as default for CI bots will show healthy CLI rankings and unhealthy finance reviews.
Hermes Agent — 4.94T and platform #1
Hermes is the outlier. Its 4.94 trillion token week—not CLI-filtered alone—reflects persistent agents that never sleep: Telegram ops bots, scheduled skill invocations, memory-consolidation passes, and gateway traffic from OpenClaw-compatible setups. Hermes behaves less like a coding IDE and more like infrastructure that happens to call LLMs continuously.
If your use case is "developer edits Swift for four hours," Hermes is the wrong comparison. If your use case is "always-on SRE bot that reads logs, opens PRs, and pings on-call," Hermes's weekly volume explains why it tops the entire OpenRouter client board while sitting #3 in the CLI-specific filter. Deployment patterns are covered in our 30-day Hermes review on rented Mac mini M4.
07 · Why CLI rankings matter in June 2026
Three structural shifts make CLI client telemetry unavoidable for platform decisions.
First, programming traffic crossed 50% of categorized OpenRouter usage in 2026—up from roughly 11% in early 2025. Terminal agents are no longer a niche; they are the majority use case. Second, client consolidation is accelerating: the gap between #1 and #10 in the CLI board is smaller in relative terms than the model board, meaning tooling choices flip faster as UX improves. Third, enterprise procurement now asks for client audit trails—which CLI touched production repos, which API keys it stored, whether it sandboxed shell execution.
Investors and competitors watch the same public rankings. When Kilo overtakes Claude Code in weekly CLI tokens, it signals a generational preference for model-agnostic terminals over vendor bundles—regardless of whether Anthropic's models still win on quality benchmarks.
08 · Mac rental hardware tiers for CLI trials
CLI agents stress CPU, RAM, and disk differently than browser chat. Match rental tier to your evaluation scope.
| Tier | Hardware | RAM | Ideal CLI workload | Limitations |
|---|---|---|---|---|
| Light | MacBook Air M2 / M3 | 16 GB | Aider, Kilo single-repo, OpenCode smoke tests | No Docker; avoid parallel agents |
| Medium | MacBook Pro M3 | 16–32 GB | Kilo + Cline, Claude Code refactors, 2 concurrent loops | Local Ollama marginal at 16 GB |
| Heavy | MacBook Pro M4 Pro / M4 Max | 32 GB+ | Goose Docker recipes, sandboxed Codex-style runs | Studio-class local models still impractical |
| Lab | Mac Studio M4 Ultra | 64 GB+ | Hermes + local Ollama fallback, parallel gateway tests | Overkill for patch-only trials |
Pricing and availability live on bare-metal macOS pricing. For SSH setup, VNC fallback, and day-rent billing mechanics, see the daily Mac rental FAQ. Most CLI shootouts complete in one to three rental days on a Mac mini M4 16 GB—convert CapEx curiosity into OpEx evidence before buying a Studio for Hermes plus Ollama.
09 · Scenario selection matrix
| Scenario | Primary CLI | Fallback | Mac tier | Why rankings agree |
|---|---|---|---|---|
| Polyglot model routing (weekly model churn) | Kilo Code | OpenCode | Medium MBP M3 32 GB | CLI #1 at 1.22T; hot-swap without restart |
| 12-hour autonomous refactor | Claude Code | Kilo + Sonnet route | Medium | 606B weekly tokens; quality escalation path |
| Always-on ops / Telegram bot | Hermes Agent | OpenClaw gateway | Lab Studio 64 GB+ | 4.94T platform #1; persistent memory |
| Minimal git-only patches | Aider | Cline | Light Air M3 16 GB | Lower token volume; fast CI integration |
| Regulated shell sandbox | Goose | Codex CLI | Heavy M4 Pro 32 GB+ | Docker isolation; enterprise audit |
| Human approval on every diff | Cline | Roo Code | Medium | IDE-native review gates |
Revisit this matrix every Monday alongside the model leaderboard. CLI clients lag model shifts by one to two weeks—Kilo's volume spike in early June followed DeepSeek V4 Flash's model-board dominance in late May. Synchronizing both boards prevents mismatched defaults.
10 · Five-step CLI trial on a rented Mac
- Rent an isolated macOS node. Book Mac mini M4 or MacBook Pro M3 through MacDate; SSH in per the daily rental FAQ. Create a local user with no production Apple ID or signing certificates.
- Install two CLI shortlist candidates. Example: Kilo Code plus Aider or Cline. Store
OPENROUTER_API_KEYin a project-scoped.env—never export to global shell profiles on this box. - Run a fixed benchmark task. Choose a 12k-token coding task with at least three tool calls (read file, edit, run tests). Execute identically on each CLI. Log prompt tokens, completion tokens, wall time, and tool failures.
- Cross-check OpenRouter CLI rankings. Open the rankings page CLI filter; confirm whether your chosen client's weekly volume trend matches your own usage intensity. Export OpenRouter usage CSV for dollar comparison.
- Archive and release. Save
cli-benchmark-YYYYMMDD.csv, revoke the test API key, uninstall CLIs, and wipe the rental per MacDate's return checklist.
# Rented Mac CLI trial — Kilo Code install (example)curl -fsSL https://get.kilo.ai | bashexport OPENROUTER_API_KEY="sk-or-..."kilo config set provider openrouterkilo config set model deepseek/deepseek-v4-flash# Fixed benchmark — same repo path for each CLIkilo "Add unit tests for AuthService.swift and run xcodebuild test"
# Pull OpenRouter key usage for weekly comparisoncurl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \ https://openrouter.ai/api/v1/auth/key | jq '.data.usage'# Snapshot CLI rankings (manual): open https://openrouter.ai/rankings?category=cli
You could run the same installs on a personal MacBook, but that path collides with production Xcode signing identities, pollutes Keychain with experimental API keys, and leaves Hermes or Goose Docker volumes consuming disk between trials. A rented Mac is a forensic clean room: if a CLI misroutes Opus pricing into a cron loop, the blast radius ends when you release the instance. If Kilo's 1.22T weekly dominance convinces you to adopt it—but Claude Code's 606B figure reflects quality you still need for merges—you can run both on separate rental days without uninstall gymnastics.
For Apple-platform teams, the rented-Mac loop also keeps pace with toolchain churn: Swift 6 strict concurrency, Xcode 26 betas, and OpenClaw gateway updates ship faster than most CLI release notes. Testing agents on the same macOS build you ship against—not a Linux VPS with mismatched file paths—surfaces real failures before they hit CI. When your benchmark CSV shows a clear winner, promote one CLI to your team standard; until then, treat the OpenRouter board as the prior and your rental harness as the posterior.