Data Analysis 2026-06-08

OpenRouter CLI Tools Top 10
June 2026 Leaderboard

If your team still picks coding agents by GitHub star count rather than OpenRouter's 7-day CLI client throughput, you are optimizing for marketing, not production. This article is for developers and platform leads evaluating terminal-first AI workflows. We analyze the week of June 2–8, 2026, in which Kilo Code leads the CLI category at 1.22 trillion tokens, Claude Code holds second at 606 billion, and Hermes Agent drives 4.94 trillion as the platform's overall #1 client. From that data we derive a feature matrix, Mac rental hardware tiers, a scenario routing table, and a five-step rented-Mac trial protocol.

Figure: OpenRouter CLI tools weekly token ranking chart showing Kilo Code, Claude Code, and Hermes Agent usage.

01 · Why billing data beats CLI hype

GitHub READMEs promise "autonomous coding." OpenRouter's CLI client rankings show what developers actually route through paid API keys when nobody is watching. A terminal agent that burns tokens on failed tool loops still counts toward the weekly leaderboard—and your invoice. That makes the CLI filter on openrouter.ai/rankings a sharper signal than any Product Hunt launch.

OpenRouter aggregates 300+ models from 60+ providers, processing on the order of 100 trillion tokens per month across more than 8 million users. When Kilo Code climbs to the CLI #1 slot, it means thousands of independent teams chose that client as their default gateway—not that a single vendor ran a promotion. This complements our earlier analysis of weekly model token rankings, which focused on which models absorb traffic; here we focus on which CLI clients generate it.

The distinction matters because model routing and client choice are separate decisions. You might route through DeepSeek V4 Flash for cost while using Kilo Code as the orchestration layer—or run Claude Code against Anthropic's native API with zero OpenRouter involvement. The CLI leaderboard captures the first pattern: clients that explicitly proxy through OpenRouter's unified endpoint and therefore appear in platform telemetry.

02 · Data source and methodology

OpenRouter publishes rolling 7-day statistics filterable by client application. We anchor this article to the window of June 2 through June 8, 2026 (UTC). The CLI category counts input plus output tokens attributed to each registered client ID in OpenRouter's application registry.
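
The counting rule described above (input plus output tokens, grouped by client ID) can be sketched in a few lines of shell. This is only an illustration: the log format, the `sample_log` and `rank_clients` helpers, and the `client-a`/`client-b` rows are hypothetical, and OpenRouter's real attribution happens server-side.

```shell
# Hypothetical per-request log: client,input_tokens,output_tokens
# (illustrative rows, not real OpenRouter telemetry).
sample_log() {
  cat <<'EOF'
client,input_tokens,output_tokens
client-a,1200,300
client-b,800,200
client-a,500,1000
EOF
}

# Sum input + output tokens per client, largest total first.
rank_clients() {
  awk -F, 'NR > 1 { total[$1] += $2 + $3 }
           END { for (c in total) printf "%s,%d\n", c, total[c] }' |
    sort -t, -k2,2nr
}

sample_log | rank_clients
```

With the sample rows, `client-a` totals 3000 tokens and ranks ahead of `client-b` at 1000.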

Key figures cited throughout:

  • Hermes Agent: approximately 4.94 trillion tokens in the same week—ranked #1 across all OpenRouter clients, not only CLI tools. This includes Telegram bot sessions, persistent memory loops, and multi-channel gateway traffic documented in our Hermes Agent deployment guide.
  • Kilo Code: approximately 1.22 trillion tokens—the highest pure terminal-coding client in the CLI-specific filter for the week.
  • Claude Code: approximately 606 billion tokens, a solid #2 in the CLI category despite Anthropic's parallel first-party API path.

Rankings below #3 fall off quickly in absolute volume but remain strategically important because they represent different architectural bets: patch-only editors (Aider), IDE extensions (Cline), enterprise sandboxes (Goose), and vendor-native CLIs (Codex CLI, Qwen Code). Treat the board as a directional signal refreshed weekly, not a permanent hierarchy.

03 · Three adoption pain points

1. Conflating client popularity with model quality. Kilo Code's 1.22T weekly volume reflects both its UX and the default models its users select—often cost-optimized MoE routes from the model leaderboard we covered in 2026 LLM trends from OpenRouter rankings. Switching to Kilo because it tops the CLI chart without auditing your model ID is how teams accidentally route bulk cron jobs through Sonnet-tier pricing.

2. Installing three CLIs on a production Mac. Each terminal agent wants global npm permissions, shell profile hooks, and sometimes conflicting Node versions. Kilo, Cline, and Goose side-by-side on your daily-driver MacBook pollute Keychain entries, leave orphaned API keys in ~/.config, and make forensic cleanup painful when a trial goes wrong. The safer pattern—mirroring our Agent Skill isolation workflow—is to trial on a disposable macOS node, then promote one winner.

3. Ignoring hardware constraints for sandboxed agents. Goose and OpenCode spin Docker containers for tool execution. Codex CLI may invoke local sandboxes on Apple Silicon. A MacBook Air with 8 GB RAM will thrash under parallel agent loops while a Mac Studio with 64 GB runs Ollama fallback and Docker sidecars without swap pressure. Matching CLI ambition to silicon tier prevents false negatives during evaluation.

04 · CLI Top 10 for the week of June 2–8, 2026

The table below reflects OpenRouter's CLI client filter for the anchor week. Volumes are rounded; visit the live rankings page for current numbers.

| Rank | CLI Client | Weekly Tokens | Category | Notable trait |
|------|------------|---------------|----------|---------------|
| 1 | Kilo Code | ~1.22T | Terminal IDE | Multi-model routing; VS Code–like TUI; OpenRouter-native |
| 2 | Claude Code | ~606B | Anthropic CLI | Deep tool-use loops; strong on long refactors |
| 3 | Hermes Agent | ~4.94T* | Multi-channel agent | Platform #1 overall; Telegram, memory, skills |
| 4 | Aider | ~380B (est.) | Git-native patcher | Minimal UI; repo-map context |
| 5 | Cline | ~290B (est.) | VS Code extension + CLI | Human-in-the-loop approvals |
| 6 | Goose | ~210B (est.) | Block / enterprise | Recipe system; Docker sandbox |
| 7 | OpenCode | ~175B (est.) | Open-source terminal | Provider-agnostic; rapid iteration |
| 8 | Codex CLI | ~140B (est.) | OpenAI native | Tight GPT-5.x integration |
| 9 | Roo Code | ~95B (est.) | VS Code fork | Mode switching (architect/code/debug) |
| 10 | Qwen Code | ~80B (est.) | Alibaba native | Optimized for Qwen3 coder routes |

*Hermes Agent's 4.94T figure spans all client channels on OpenRouter, not only terminal sessions. It appears in the CLI top ten because a significant share of its traffic routes through CLI and gateway interfaces rather than browser chat alone.

The gap between #1 Kilo (1.22T) and #2 Claude Code (606B) is roughly 2x: material, but not insurmountable if Anthropic ships a pricing change or bundles Claude Code with enterprise seats. The gap between Kilo and #4 Aider (~380B estimated) is wider, at roughly 3.2x, suggesting the market prefers full terminal IDEs over patch-only workflows for OpenRouter-routed workloads.
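
The gap arithmetic is easy to re-run as the board refreshes. A minimal sketch, using the rounded volumes from the table above expressed in billions of tokens; `volume_ratio` is a helper name introduced here for illustration.

```shell
# Ratio between two weekly volumes given in billions of tokens.
volume_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Rounded figures from the table above, in billions of tokens.
kilo=1220; claude=606; aider=380

echo "Kilo vs Claude Code:  $(volume_ratio "$kilo" "$claude")x"
echo "Kilo vs Aider (est.): $(volume_ratio "$kilo" "$aider")x"
```

For the anchor week this prints 2.01x and 3.21x. Swap in the live numbers before drawing conclusions, since the Aider figure is an estimate.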

05 · Feature comparison matrix

Token volume tells you who is winning today. Features tell you who fits your workflow tomorrow.

| Client | OpenRouter native | Multi-model switch | Tool / sandbox | IDE integration | Best fit |
|--------|-------------------|--------------------|----------------|-----------------|----------|
| Kilo Code | Yes | Yes (hot-swap) | Shell, browser, MCP | Standalone TUI | Polyglot teams routing by task cost |
| Claude Code | Partial | Anthropic only* | Advanced tool use | Terminal + IDE plugins | Long autonomous refactors |
| Hermes Agent | Yes | Yes | Skills, cron, Telegram | Headless / gateway | Always-on ops bots |
| Aider | Yes | Yes | Git diff only | Editor agnostic | Minimalists; CI hooks |
| Cline | Yes | Yes | Browser, terminal | VS Code core | Teams wanting approval gates |
| Goose | Yes | Yes | Docker recipes | CLI + desktop | Regulated sandbox needs |
| OpenCode | Yes | Yes | Configurable | Terminal | Hackers customizing providers |
| Codex CLI | No (OpenAI direct) | GPT family | Cloud sandbox | Terminal | OpenAI-all-in shops |
| Roo Code | Yes | Yes | Multi-mode agents | VS Code fork | Structured role separation |
| Qwen Code | Partial | Qwen primary | Standard tools | Terminal | APAC Qwen deployments |

*Claude Code can reach OpenRouter via manual base-URL configuration, but most weekly volume flows through Anthropic's first-party endpoint and appears in CLI rankings only when explicitly proxied.

Two patterns emerge from the matrix. First, OpenRouter-native polyglot clients (Kilo, Hermes, Cline) dominate token share because teams mix models weekly as the model leaderboard shifts. Second, vendor-locked CLIs (Claude Code, Codex CLI, Qwen Code) trade flexibility for tighter integration—and still place top ten because enterprise procurement often mandates a single vendor relationship.

06 · Kilo Code vs Claude Code vs Hermes Agent

Kilo Code — volume king of terminal coding

At 1.22 trillion tokens for the week of June 2–8, Kilo Code is the clearest signal that developers want a terminal-native IDE with instant model switching. Its architecture treats OpenRouter as a dial: point at DeepSeek V4 Flash for inner loops, promote to Claude Sonnet 4.6 for merge-ready diffs, drop to free-tier models for spikes. That aligns with the tiered routing philosophy from our billing-truth article—except the client itself becomes the routing layer.
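
The "dial" idea can be made explicit as a small lookup that a wrapper script consults before launching the client. This is a sketch under assumptions: only `deepseek/deepseek-v4-flash` appears elsewhere in this article, while the Sonnet and free-tier IDs below are placeholders to replace with the exact IDs from the OpenRouter model list, and the tier names are invented for illustration.

```shell
# Map a task tier to a model ID. Only the DeepSeek ID appears in this
# article; the other two IDs are hypothetical placeholders.
model_for_task() {
  case "$1" in
    inner-loop) echo "deepseek/deepseek-v4-flash" ;;
    merge)      echo "anthropic/claude-sonnet-4.6" ;;   # assumed ID
    spike)      echo "example/free-tier-model:free" ;;  # placeholder ID
    *)          echo "unknown task tier: $1" >&2; return 1 ;;
  esac
}

model_for_task inner-loop
model_for_task merge
```

Keeping the mapping in one function means a default-model change shows up as a reviewable diff rather than a surprise on the invoice.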

Kilo's weakness is operational surface area. More models means more failure modes: context overflow on one provider, tool-schema mismatch on another, rate-limit cascades when a cron job hot-swaps mid-run. Teams without budget caps have reported invoice surprises when a default model ID changed between Kilo releases.
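
One way to guard against that kind of surprise is a spend check that runs before each cron job. The sketch below classifies usage against a cap; the `budget_status` name, the 80% warning threshold, and the cap value of 25 are assumptions chosen for illustration, and the usage figure is expected in the same unit as the cap.

```shell
# Classify spend against a cap (same unit for both arguments).
# Exit status is non-zero only when the cap is reached or exceeded.
budget_status() {
  awk -v used="$1" -v cap="$2" 'BEGIN {
    if (cap <= 0)   { print "no-cap"; exit 0 }
    pct = used / cap * 100
    if (pct >= 100) { print "over";   exit 1 }
    if (pct >= 80)  { print "warn";   exit 0 }   # 80% is an assumed threshold
    print "ok"
  }'
}

# Example wiring (not executed here): read usage from the OpenRouter key
# endpoint and stop scheduled work when the cap is hit.
#   used=$(curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
#     https://openrouter.ai/api/v1/auth/key | jq '.data.usage')
#   budget_status "$used" 25 || echo "cap reached: pause cron jobs"

budget_status 12.50 25
budget_status 21 25
```

The two demo calls print `ok` and `warn`. A call such as `budget_status 30 25` prints `over` and returns a failing exit status, which is what lets a cron wrapper bail out.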

Claude Code — quality floor at 606B tokens

606 billion weekly tokens is not #1, but it is remarkable for a client tightly coupled to Anthropic's stack. Claude Code earns its rank on long-horizon tasks: multi-file refactors, test generation across modules, and tool chains that tolerate higher latency in exchange for fewer rollback events. Platform leads often keep Claude Code as the escalation path while routing bulk work through Kilo plus cheap MoE models.

The trade-off is cost. Even when proxied through OpenRouter, Claude-tier output pricing dwarfs DeepSeek routes. A team running Claude Code as default for CI bots will show healthy CLI rankings and unhealthy finance reviews.
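
To put a number on that trade-off, a per-run cost estimate only needs token counts and per-million prices. In this sketch the `run_cost` helper is hypothetical and every price is a made-up placeholder, not a quoted OpenRouter rate; substitute the current prices for the models you actually route to.

```shell
# Cost in USD for one run. Tokens are raw counts; prices are USD per
# million tokens. All prices below are made-up placeholders.
run_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Same workload (9k prompt + 3k completion tokens) on two price tiers.
echo "cheap route:   \$$(run_cost 9000 3000 0.10 0.40)"
echo "premium route: \$$(run_cost 9000 3000 3.00 15.00)"
```

With these placeholder prices the same 12k-token run costs $0.0021 on the cheap route and $0.0720 on the premium one. Multiplied across CI bots that fire on every commit, that spread is what finance reviews notice.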

Hermes Agent — 4.94T and platform #1

Hermes is the outlier. Its 4.94 trillion token week—not CLI-filtered alone—reflects persistent agents that never sleep: Telegram ops bots, scheduled skill invocations, memory-consolidation passes, and gateway traffic from OpenClaw-compatible setups. Hermes behaves less like a coding IDE and more like infrastructure that happens to call LLMs continuously.

If your use case is "developer edits Swift for four hours," Hermes is the wrong comparison. If your use case is "always-on SRE bot that reads logs, opens PRs, and pings on-call," Hermes's weekly volume explains why it tops the entire OpenRouter client board while sitting #3 in the CLI-specific filter. Deployment patterns are covered in our 30-day Hermes review on rented Mac mini M4.

07 · Why CLI rankings matter in June 2026

Three structural shifts make CLI client telemetry unavoidable for platform decisions.

First, programming traffic crossed 50% of categorized OpenRouter usage in 2026—up from roughly 11% in early 2025. Terminal agents are no longer a niche; they are the majority use case. Second, client consolidation is accelerating: the gap between #1 and #10 in the CLI board is smaller in relative terms than the model board, meaning tooling choices flip faster as UX improves. Third, enterprise procurement now asks for client audit trails—which CLI touched production repos, which API keys it stored, whether it sandboxed shell execution.

Investors and competitors watch the same public rankings. When Kilo overtakes Claude Code in weekly CLI tokens, it signals a generational preference for model-agnostic terminals over vendor bundles—regardless of whether Anthropic's models still win on quality benchmarks.

08 · Mac rental hardware tiers for CLI trials

CLI agents stress CPU, RAM, and disk differently than browser chat. Match rental tier to your evaluation scope.

| Tier | Hardware | RAM | Ideal CLI workload | Limitations |
|------|----------|-----|--------------------|-------------|
| Light | MacBook Air M2 / M3 | 16 GB | Aider, Kilo single-repo, OpenCode smoke tests | No Docker; avoid parallel agents |
| Medium | MacBook Pro M3 | 16–32 GB | Kilo + Cline, Claude Code refactors, 2 concurrent loops | Local Ollama marginal at 16 GB |
| Heavy | MacBook Pro M4 Pro / M4 Max | 32 GB+ | Goose Docker recipes, sandboxed Codex-style runs | Studio-class local models still impractical |
| Lab | Mac Studio M4 Ultra | 64 GB+ | Hermes + local Ollama fallback, parallel gateway tests | Overkill for patch-only trials |

Pricing and availability are listed on the bare-metal macOS pricing page. For SSH setup, VNC fallback, and day-rent billing mechanics, see the daily Mac rental FAQ. Most CLI shootouts complete in one to three rental days on a Mac mini M4 with 16 GB, which converts CapEx curiosity into OpEx evidence before you buy a Studio for Hermes plus Ollama.
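
The tier table can be reduced to a rough decision rule. The sketch below is one possible reading of it, not an official sizing guide: the `suggest_tier` name, its three inputs, and the order of the checks are assumptions.

```shell
# Suggest a rental tier from the table above.
#   $1 = needs a Docker sandbox (0/1)
#   $2 = number of concurrent agent loops
#   $3 = wants a local Ollama fallback (0/1)
suggest_tier() {
  docker="$1"; loops="$2"; ollama="$3"
  if [ "$ollama" -eq 1 ]; then echo "Lab"
  elif [ "$docker" -eq 1 ]; then echo "Heavy"
  elif [ "$loops" -ge 2 ]; then echo "Medium"
  else echo "Light"
  fi
}

suggest_tier 0 1 0   # single-repo patching, no sandbox
suggest_tier 1 2 0   # Docker recipes with two loops
```

The two calls print `Light` and `Heavy`. Checking the most demanding requirement first keeps the rule from under-sizing a trial, which is the failure mode that produces false negatives.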

09 · Scenario selection matrix

| Scenario | Primary CLI | Fallback | Mac tier | Why rankings agree |
|----------|-------------|----------|----------|--------------------|
| Polyglot model routing (weekly model churn) | Kilo Code | OpenCode | Medium (MBP M3, 32 GB) | CLI #1 at 1.22T; hot-swap without restart |
| 12-hour autonomous refactor | Claude Code | Kilo + Sonnet route | Medium | 606B weekly tokens; quality escalation path |
| Always-on ops / Telegram bot | Hermes Agent | OpenClaw gateway | Lab (Studio, 64 GB+) | 4.94T platform #1; persistent memory |
| Minimal git-only patches | Aider | Cline | Light (Air M3, 16 GB) | Lower token volume; fast CI integration |
| Regulated shell sandbox | Goose | Codex CLI | Heavy (M4 Pro, 32 GB+) | Docker isolation; enterprise audit |
| Human approval on every diff | Cline | Roo Code | Medium | IDE-native review gates |

Revisit this matrix every Monday alongside the model leaderboard. CLI clients lag model shifts by one to two weeks—Kilo's volume spike in early June followed DeepSeek V4 Flash's model-board dominance in late May. Synchronizing both boards prevents mismatched defaults.

10 · Five-step CLI trial on a rented Mac

  1. Rent an isolated macOS node. Book Mac mini M4 or MacBook Pro M3 through MacDate; SSH in per the daily rental FAQ. Create a local user with no production Apple ID or signing certificates.
  2. Install two CLI shortlist candidates. Example: Kilo Code plus Aider or Cline. Store OPENROUTER_API_KEY in a project-scoped .env—never export to global shell profiles on this box.
  3. Run a fixed benchmark task. Choose a 12k-token coding task with at least three tool calls (read file, edit, run tests). Execute identically on each CLI. Log prompt tokens, completion tokens, wall time, and tool failures.
  4. Cross-check OpenRouter CLI rankings. Open the rankings page CLI filter; confirm whether your chosen client's weekly volume trend matches your own usage intensity. Export OpenRouter usage CSV for dollar comparison.
  5. Archive and release. Save cli-benchmark-YYYYMMDD.csv, revoke the test API key, uninstall CLIs, and wipe the rental per MacDate's return checklist.

Example session for steps 2 through 4, using Kilo Code:

```shell
# Rented Mac CLI trial: Kilo Code install (example)
curl -fsSL https://get.kilo.ai | bash

# Session-scoped only; keep the key out of ~/.zshrc and other global profiles
export OPENROUTER_API_KEY="sk-or-..."
kilo config set provider openrouter
kilo config set model deepseek/deepseek-v4-flash

# Fixed benchmark: same repo path for each CLI
kilo "Add unit tests for AuthService.swift and run xcodebuild test"

# Pull OpenRouter key usage for weekly comparison
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  https://openrouter.ai/api/v1/auth/key | jq '.data.usage'

# Snapshot CLI rankings (manual): open https://openrouter.ai/rankings?category=cli
```
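
Step 3 asks for prompt tokens, completion tokens, wall time, and tool failures per run, and step 5 archives them as cli-benchmark-YYYYMMDD.csv. A minimal logging sketch follows. The `log_run` helper, the column names, and the sample numbers are assumptions for illustration, and the real values have to be copied from each CLI's own session summary.

```shell
# Append one benchmark run to a CSV, writing the header only once.
# Columns follow step 3: tokens, wall time, and tool failures.
log_run() {
  csv="$1"; cli="$2"; prompt="$3"; completion="$4"; secs="$5"; fails="$6"
  if [ ! -s "$csv" ]; then
    echo "cli,prompt_tokens,completion_tokens,wall_seconds,tool_failures" > "$csv"
  fi
  echo "$cli,$prompt,$completion,$secs,$fails" >> "$csv"
}

# Written to the temp directory here; point this at your project folder
# on the rental node instead.
csv="${TMPDIR:-/tmp}/cli-benchmark-$(date +%Y%m%d).csv"

# Illustrative numbers only, not measured results.
log_run "$csv" kilo  9100 2900 412 1
log_run "$csv" aider 8800 3100 367 0
cat "$csv"
```

Using one file per day keeps runs from different rental sessions separate, and the fixed column order makes the files easy to diff or load into a spreadsheet.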

You could run the same installs on a personal MacBook, but that path collides with production Xcode signing identities, pollutes Keychain with experimental API keys, and leaves Hermes or Goose Docker volumes consuming disk between trials. A rented Mac is a forensic clean room: if a CLI misroutes Opus pricing into a cron loop, the blast radius ends when you release the instance. If Kilo's 1.22T weekly dominance convinces you to adopt it—but Claude Code's 606B figure reflects quality you still need for merges—you can run both on separate rental days without uninstall gymnastics.

For Apple-platform teams, the rented-Mac loop also keeps pace with toolchain churn: Swift 6 strict concurrency, Xcode 26 betas, and OpenClaw gateway updates ship faster than most CLI release notes. Testing agents on the same macOS build you ship against—not a Linux VPS with mismatched file paths—surfaces real failures before they hit CI. When your benchmark CSV shows a clear winner, promote one CLI to your team standard; until then, treat the OpenRouter board as the prior and your rental harness as the posterior.
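
What counts as a "clear winner" is a team decision. One simple rule, sketched below, prefers the CLI with the fewest tool failures and breaks ties on total tokens. The rule, the `pick_winner` name, the CSV column order, and the `cli-a`/`cli-b`/`cli-c` rows are all assumptions for illustration.

```shell
# Pick the CLI with the fewest tool failures; break ties on total tokens.
# Assumes one row per CLI and this column order:
#   cli,prompt_tokens,completion_tokens,wall_seconds,tool_failures
pick_winner() {
  awk -F, 'NR > 1 { print $5 "," ($2 + $3) "," $1 }' "$1" |
    sort -t, -k1,1n -k2,2n | head -n 1 | cut -d, -f3
}

# Illustrative data only, not measured results.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
cli,prompt_tokens,completion_tokens,wall_seconds,tool_failures
cli-a,9100,2900,412,1
cli-b,8800,3100,367,0
cli-c,9500,3500,390,0
EOF

pick_winner "$sample"
rm -f "$sample"
```

Here `cli-b` wins: it ties `cli-c` on zero failures and uses fewer tokens. Teams that care more about wall time or dollar cost can change the sort keys accordingly.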
