AI Builder Hub

OpenClaw + CLIProxyAPI + ProxyPal: Use Multiple AI Providers, Never Hit Token Limits Again

buildAI · 2026-03-17 · 9 min read

How to configure OpenClaw with CLIProxyAPI and ProxyPal to run Claude, GPT-4o, Gemini, and Qwen simultaneously — with round-robin load balancing, automatic failover on rate limits, and a desktop GUI for token monitoring.

One of the biggest pain points when using AI agents for daily work is token limits — mid-task, Claude or GPT-4o throws a "rate limit exceeded" error and you're stuck waiting. The solution: combine CLIProxyAPI + ProxyPal with OpenClaw to run multiple providers in parallel with automatic failover when quota is exhausted.

[Diagram: OpenClaw + CLIProxyAPI + ProxyPal architecture, load balancing across multiple AI providers]

OpenClaw → CLIProxyAPI → Claude / GPT-4o / Gemini / DeepSeek / Qwen — automatically switches provider when quota is hit


The 3 Components

1. OpenClaw 🦞

Local-first AI personal agent — runs tasks, connects to Telegram/Discord, manages files, browses the web. Read the OpenClaw setup guide if you haven't installed it yet.

2. CLIProxyAPI

A proxy server that wraps CLI AI models (Claude Code, Gemini CLI, OpenAI Codex, Qwen Code) and exposes them as API endpoints compatible with the OpenAI/Gemini/Claude format. No separate API keys needed — it uses your existing subscriptions.

Key features:

  • Wraps CLI agents → standard OpenAI API format
  • Round-robin load balancing across multiple accounts
  • Auto failover when a provider hits rate limits
  • OAuth support (no raw API key exposure)

3. ProxyPal

A desktop GUI for managing CLIProxyAPI: add providers, view token usage, and monitor request logs without touching the CLI.

Features:

  • Manage subscriptions: Claude, ChatGPT, Gemini, GitHub Copilot
  • GitHub Copilot Bridge
  • Antigravity Support
  • Usage analytics + real-time token monitoring
  • Auto-detects and configures installed CLI agents

How It Works

You message Telegram → OpenClaw receives → sends request → CLIProxyAPI
                                                               │
                                   ┌───────────────────────────┤
                                   ▼                           ▼
                             Claude Code CLI              GPT-4o CLI
                             (primary)                   (fallback 1)
                                   │ rate limited?
                                   ▼
                             Gemini CLI                  Qwen Code CLI
                             (fallback 2)                (fallback 3)

CLIProxyAPI routing logic:

  1. Sends request to primary provider (Claude)
  2. If rate limited → cooldown 1 min → 5 min → 25 min → 1 hour
  3. Automatically switches to next fallback in the list
  4. If billing issue → 5-hour backoff, doubling up to 24 hours
  5. OpenClaw receives the result — unaware of which provider handled it
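
The fallback walk in steps 1-3 can be sketched in a few lines of shell (a toy model only: provider names come from the diagram above, and the real selection logic lives inside CLIProxyAPI):

```shell
# Walk the provider list in order and route to the first one
# that is not currently cooling down.
providers="claude gpt-4o gemini qwen"
cooling="claude"                     # pretend the primary just got rate limited

selected=""
for p in $providers; do
  case " $cooling " in
    *" $p "*) continue ;;            # skip any provider in cooldown
  esac
  selected=$p
  break
done
echo "routing request to: $selected"
```

With the primary in cooldown, the request lands on the first fallback; once the cooldown expires, `cooling` empties and traffic returns to the primary.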

Installation

Step 1: Install CLIProxyAPI

npm install -g cliproxyapi@latest
# or with pnpm:
pnpm add -g cliproxyapi@latest

Step 2: Install the CLI Agents You Have Subscriptions For

# Claude Code
npm install -g @anthropic-ai/claude-code

# Gemini CLI (Google)
npm install -g @google/gemini-cli

# OpenAI Codex CLI
npm install -g @openai/codex

Step 3: Install ProxyPal (Desktop GUI)

Download ProxyPal from proxypal.app — available for macOS, Windows, and Linux.

Once opened:

  1. ProxyPal auto-detects installed CLI agents
  2. Add your subscription credentials
  3. Toggle each provider on/off

Step 4: Start the CLIProxyAPI Server

# Start server (default port 4141):
cliproxyapi start --port 4141

# Or run in background:
cliproxyapi start --port 4141 --daemon

Verify the server is running:

curl http://localhost:4141/v1/models
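
Once `/v1/models` responds, you can exercise the OpenAI-compatible surface directly. A sketch, assuming CLIProxyAPI serves the standard `/v1/chat/completions` path and that the model name matches one listed by `/v1/models`:

```shell
# Send a minimal OpenAI-format chat request through the proxy.
# Endpoint path and model name are assumptions; check /v1/models first.
curl -s http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "messages": [{"role": "user", "content": "Say hello in five words."}]
  }'
```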

Configure OpenClaw to Use CLIProxyAPI

Edit ~/.openclaw/openclaw.json:

{
  "agent": {
    "model": {
      "primary": "anthropic/claude-opus-4-6",
      "fallbacks": [
        "openai/gpt-4o",
        "google-antigravity/gemini-2.5-pro",
        "openai/gpt-4o-mini"
      ]
    }
  },
  "providers": {
    "anthropic": {
      "baseUrl": "http://localhost:4141/anthropic"
    },
    "openai": {
      "baseUrl": "http://localhost:4141/openai"
    },
    "google-antigravity": {
      "baseUrl": "http://localhost:4141/gemini"
    }
  }
}

What this does:

  • primary → model used first for every request
  • fallbacks → ordered list of fallback providers
  • baseUrl → points to CLIProxyAPI instead of cloud APIs directly
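
A quick way to sanity-check the wiring is to grep the config for proxy-bound `baseUrl` entries. The sketch below writes the example config to a temp file so it is self-contained; in practice, point the grep at `~/.openclaw/openclaw.json`:

```shell
# Verify every provider baseUrl routes through the local proxy.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  "providers": {
    "anthropic":          { "baseUrl": "http://localhost:4141/anthropic" },
    "openai":             { "baseUrl": "http://localhost:4141/openai" },
    "google-antigravity": { "baseUrl": "http://localhost:4141/gemini" }
  }
}
EOF
count=$(grep -c 'http://localhost:4141' "$cfg")
echo "providers routed through proxy: $count"
rm -f "$cfg"
```

If the count is lower than the number of providers you configured, one of them is still calling the cloud API directly.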

Advanced Failover Configuration

Adjust Cooldown Timings

{
  "auth": {
    "cooldowns": {
      "billingBackoffHours": 5,
      "billingMaxHours": 24,
      "failureWindowHours": 24
    }
  }
}

Key                   Default   Meaning
billingBackoffHours   5h        Initial wait time after a billing failure
billingMaxHours       24h       Maximum wait time
failureWindowHours    24h       Resets the error counter if no failure occurs in this window
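
The billing backoff can be sketched as doubling with a cap. The start and cap values come from the table above; the exact doubling rule is an assumption based on "5-hour backoff, doubling up to 24 hours":

```shell
# Compute the wait (in hours) after each successive billing failure:
# start at billingBackoffHours, double each time, cap at billingMaxHours.
backoff=5
max=24
schedule=$backoff
for attempt in 2 3 4; do
  backoff=$((backoff * 2))
  if [ "$backoff" -gt "$max" ]; then backoff=$max; fi
  schedule="$schedule $backoff"
done
echo "backoff schedule (hours): $schedule"
```

So with the defaults, repeated billing failures wait 5h, 10h, 20h, then 24h for every failure after that.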

Multi-Account for the Same Provider

If you have two Claude accounts, CLIProxyAPI round-robins requests between them:

Edit ~/.openclaw/agents/<agentId>/agent/auth-profiles.json:

{
  "profiles": {
    "anthropic:account1@gmail.com": {
      "type": "oauth",
      "provider": "anthropic",
      "email": "account1@gmail.com"
    },
    "anthropic:account2@gmail.com": {
      "type": "oauth",
      "provider": "anthropic",
      "email": "account2@gmail.com"
    }
  }
}

OpenClaw will:

  1. Use account1 first (oldest last-used first)
  2. When account1 hits rate limit → cooldown → switch to account2
  3. When both are rate limited → switch to next model fallback
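
Step 1's "oldest last-used first" pick can be sketched as a sort on timestamps. The timestamps here are illustrative; OpenClaw tracks last-use internally rather than in auth-profiles.json:

```shell
# Pick the account profile with the oldest (smallest) last-used epoch.
# Format: "<email> <last_used_epoch>", one profile per line.
profiles="account1@gmail.com 1700003600
account2@gmail.com 1700000000"

pick=$(printf '%s\n' "$profiles" | sort -k2 -n | head -n1 | cut -d' ' -f1)
echo "next account: $pick"
```

Sorting by timestamp and taking the head gives the least-recently-used account, which is what spreads load evenly across both subscriptions.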

Monitoring via ProxyPal GUI

Once set up, open ProxyPal to see:

  • Dashboard: request count and token usage in real-time
  • Provider Status: green (active), yellow (cooldown), red (billing issue)
  • Request Logs: which provider handled each request
  • Usage Analytics: token consumption by day/week/model

No CLI commands needed — everything visible through the desktop GUI.


Real-World Use Case: "Always-On Agent"

Scenario: You have Claude Pro ($20/month) and ChatGPT Plus ($20/month). Instead of getting blocked when one hits its quota during peak hours, this setup automatically distributes load:

Morning (8am-12pm): Claude handles most requests
12pm: Claude hits rate limit
            ↓ auto failover
12pm-2pm: GPT-4o takes over
2pm: Claude cooldown resets
            ↓ back to Claude
2pm-6pm: Round-robin between both providers

Result: zero downtime, the agent always responds, and your effective token capacity is nearly doubled.


Important Notes

  • CLIProxyAPI requires the corresponding CLI agents to already be logged in (OAuth session active)
  • Do not expose CLIProxyAPI to the internet — keep it on localhost or access it via Tailscale
  • If using ProxyPal, ensure the CLIProxyAPI daemon is running before OpenClaw starts
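
That last point can be automated with a small startup guard, a hypothetical helper (not a documented feature of either tool) that polls the proxy's model list until it answers before launching OpenClaw:

```shell
# Poll a URL until it responds with success, or give up after N tries.
wait_for_proxy() {
  url=$1 tries=$2 i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -sf "$url" >/dev/null 2>&1; then
      return 0                        # proxy is up
    fi
    i=$((i + 1))
    sleep 1                           # wait between retries
  done
  return 1                            # proxy never answered
}

# Usage sketch (the openclaw start command is an assumption from this guide):
# wait_for_proxy http://localhost:4141/v1/models 30 && openclaw start
```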

Resources: