## Quick Start
Reduce AI API costs with two lines of configuration. Savings grow with conversation length. No code changes, no SDK, no vendor lock-in.
1. **Sign up and get your LEXI API key.** Create an account at lexisaas.com and copy your API key from the dashboard.

2. **Set your base URL.**

   ```bash
   export OPENAI_BASE_URL=https://api.lexisaas.com/v1
   ```

3. **Set your API key as a composite key.** Combine your LEXI key and provider key into a single `api_key` parameter:

   ```
   lx_live_your_key_here:sk-your-provider-key
   ```

That's it. It works with any OpenAI-compatible client.
LEXI is a drop-in proxy: set the base URL and API key in Cursor, Continue, Chatbox, or any OpenAI-compatible SDK, and everything else stays the same. No custom headers are required.
## Authentication
LEXI uses a single composite API key that combines your LEXI key and provider key. This works with any OpenAI-compatible client — no custom headers needed.
### Composite Key Format

Join your LEXI key and provider key with a colon, and pass the result as the standard `api_key` parameter or `Authorization` header:

```
Authorization: Bearer lx_live_your_key_here:sk-your-provider-key
```
### Server-Side (Provider Key from Env Var)

If your provider key is available via the `OPENAI_API_KEY` environment variable, you can send just the LEXI key without a colon; LEXI falls back to the environment variable for the provider key.

```
Authorization: Bearer lx_live_your_key_here
```
LEXI never stores your LLM API key: it is held in memory only for the duration of the request and used solely to forward your request to the upstream provider. Anthropic-native clients can use the `x-api-key` header with the same composite format.
## Python (OpenAI)

A drop-in replacement using the official `openai` Python package: set the base URL and use the composite key as your API key.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lexisaas.com/v1",
    api_key="lx_live_your_key_here:sk-your-openai-key"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(response.choices[0].message.content)
```

Requires `openai >= 1.0`. Install with `pip install openai`.
## Node.js (OpenAI)

Works with the official `openai` npm package: set `baseURL` and use the composite key as your API key.

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_your_key_here:sk-your-openai-key'
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello, world!' }]
});
console.log(response.choices[0].message.content);
```

Requires `openai >= 4.0`. Install with `npm install openai`.
## cURL

Test directly from your terminal. Pass the composite key in the `Authorization` header:

```bash
curl https://api.lexisaas.com/v1/chat/completions \
  -H "Authorization: Bearer lx_live_your_key_here:sk-your-openai-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'
```
## Anthropic

Use the official Anthropic Python SDK with a custom base URL. The composite key goes in the `api_key` parameter:

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.lexisaas.com/v1",
    api_key="lx_live_your_key_here:sk-ant-your-anthropic-key",
)

message = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(message.content[0].text)
```

The same setup works in Node.js with the official `@anthropic-ai/sdk` package:

```javascript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_your_key_here:sk-ant-your-anthropic-key',
});

const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello, world!' }]
});
console.log(message.content[0].text);
```

The Anthropic SDK sends the key via the `x-api-key` header; LEXI accepts the composite format in both `Authorization` and `x-api-key` headers automatically.
## Google Gemini, xAI Grok, DeepSeek, Meta

LEXI auto-detects the provider from the model name, so use the model identifiers you normally would. All providers share the same OpenAI-compatible endpoint, and your provider API key is embedded in the composite key after the colon.

Google Gemini, via the `openai` Python package:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lexisaas.com/v1",
    api_key="lx_live_your_key_here:your-google-api-key"
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",  # Auto-routes to Google
    messages=[{"role": "user", "content": "Hello!"}]
)
```

xAI Grok, via cURL:

```bash
curl https://api.lexisaas.com/v1/chat/completions \
  -H "Authorization: Bearer lx_live_your_key_here:xai-your-grok-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-3","messages":[{"role":"user","content":"Hello"}]}'
```

DeepSeek, via the `openai` npm package:

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_your_key_here:your-deepseek-key'
});

const response = await client.chat.completions.create({
  model: 'deepseek-chat',  // Auto-routes to DeepSeek
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
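For intuition, prefix-based auto-detection can be sketched like this. This is an illustration only; LEXI's actual routing logic is internal to the service:

```python
# Illustrative prefix table; not LEXI's real routing implementation.
PREFIX_TO_PROVIDER = {
    "gpt-": "openai",
    "o3": "openai",
    "o4": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "grok-": "xai",
    "deepseek-": "deepseek",
    "llama-": "meta",
}

def route(model: str) -> str:
    """Pick an upstream provider from a model-name prefix."""
    for prefix, provider in PREFIX_TO_PROVIDER.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unrecognized model name: {model!r}")
```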
## Response Headers

LEXI adds diagnostic headers to every response so you can monitor compression performance and debug issues.

| Header | Description |
|---|---|
| `X-Lexi-Tokens-Saved` | Number of tokens saved by STONE compression on this request. |
| `X-Lexi-Compression-Ratio` | Compression ratio achieved, from 0.0 (no compression) to 1.0 (maximum). |
| `X-Lexi-Fallback` | Set to `"true"` if compression was bypassed and the raw request was forwarded. |
| `X-Request-Id` | Unique request identifier. Include this when contacting support. |
| `X-RateLimit-Limit` | Your rate limit cap (requests per minute) for the current billing tier. |
| `X-RateLimit-Remaining` | Requests remaining in the current rate limit window. |
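With the `openai` Python SDK, the raw headers are reachable through its raw-response interface (`client.chat.completions.with_raw_response.create(...)` exposes `response.headers`). A small helper like the hypothetical `lexi_stats` below turns them into typed values:

```python
def lexi_stats(headers: dict[str, str]) -> dict[str, object]:
    """Extract LEXI diagnostics from a response-header mapping."""
    return {
        "tokens_saved": int(headers.get("X-Lexi-Tokens-Saved", "0")),
        "compression_ratio": float(headers.get("X-Lexi-Compression-Ratio", "0.0")),
        "fallback": headers.get("X-Lexi-Fallback") == "true",
        "request_id": headers.get("X-Request-Id"),
    }

# Hypothetical header values, for illustration:
stats = lexi_stats({
    "X-Lexi-Tokens-Saved": "1200",
    "X-Lexi-Compression-Ratio": "0.42",
    "X-Request-Id": "req_123",
})
```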
## Rate Limits

Rate limits are applied per API key. When exceeded, requests return `429 Too Many Requests` with a `Retry-After` header.
| Lifetime Spend | Requests / min | API Keys |
|---|---|---|
| $0 (free credits) | 30 | 10 |
| $5+ | 100 | 100 |
| $50+ | 500 | 100 |
| $500+ | 2,000 | 100 |
| $1,000+ | 5,000 | 100 |
## Supported Models

LEXI works with any model available through the supported providers. Use the model name you normally would; LEXI auto-routes to the right provider based on the model name.
### OpenAI (10 models)
- gpt-5 / gpt-5.1 / gpt-5.2
- gpt-4o
- gpt-4.1 / gpt-4.1-mini / gpt-4.1-nano
- o3 / o3-mini / o4-mini
### Anthropic (3 models)
- claude-opus-4-6
- claude-sonnet-4-5-20250929
- claude-haiku-4-5-20251001
### Google Gemini (5 models)
- gemini-3-pro / gemini-3-flash
- gemini-2.5-pro / gemini-2.5-flash
- gemini-2.0-flash
### xAI Grok (4 models)
- grok-4 / grok-4.1-fast
- grok-3 / grok-3-mini
### DeepSeek (2 models)
- deepseek-chat
- deepseek-reasoner
### Meta (2 models)
- llama-4
- llama-3.3-70b
### Free passthrough on 9 models

Models priced under $0.50/1M input tokens are proxied at no LEXI fee — you only pay the provider cost. Free passthrough models: `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-4o-mini`, `gemini-3-flash`, `gemini-2.5-flash`, `gemini-2.0-flash`, `deepseek-chat`, `llama-4`, `llama-3.3-70b`.
## Frequently Asked Questions
Common questions about how LEXI Cloud works, data handling, and compatibility.
The `X-Lexi-Fallback` response header is set to `"true"` when this happens. You still get a valid response from the upstream provider; you just don't get the token savings on that particular request.