Getting Started

Quick Start

Reduce AI API costs with two lines of configuration. Savings grow with conversation length. No code changes, no SDK, no vendor lock-in.

Sign up and get your LEXI API key

Create an account at lexisaas.com and copy your API key from the dashboard.

Set your base URL

Shell
export OPENAI_BASE_URL=https://api.lexisaas.com/v1

Set your API key as a composite key

Combine your LEXI key and provider key into a single api_key parameter:

API Key
lx_live_your_key_here:sk-your-provider-key

That's it. Works with any OpenAI-compatible client.

LEXI is a drop-in proxy. Set the base URL and API key in Cursor, Continue, Chatbox, or any SDK. No custom headers, no SDK, no vendor lock-in.

Authentication

LEXI uses a single composite API key that combines your LEXI key and provider key. This works with any OpenAI-compatible client — no custom headers needed.

Composite Key Format

Combine your LEXI key and provider key with a colon. Pass the result as the standard api_key parameter or as a Bearer token in the Authorization header.

Authorization: Bearer lx_live_your_key_here:sk-your-provider-key
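The composite key is just the two keys joined by a colon. A minimal sketch (the key values are placeholders):

```python
def composite_key(lexi_key: str, provider_key: str) -> str:
    # Joins the LEXI key and the provider key with a colon,
    # the format LEXI expects in api_key or Authorization.
    return f"{lexi_key}:{provider_key}"

# Example Authorization header value built from placeholder keys.
auth_value = "Bearer " + composite_key("lx_live_your_key_here", "sk-your-provider-key")
```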

Server-Side (Provider Key from Env Var)

If your provider key is set via the OPENAI_API_KEY environment variable, you can send just the LEXI key, with no colon. LEXI falls back to the environment variable for the provider key.

Authorization: Bearer lx_live_your_key_here

LEXI never stores your LLM API key. It is held in memory only for the duration of the request and used solely to forward your request to the upstream provider. Anthropic-native clients can use x-api-key with the same composite format.
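The two header forms above can be captured in one small helper: composite when you pass a provider key, bare LEXI key when you rely on the env-var fallback described above. A sketch:

```python
from typing import Optional

def auth_header(lexi_key: str, provider_key: Optional[str] = None) -> str:
    # With a provider key: composite "lexi:provider" format.
    # Without one: the bare LEXI key, so LEXI uses its
    # OPENAI_API_KEY fallback for the provider key.
    if provider_key:
        return f"Bearer {lexi_key}:{provider_key}"
    return f"Bearer {lexi_key}"
```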

Python (OpenAI)

Drop-in replacement using the official openai Python package. Set the base URL and use the composite key as your API key.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lexisaas.com/v1",
    api_key="lx_live_your_key_here:sk-your-openai-key"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(response.choices[0].message.content)

Requires openai >= 1.0. Install with pip install openai.

Node.js (OpenAI)

Works with the official openai npm package. Set baseURL and use the composite key as your API key.

JavaScript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_your_key_here:sk-your-openai-key'
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello, world!' }]
});
console.log(response.choices[0].message.content);

Requires openai >= 4.0. Install with npm install openai.

cURL

Test directly from your terminal. Pass the composite key in the Authorization header.

Shell
curl https://api.lexisaas.com/v1/chat/completions \
  -H "Authorization: Bearer lx_live_your_key_here:sk-your-openai-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

Anthropic

Use the official Anthropic SDKs with a custom base URL. The composite key goes in the api_key parameter.

Python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.lexisaas.com/v1",
    api_key="lx_live_your_key_here:sk-ant-your-anthropic-key",
)

message = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(message.content[0].text)
JavaScript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_your_key_here:sk-ant-your-anthropic-key',
});

const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello, world!' }]
});
console.log(message.content[0].text);

The Anthropic SDK sends the key via x-api-key header. LEXI supports the composite format in both Authorization and x-api-key headers automatically.
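For clients that build Anthropic-style requests by hand, the composite key goes in x-api-key. A sketch of the headers (the anthropic-version value is Anthropic's standard requirement; whether LEXI needs it is an assumption here):

```python
def anthropic_headers(lexi_key: str, provider_key: str) -> dict:
    # x-api-key carries the same composite format LEXI accepts
    # in the Authorization header.
    composite = f"{lexi_key}:{provider_key}"
    return {
        "x-api-key": composite,
        "anthropic-version": "2023-06-01",  # standard Anthropic API version header
        "content-type": "application/json",
    }
```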

Google Gemini, xAI Grok, DeepSeek, Meta

LEXI auto-detects the provider from the model name. Just use the model name you normally would. All providers use the same OpenAI-compatible endpoint.

Python — Google Gemini
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lexisaas.com/v1",
    api_key="lx_live_your_key_here:your-google-api-key"
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",  # Auto-routes to Google
    messages=[{"role": "user", "content": "Hello!"}]
)
cURL — xAI Grok
curl https://api.lexisaas.com/v1/chat/completions \
  -H "Authorization: Bearer lx_live_your_key_here:xai-your-grok-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-3","messages":[{"role":"user","content":"Hello"}]}'
JavaScript — DeepSeek
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_your_key_here:your-deepseek-key'
});

const response = await client.chat.completions.create({
  model: 'deepseek-chat',  // Auto-routes to DeepSeek
  messages: [{ role: 'user', content: 'Hello!' }]
});

LEXI automatically detects your LLM provider from the model name. Simply use your standard model identifiers — no configuration needed. Your provider API key is embedded in the composite key after the colon.
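As an illustration of how prefix-based detection might work (this is a hypothetical sketch, not LEXI's actual routing logic):

```python
# Illustrative model-prefix table; LEXI's real detection is internal.
PROVIDER_PREFIXES = {
    "gpt-": "openai", "o3": "openai", "o4": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "grok-": "xai",
    "deepseek-": "deepseek",
    "llama-": "meta",
}

def detect_provider(model: str) -> str:
    # Return the first provider whose prefix matches the model name.
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model: {model}")
```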

Response Headers

LEXI adds diagnostic headers to every response so you can monitor compression performance and debug issues.

  • X-Lexi-Tokens-Saved: Number of tokens saved by STONE compression on this request.
  • X-Lexi-Compression-Ratio: Compression ratio achieved, from 0.0 (no compression) to 1.0 (maximum).
  • X-Lexi-Fallback: Set to "true" if compression was bypassed and the raw request was forwarded.
  • X-Request-Id: Unique request identifier. Include this when contacting support.
  • X-RateLimit-Limit: Your rate limit cap (requests per minute) for the current billing tier.
  • X-RateLimit-Remaining: Requests remaining in the current rate limit window.
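One way to surface these diagnostics in client code: read the headers off a response (shown here against a plain dict standing in for a real response's headers):

```python
def summarize_lexi_headers(headers: dict) -> str:
    # Parse LEXI's diagnostic headers; absent headers default to zero.
    saved = int(headers.get("X-Lexi-Tokens-Saved", 0))
    ratio = float(headers.get("X-Lexi-Compression-Ratio", 0.0))
    if headers.get("X-Lexi-Fallback") == "true":
        return "compression bypassed (raw request forwarded)"
    return f"saved {saved} tokens ({ratio:.0%} compression)"
```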

Rate Limits

Rate limits are applied per API key. When exceeded, requests return 429 Too Many Requests with a Retry-After header.

  • $0 (free credits): 30 requests/min, 10 API keys
  • $5+: 100 requests/min, 100 API keys
  • $50+: 500 requests/min, 100 API keys
  • $500+: 2,000 requests/min, 100 API keys
  • $1,000+: 5,000 requests/min, 100 API keys
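A simple retry loop that honors the 429 status and Retry-After delay might look like this (a sketch; `send` is a stand-in for your request function and is assumed to return a status, the Retry-After value in seconds, and the response body):

```python
import time

def with_retry(send, max_attempts: int = 3):
    # Retry on 429 Too Many Requests, sleeping for the server's
    # Retry-After value between attempts.
    for attempt in range(max_attempts):
        status, retry_after, body = send()
        if status != 429:
            return body
        if attempt < max_attempts - 1:
            time.sleep(retry_after)
    raise RuntimeError("rate limited after retries")
```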

Supported Models

LEXI works with any model available through the supported providers. Just use the model name you normally would — LEXI auto-routes to the right provider based on the model name.

OpenAI (10 models)

  • gpt-5 / gpt-5.1 / gpt-5.2
  • gpt-4o
  • gpt-4.1 / gpt-4.1-mini / gpt-4.1-nano
  • o3 / o3-mini / o4-mini

Anthropic (3 models)

  • claude-opus-4-6
  • claude-sonnet-4-5-20250929
  • claude-haiku-4-5-20251001

Google Gemini (5 models)

  • gemini-3-pro / gemini-3-flash
  • gemini-2.5-pro / gemini-2.5-flash
  • gemini-2.0-flash

xAI Grok (4 models)

  • grok-4 / grok-4.1-fast
  • grok-3 / grok-3-mini

DeepSeek (2 models)

  • deepseek-chat
  • deepseek-reasoner

Meta (2 models)

  • llama-4
  • llama-3.3-70b

Free passthrough on 9 models

Models priced under $0.50/1M input tokens are proxied at no LEXI fee — you only pay the provider cost. Free passthrough models: gpt-4.1-mini, gpt-4.1-nano, gpt-4o-mini, gemini-3-flash, gemini-2.5-flash, gemini-2.0-flash, deepseek-chat, llama-4, llama-3.3-70b.
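If you want to check programmatically whether a model incurs a LEXI fee, a membership test over the list above suffices (the set mirrors the free-passthrough models stated in this doc):

```python
# Free passthrough models, as listed above; only provider cost applies.
FREE_PASSTHROUGH = {
    "gpt-4.1-mini", "gpt-4.1-nano", "gpt-4o-mini",
    "gemini-3-flash", "gemini-2.5-flash", "gemini-2.0-flash",
    "deepseek-chat", "llama-4", "llama-3.3-70b",
}

def lexi_fee_applies(model: str) -> bool:
    # True when the model is outside the free-passthrough tier.
    return model not in FREE_PASSTHROUGH
```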

Frequently Asked Questions

Common questions about how LEXI Cloud works, data handling, and compatibility.

Does LEXI store my data?

LEXI processes your messages to compress them, but does not store conversation content beyond the active session. Sessions are tenant-isolated — no data crosses account boundaries. Once the session ends, the compressed context is discarded.

What happens if compression fails?

LEXI automatically falls back to forwarding your raw request unmodified. The X-Lexi-Fallback response header is set to "true" when this happens. You still get a valid response from the upstream provider — you just don't get the token savings on that particular request.

How is LEXI different from caching?

Caching returns stored responses for repeated queries. LEXI instead compresses the input context sent to the LLM, so you always get a fresh, unique response. This means LEXI works on novel queries, multi-turn conversations, and dynamic content where caching provides no benefit.

Do you support Azure OpenAI?

Azure OpenAI support is coming soon. If you need it urgently, reach out to us at support@lexisaas.com and we can prioritize your account.