
Cut AI API costs.
Extend conversations.
One URL change.

Drop-in proxy for OpenAI, Anthropic, Google, xAI, DeepSeek and Meta. Session-aware compression that reduces token costs as conversations grow.

export OPENAI_BASE_URL=https://api.lexisaas.com/v1
# That's it. Your existing code just works.
33 models supported
7 providers, one endpoint
$0.50 per 1M tokens, flat
1 line to integrate

Get started in minutes

No SDK. No code rewrite. Just one line.

Create an account

Sign up and get your API key. 5 million tokens free. No credit card required.

Change one line

Point your client to api.lexisaas.com/v1 — that's it.

Save on every call

Compression kicks in automatically. The longer the conversation, the more you save.

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.lexisaas.com/v1",
    api_key="your-lexi-key"
)

# Everything else stays exactly the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)

JavaScript / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.lexisaas.com/v1",
    apiKey: "your-lexi-key"
});

// Your existing code. No changes.
const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages
});

Calculate your savings

See exactly what LEXI saves you.

Calculator presets: 1M / 50M / 500M tokens per month; compression rates 50% / 60% / 90%. Compression increases with conversation length.

Example:

Without LEXI: $125.00 / month
With LEXI: $75.00 / month
You save: $50.00 / month (40% savings)

LEXI fee: $0.50 per 1M tokens processed. Compression rate scales with conversation length.
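The arithmetic behind the calculator can be sketched in a few lines. Only the $0.50-per-1M LEXI fee comes from the pricing above; the $2.50/1M provider rate is an assumption for illustration, and we assume the provider bills only the tokens left after compression while the fee applies to all tokens processed.

```python
LEXI_FEE = 0.50  # $ per 1M tokens processed (from the pricing above)

def cost_without(tokens_m, provider_rate):
    """Monthly cost in dollars when paying the provider directly."""
    return tokens_m * provider_rate

def cost_with(tokens_m, provider_rate, compression):
    """Monthly cost through LEXI: the provider is billed only for the
    tokens remaining after compression, plus the flat LEXI fee."""
    provider = tokens_m * (1 - compression) * provider_rate
    fee = tokens_m * LEXI_FEE
    return provider + fee

# 50M tokens/month at an assumed $2.50/1M provider rate, 60% compression
print(cost_without(50, 2.50))     # 125.0
print(cost_with(50, 2.50, 0.60))  # 75.0
```

At these assumed numbers the sketch reproduces the example figures above: $125 direct, $75 through LEXI.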

Simple, transparent pricing

$0.50 per 1M tokens. 5 million tokens free to start.

$0.50 per 1M tokens. One flat rate. No tiers. No overage fees.

5M tokens free on signup. $2.50 credit. No credit card required.

Bring your own keys. Use your existing API keys from any provider. Your keys are never stored and are used only to proxy your requests.

Rate limits grow with you

Spend more, get more. Automatically.

Lifetime spend       Rate limit
$0 (free credits)    120 req/min
$5+                  2,000 req/min
$50+                 10,000 req/min
$500+                30,000 req/min
$1,000+              60,000 req/min

Need more? Contact us for custom limits.
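If a burst exceeds your tier's ceiling, standard client-side backoff applies. A minimal sketch, assuming the proxy signals rate limiting with HTTP 429 and that your client raises exceptions carrying a `status_code` attribute (as the OpenAI SDKs do); the helper name and parameters are illustrative:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate-limit errors with exponential backoff plus jitter.
    Assumes raised exceptions expose a `status_code` attribute; adapt the
    check to whatever client library you use."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that isn't a 429, or if retries are exhausted.
            if getattr(exc, "status_code", None) != 429 or attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
```

Usage: `with_backoff(lambda: client.chat.completions.create(model="gpt-4o", messages=messages))`.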

How it works under the hood

Not a prompt hack. A proprietary compression architecture that learns your conversation and removes redundancy in real time.

Session-Aware Compression

Each session builds a semantic profile of your conversation. The system identifies what the model already knows and removes redundancy, sending only what matters for each turn.
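The actual architecture is proprietary; purely as an illustration of the idea, a session profile that remembers what has already been sent and strips exact repeats from later turns might look like this (all names hypothetical):

```python
class SessionProfile:
    """Toy sketch of session-aware redundancy removal: track sentences
    already sent this session and drop exact repeats from later turns."""

    def __init__(self):
        self.seen = set()

    def compress(self, text):
        kept = []
        for sentence in text.split(". "):
            key = sentence.strip().lower()
            # Keep only sentences the session has not seen before.
            if key and key not in self.seen:
                self.seen.add(key)
                kept.append(sentence)
        return ". ".join(kept)
```

A real system would match meaning rather than exact strings, but the shape is the same: later turns send only what is new.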

Adaptive Activation

Compression only activates when it saves tokens. The system compares compressed vs. original on every turn. If compression wouldn't help, your request passes through unmodified.
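In pseudocode, that pass-through rule amounts to a simple comparison. This is a sketch, not the actual implementation; `count_tokens` here is a crude word-count stand-in for a real tokenizer:

```python
def count_tokens(messages):
    # Crude stand-in for a real tokenizer: whitespace word count.
    return sum(len(m["content"].split()) for m in messages)

def choose_payload(messages, compress):
    """Send the compressed form only when it is actually smaller;
    otherwise pass the request through unmodified."""
    compressed = compress(messages)
    if count_tokens(compressed) < count_tokens(messages):
        return compressed
    return messages
```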

Constant Resource Usage

Memory and compute stay flat regardless of conversation length or session count. Your costs are predictable and bounded. No surprises at scale.

Intelligent Context Recall

A proprietary indexing system retrieves relevant context from past interactions instantly. The system learns what matters over time. Context persists across sessions.
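The indexing system itself is proprietary; as a rough analogy only, cross-session recall can be pictured as an inverted index that returns the past snippets sharing the most terms with the current query (all names hypothetical):

```python
from collections import defaultdict

class ContextIndex:
    """Toy sketch of cross-session recall: a keyword inverted index
    that surfaces the stored snippets most relevant to a query."""

    def __init__(self):
        self.index = defaultdict(set)  # word -> snippet ids
        self.snippets = []

    def add(self, text):
        sid = len(self.snippets)
        self.snippets.append(text)
        for word in set(text.lower().split()):
            self.index[word].add(sid)

    def recall(self, query, k=3):
        # Score each snippet by how many query words it shares.
        scores = defaultdict(int)
        for word in query.lower().split():
            for sid in self.index[word]:
                scores[sid] += 1
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        return [self.snippets[sid] for sid in ranked]
```

A production system would use semantic rather than keyword matching, but the persistence-and-retrieve shape is the same.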

Compression scales with conversation length

From internal testing. Actual compression varies by message content and topic.

Start saving on AI costs today

5 million tokens free on signup. No credit card required. One URL change and your existing code just works.