Docs
Build with Bee.
Bee exposes an OpenAI-compatible Chat Completions API. Drop-in compatible with any client that already speaks OpenAI; switch the base URL and key, and you're shipping.
Quickstart
Three steps. Sign up, mint a key, send your first request.
- 1. Open a workspace — free tier, no card required.
- 2. Generate an API key under Settings → API keys.
- 3. Send your first request:
curl https://api.bee.cuilabs.io/chat/completions \
-H "Authorization: Bearer $BEE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "bee-comb",
"messages": [
{ "role": "user", "content": "Explain post-quantum cryptography in one paragraph." }
]
}'
Authentication
Bearer tokens. Pass your API key in the Authorization header.
Authorization: Bearer sk-bee-...
Public-tier requests run over standard TLS 1.3. Bee Enclave Sovereign customer transport runs over the QNSP post-quantum stack (FIPS 203 ML-KEM + FIPS 204 ML-DSA); see /security for the rollout posture.
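For clients that don't shell out to curl, the same bearer-token header can be set from Python's standard library. This is a minimal sketch, not an official SDK: the endpoint and model name are taken from the Quickstart, while the helper names `build_chat_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the Quickstart curl example.
BEE_CHAT_URL = "https://api.bee.cuilabs.io/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat request carrying the bearer token."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        BEE_CHAT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (hypothetical helper)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

To use it, read the key from the BEE_API_KEY environment variable, pass it to `build_chat_request` with a model such as "bee-comb", and hand the result to `send`. Keeping the key in the environment rather than in source avoids committing it by accident.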
Chat completions
Same shape as OpenAI's /chat/completions; an example request and response follow. Tools, JSON mode, and structured outputs are supported on all production models.
{
"model": "bee-hive",
"messages": [
{ "role": "system", "content": "You are a senior security engineer." },
{ "role": "user", "content": "Audit this Kyber key exchange." }
],
"temperature": 0.7,
"max_tokens": 1024,
"stream": false,
"tools": [/* function-calling spec */],
"response_format": { "type": "json_object" }
}
{
"id": "cmpl_01HZ...",
"object": "chat.completion",
"model": "bee-hive",
"choices": [
{
"index": 0,
"message": { "role": "assistant", "content": "..." },
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 42,
"completion_tokens": 287,
"total_tokens": 329
}
}
Models
Six production tiers. The adaptive router picks the right size automatically when you specify model: "bee"; pin a tier explicitly to control cost and latency.
- bee-cell · 128K context · $0.15 / $0.60
- bee-brood · 256K context · $0.30 / $1.20
- bee-comb · 256K context · $0.50 / $1.50
- bee-buzz · 256K context · $1.00 / $3.00
- bee-hive · 256K context · $2.00 / $8.00
- bee-swarm · 1M context · $5.00 / $15.00
Prices are USD per 1M tokens (input / output). Full lineup on the models page.
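As a worked example of the pricing arithmetic, the sketch below estimates the cost of one request from the `usage` object in a chat completion response. The price table is copied from the tier list above; the function name `estimate_cost_usd` is hypothetical, and the result is an estimate rather than a billing figure.

```python
# USD per 1M tokens as (input, output), copied from the tier list.
PRICES_PER_MILLION = {
    "bee-cell": (0.15, 0.60),
    "bee-brood": (0.30, 1.20),
    "bee-comb": (0.50, 1.50),
    "bee-buzz": (1.00, 3.00),
    "bee-hive": (2.00, 8.00),
    "bee-swarm": (5.00, 15.00),
}


def estimate_cost_usd(model: str, usage: dict) -> float:
    """Estimate the cost of one request from its token usage."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = (
        usage["prompt_tokens"] * input_price
        + usage["completion_tokens"] * output_price
    )
    return total / 1_000_000


# Usage numbers from the sample chat completion response.
sample_usage = {"prompt_tokens": 42, "completion_tokens": 287, "total_tokens": 329}
cost = estimate_cost_usd("bee-hive", sample_usage)
```

For the sample response on bee-hive this works out to (42 × 2.00 + 287 × 8.00) / 1,000,000 ≈ $0.00238.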
RAG (retrieval)
Upload documents and let Bee retrieve relevant chunks at inference time. Vector store is FAISS; embedding model is all-MiniLM-L6-v2 (384-dim).
# Upload (multipart)
curl https://api.bee.cuilabs.io/documents \
-H "Authorization: Bearer $BEE_API_KEY" \
-F "file=@whitepaper.pdf"
# Retrieve at inference time by passing document IDs in the "documents" field of the chat request:
{
"model": "bee-hive",
"messages": [...],
"documents": ["doc_abc123", "doc_def456"]
}
Streaming
Set stream: true. Responses arrive as Server-Sent Events with the same delta format OpenAI uses.
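A stream in this format can be reassembled by reading `data:` lines until the `[DONE]` sentinel. The sketch below is one way to do that over any iterable of text lines; the function name `iter_content_deltas` is hypothetical, and it assumes each event carries a single-line JSON payload, as in the sample events in this section.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by a stream of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample_stream = [
    'data: {"choices":[{"delta":{"content":"Post-"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"quantum"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":" cryptography"}}]}',
    "",
    "data: [DONE]",
]
full_text = "".join(iter_content_deltas(sample_stream))
```

On the wire, the events look like this: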
data: {"choices":[{"delta":{"content":"Post-"}}]}
data: {"choices":[{"delta":{"content":"quantum"}}]}
data: {"choices":[{"delta":{"content":" cryptography"}}]}
data: [DONE]
Errors
Standard HTTP status codes. Body is { error: { code, message } }.
- 400 — invalid request (schema)
- 401 — missing or invalid API key
- 402 — pool exhausted on hard-limited plan
- 429 — rate limited; honour Retry-After
- 500 — engine fault; safe to retry
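The status codes above suggest a simple retry policy: wait and retry on 429 and 500, and fail fast on the rest. The sketch below encodes that decision as a pure function. The name `retry_delay_seconds`, the exponential backoff, and the 30-second cap are assumptions for illustration, as is reading Retry-After as a whole number of seconds (the header may also carry an HTTP date, which this sketch does not parse).

```python
from typing import Mapping, Optional

MAX_BACKOFF_SECONDS = 30.0  # hypothetical cap, not a documented Bee value


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], attempt: int
) -> Optional[float]:
    """Return how long to wait before retrying, or None to give up.

    `attempt` is the zero-based count of retries already made.
    `headers` is looked up with the exact key "Retry-After".
    """
    backoff = min(2.0 ** attempt, MAX_BACKOFF_SECONDS)
    if status == 429:
        retry_after = headers.get("Retry-After", "").strip()
        if retry_after.isdigit():
            return float(retry_after)  # honour the server's hint
        return backoff  # header missing or not in seconds form
    if status == 500:
        return backoff  # engine fault; documented as safe to retry
    return None  # 400, 401, 402: retrying will not help
```

A caller would sleep for the returned number of seconds and resend, stopping after a fixed number of attempts so that a persistent fault does not loop forever.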
Rate limits
Per-key, per-minute. Limits scale with plan tier; see pricing for the per-plan token allowance.
Every response includes:
X-Bee-RateLimit-Limit: 1000
X-Bee-RateLimit-Remaining: 873
X-Bee-RateLimit-Reset: 1714867200
X-Bee-Pool-Remaining: 9842917
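These headers can be read into a small structure so a client can throttle itself before it hits a 429. This is a sketch under one assumption the page does not state: that X-Bee-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests. The names `RateLimitStatus` and `parse_rate_limit` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class RateLimitStatus:
    limit: int            # requests allowed in the current window
    remaining: int        # requests left in the current window
    reset_at: datetime    # when the window resets (UTC)
    pool_remaining: int   # value of X-Bee-Pool-Remaining


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    """Parse the X-Bee-* rate-limit headers from a response."""
    return RateLimitStatus(
        limit=int(headers["X-Bee-RateLimit-Limit"]),
        remaining=int(headers["X-Bee-RateLimit-Remaining"]),
        # Assumption: the reset value is Unix time in seconds.
        reset_at=datetime.fromtimestamp(
            int(headers["X-Bee-RateLimit-Reset"]), tz=timezone.utc
        ),
        pool_remaining=int(headers["X-Bee-Pool-Remaining"]),
    )


status = parse_rate_limit({
    "X-Bee-RateLimit-Limit": "1000",
    "X-Bee-RateLimit-Remaining": "873",
    "X-Bee-RateLimit-Reset": "1714867200",
    "X-Bee-Pool-Remaining": "9842917",
})
```

A client could, for example, pause until `status.reset_at` once `status.remaining` reaches zero.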
More docs are on the way. SDK reference (Python, Node, Rust, Go), tool-use cookbook, and migration guides from OpenAI / Anthropic ship as we cut their APIs over.
Need something now? Ask support.