Bee Cell
bee-cell · 128K context · cutoff 2026-03
Input
$0.15 / 1M
Output
$0.60 / 1M
Best for
- Solo developers
- Mobile / on-device chat
- Local-first workflows
Capabilities
Bee Models
Cell → Brood → Comb → Buzz → Hive → Swarm → Enclave. Each tier serves a defined role across capability, usage, governance, and deployment control — from public access to sovereign deployment. Bee Cell is live today; the higher tiers land behind the governed-release pipeline as each clears the eval harness. Each tier ships on a curated open-weight production base; specific bases are auditable via the Enclave deployment manifest under NDA. Bee Ignite is internal R&D, not customer-selectable.
Live serverless inference, OpenAI Chat Completions compatible. The Bee Cell production base ships on a curated open-weight Apache-2.0 release under our governed release policy; specific base disclosure is contractual via the Enclave deployment manifest.
bee-cell · 128K context · cutoff 2026-03
Input
$0.15 / 1M
Output
$0.60 / 1M
Best for
Capabilities
Roadmap tiers are in final training and validation. Each tier lands behind its own Live badge once the trained checkpoint clears the eval harness. Evidence and validation criteria are published at /trust.
bee-brood · 256K context · target pricing $0.30 / $1.20 per 1M tok
base · Bee Brood reasoning base
Premium reasoning tier. Depends on the governed-release pipeline that lands each capability behind the eval harness.
bee-comb · 256K context · target pricing $0.50 / $1.50 per 1M tok
base · Bee Comb production base
Builder + production workflow tier. Specialised reasoning for coding, automation, API applications, and technical workflows. First domain adapter is in final validation.
bee-buzz · 256K context · target pricing $1.00 / $3.00 per 1M tok
base · Bee Buzz agent base
Team + agent workflow tier. Sized for collaboration, tool use, agents, pooled usage, internal tools, and operational workflows.
bee-hive · 256K context · target pricing $2.00 / $8.00 per 1M tok
base · Bee Hive multi-base ensemble
Enterprise specialist intelligence tier. High-capability specialised reasoning for enterprise analysis, technical depth, regulated-domain preparation, and knowledge-intensive workflows.
bee-swarm · 1M context · target pricing $5.00 / $15.00 per 1M tok
base · Bee Swarm routed fabric (frontier-class)
High-assurance advanced reasoning tier. Premium intelligence for complex reasoning, research workflows, multi-agent coordination, and mission-critical analysis.
Under the hood
The seven customer tiers (Cell, Brood, Comb, Buzz, Hive, Swarm, Enclave) share the same engine design. The numbers below describe the architectural design target. Bee Cell ships today on a curated open-weight base under our governed release policy while the rest of the ladder lands tier by tier through the eval-gated rollout.
Capabilities
Difficulty-scored routing between local execution and frontier-teacher escalation. Scoring blends keyword complexity, query length, conversation depth, code/math detection, and a per-domain multiplier.
source · bee/adaptive_router.py
The evolution loop generates candidate neural modules — attention variants, SSM discretisations, compression codecs, memory protocols — runs them through a sandboxed eval, and only accepts winners. Implemented in bee/evolution.py + bee/invention_engine.py; not currently running against the live Bee Cell deployment.
source · bee/evolution.py · bee/invention_engine.py
When a request needs code, the self-coding module is designed to write it, execute in a sandbox, read the error, and iterate. Activates on roadmap tiers as each clears the eval harness.
source · bee/self_coding.py · bee/self_heal.py
Document upload → chunk → embed → FAISS index → cite. End-to-end retrieval is built into every tier; pass document IDs in the chat request and Bee handles the rest.
source · bee/rag* · bee/data_engine.py
Real qiskit-ibm-runtime integration to IBM Heron r2 hardware, with a local statevector simulator fallback. The integration is real code in the repo — every API call is NOT routed through quantum today; it activates per-request on roadmap tiers as they ship.
source · bee/quantum_reasoning.py · bee/quantum_ibm.py
LoRA adapters specialise the base model for low-cost fine-tuning per domain. Adapters are released through the governed-release pipeline once each clears the eval harness — every released adapter has a published validation record at /trust.
source · Per-adapter validation records published at /trust
Internal eval suite
The Cell base (google/gemma-4-E4B-it) on our 40-task internal suite, run on Apple Silicon (MPS). Every score traces to the committed raw prompts + outputs in data/eval_reports/report.json — reproduce with python -m bee.eval_harness --device mps. A small internal suite on the base model (pre-adapter), not a comparative public benchmark — the raw outputs even show where the strict grader is over-strict.
Overall
70.0%
| Benchmark | Score | Passed | Avg latency |
|---|---|---|---|
| Coding | 60% | 6 / 10 | 6740 ms |
| Reasoning | 60% | 6 / 10 | 449 ms |
| Instruction following | 90% | 9 / 10 | 1445 ms |
| Grounded factual | 40% | 2 / 5 | 718 ms |
| Domain (specialised) | 100% | 5 / 5 | 2168 ms |
source · bee/eval_harness.py · data/eval_reports/report.json · google/gemma-4-E4B-it (7941M params) · 100.8s
API compatibility
Bee speaks the same wire protocol as OpenAI's /chat/completions. Switch the base URL and the API key — your existing client code keeps working. Tools, JSON mode, structured outputs, and streaming all supported.
bee-enclave · contracted · cutoff 2026-03
Run any Hive- or Swarm-class workload in your private VPC, regulated, or air-gapped environment. Same models, different deployment posture.
bee-ignite · CUI Labs internal · not commercially available
Bee Ignite is the experimental Bee-native architecture: MoE, SSM memory, neural compression, distillation, and quantum-assisted modules. Findings backflow into production tiers but Ignite itself is not user-selectable.