Part of the Koder AI platform

Meet Zoo

The habitat for AI models. Publish, version and discover model weights, datasets and interactive demos — with on-demand inference, sandboxed Spaces and signed leaderboards, all in one place.

# Push a model, pull weights, run inference
from koder_zoo import Client

zoo = Client.login()

# Publish a model with weights and a card
zoo.models.push(
  "acme/helios-7b",
  weights="./out/weights.safetensors",
  card="./MODEL_CARD.md",
)

# Pull someone else's model
model = zoo.models.pull("koder/aurora-base")

# Run inference on the hosted endpoint
for chunk in zoo.inference.run(
  "koder/aurora-base",
  prompt="Describe a monstera leaf",
  stream=True,
):
  print(chunk.text, end="")

Everything That Lives in the Zoo

Weights, datasets, cards, demos, inference endpoints and evaluations — six first-class inhabitants sharing the same registry, access control and storage layer.

Models

Versioned model weights with rich metadata, signed provenance and content-addressed storage. Push via the Git LFS protocol or the Python SDK.

safetensors · gguf · onnx

Datasets

Schema-aware datasets with train/val/test splits, row-level preview and streaming download. Reuse the same storage engine as models.

parquet · splits · streaming
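The dataset SDK surface isn't shown on this page, so here is a minimal stdlib sketch of what split-aware, batched row streaming amounts to — the real client streams parquet row groups over HTTP, but the shape of the iteration is the same. All names here are illustrative.

```python
# Sketch: stream one split of a dataset in fixed-size batches,
# never materialising the whole split in memory.
# (Illustrative only; the real SDK streams parquet over HTTP.)
from typing import Dict, Iterator, List


def stream_rows(splits: Dict[str, List[dict]], split: str,
                batch_size: int = 2) -> Iterator[List[dict]]:
    """Yield rows of one split in batches of `batch_size`."""
    rows = splits[split]
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]


splits = {
    "train": [{"text": "a"}, {"text": "b"}, {"text": "c"}],
    "val": [{"text": "d"}],
}
batches = list(stream_rows(splits, "train"))
```

Batching at the transport layer is what lets a notebook iterate a multi-gigabyte split without downloading it first.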

Model Cards

Structured docs covering intended use, limitations, training data, evaluation metrics and ethical considerations — rendered from markdown with frontmatter.

markdown · schema · signed
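"Markdown with frontmatter" means a card file carries a YAML metadata block between `---` fences, followed by the rendered body. A small sketch of that split — the card content below is made up, and the real renderer parses the YAML rather than just separating it:

```python
# Sketch: separate a model card's YAML frontmatter from its markdown body.
# Card content is illustrative.
CARD = """---
license: apache-2.0
tags: [text-generation, en]
---
# Helios 7B

Intended use: research on long-context summarisation.
"""


def split_frontmatter(card: str):
    """Return (frontmatter, body) for a '---'-delimited card."""
    _, meta, body = card.split("---", 2)
    return meta.strip(), body.strip()


meta, body = split_frontmatter(CARD)
```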

Spaces

Interactive demos running in sandboxed Firecracker microVMs — one permanent URL per Space, cold-started on request and frozen while idle.

Firecracker · sandboxed · per-URL
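The lifecycle described above — one permanent URL, frozen while idle, cold-started by the next request — reduces to a small state machine. A sketch with illustrative states and a hypothetical URL; the actual scheduler manages Firecracker snapshots, not Python objects:

```python
# Sketch of the Space lifecycle: frozen while idle, woken on request.
# States and URL are illustrative, not the scheduler's real model.
class Space:
    def __init__(self, url: str):
        self.url = url
        self.state = "frozen"

    def request(self) -> str:
        # a hit on the permanent URL cold-starts a frozen Space
        if self.state == "frozen":
            self.state = "running"
        return self.state

    def idle(self) -> None:
        # the scheduler freezes the microVM after an idle period
        self.state = "frozen"


space = Space("https://spaces.example/acme/demo")
first = space.request()
space.idle()
```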

Inference Endpoints

On-demand, streaming and batched inference for any hosted model. Integrated with gateway quotas and usage-based billing out of the box.

streaming · batched · quotas

Leaderboards

Task-scoped benchmarks computed by Koder Eval and published as signed attestations. Rank models per metric across common tasks.

MMLU · HumanEval · attested
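"Rank models per metric" is a filter-and-sort over attested scores. A sketch with made-up numbers — these are not real benchmark results, and the real leaderboard reads signed attestations rather than a list of dicts:

```python
# Sketch: rank models for one metric, best score first.
# Scores are illustrative, not real benchmark results.
scores = [
    {"model": "acme/helios-7b", "metric": "MMLU", "score": 0.61},
    {"model": "koder/aurora-base", "metric": "MMLU", "score": 0.67},
    {"model": "acme/helios-7b", "metric": "HumanEval", "score": 0.38},
]


def rank(scores, metric):
    """Entries for one metric, sorted descending by score."""
    rows = [s for s in scores if s["metric"] == metric]
    return sorted(rows, key=lambda s: s["score"], reverse=True)


top = rank(scores, "MMLU")[0]["model"]
```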

Built for Serious AI Workloads

A registry that scales from a single notebook experiment to tenant-wide production models — without changing tools.

Git LFS Protocol

Push and pull weights with the familiar Git LFS protocol. Any existing tooling that speaks LFS works unchanged.

Content-Addressed

Every blob is addressed by its hash, deduplicated across versions and tenants. Rollbacks are instant.
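Content addressing is simple to state in code: a blob's storage key is the hash of its bytes, so pushing identical content twice stores it once and a version rollback is just a pointer change. A stdlib sketch of that idea — it mirrors the dedup behaviour, not the registry's actual storage code:

```python
# Sketch of content-addressed storage: blobs keyed by SHA-256,
# so identical content is stored exactly once.
import hashlib

store: dict = {}


def put(blob: bytes) -> str:
    """Store a blob under its hash; re-pushing the same bytes is a no-op."""
    digest = hashlib.sha256(blob).hexdigest()
    store.setdefault(digest, blob)
    return digest


a = put(b"weights-v1")
b = put(b"weights-v1")  # same content, same address, no new storage
```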

SSO by Koder ID

Access control is delegated to Koder ID. Public, private and org-scoped repositories with fine-grained roles.

Signed Provenance

Every push carries a signed attestation — who uploaded what, when, from which runtime. Tamper-evident by design.
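Tamper evidence means the manifest and its signature must agree: change any field and verification fails. The platform's attestations are presumably asymmetric signatures; the HMAC below is a stand-in chosen so the sketch is self-contained, and the key and manifest fields are illustrative.

```python
# Tamper-evidence sketch: sign a push manifest, then verify it.
# HMAC stands in for the platform's real signature scheme.
import hashlib
import hmac
import json

KEY = b"illustrative-signing-key"


def sign(manifest: dict) -> str:
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(KEY, payload, hashlib.sha256).hexdigest()


def verify(manifest: dict, signature: str) -> bool:
    # constant-time comparison avoids timing leaks
    return hmac.compare_digest(sign(manifest), signature)


manifest = {"repo": "acme/helios-7b", "version": "v0.3.0",
            "uploader": "dev@acme.example"}
sig = sign(manifest)
ok = verify(manifest, sig)
tampered = verify({**manifest, "version": "v9.9.9"}, sig)
```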

S3-Compatible

Blobs live in a MinIO cluster you can back up, replicate and scale independently from metadata.

Quotas & Billing

Inference usage flows straight into Koder Billing. Per-tenant quotas enforced by the gateway in front of every endpoint.
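At its core, gateway-side quota enforcement is an admit-or-reject check per tenant before the request is routed, with admitted requests metered for billing. A sketch of that check — limits, tenant names, and the fixed window are illustrative, and the real gateway enforces this per time window, not over a process lifetime:

```python
# Sketch: per-tenant quota check of the kind a gateway applies
# before routing an inference request. Numbers are illustrative.
from collections import Counter

LIMITS = {"acme": 2}  # requests allowed per window (illustrative)
used = Counter()      # requests already admitted this window


def admit(tenant: str) -> bool:
    """True if the tenant is under quota; meter the request if so."""
    if used[tenant] >= LIMITS.get(tenant, 0):
        return False
    used[tenant] += 1  # this count is what flows into billing
    return True


results = [admit("acme"), admit("acme"), admit("acme")]
```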

Sandboxed Spaces

Each Space runs in its own Firecracker microVM, with strict CPU, memory and egress limits.

Evaluation-Aware

Leaderboard results come from Koder Eval as signed attestations, not self-reported numbers.

Push, Pull, Inference — in Three Lines

Everything in Koder Zoo is scriptable. The SDK and the Git LFS protocol give you two paths to the same store.

Publish a model with one call

Upload weights, the model card and a manifest in a single operation. Koder Zoo hashes the blobs, deduplicates against existing versions and signs the manifest.

  • Safetensors, GGUF and ONNX supported natively
  • Card rendered automatically from markdown
  • Signed manifests with platform attestations
  • Automatic content deduplication across versions
# Publish a model
from koder_zoo import Client

zoo = Client.login()

zoo.models.push(
  repo="acme/helios-7b",
  version="v0.3.0",
  weights=[
    "out/weights-00001-of-00002.safetensors",
    "out/weights-00002-of-00002.safetensors",
  ],
  card="MODEL_CARD.md",
  tags=["text-generation", "en", "apache-2.0"],
)

Stream inference with quotas

Run inference on any hosted endpoint. Requests go through the Koder AI gateway, which enforces quotas and bills the calling tenant automatically.

  • Streaming and batched inference
  • Per-tenant quotas and rate limits
  • Usage metering sent to Koder Billing
  • Same API across every hosted model
# Stream tokens from a hosted model
for chunk in zoo.inference.stream(
  model="koder/aurora-base",
  prompt="Write a haiku about monstera leaves",
  max_tokens=128,
  temperature=0.7,
):
  print(chunk.text, end="", flush=True)

Layered Architecture

Every layer is independently scalable. Metadata in KDB, blobs in MinIO, Spaces in Firecracker microVMs, inference routed by the Koder AI gateway.

Console  ·  Flutter Web · Catalog · Model Cards · Playground
API Layer  ·  Registry · Datasets · Spaces Scheduler · Inference Router
Protocol  ·  Git LFS · REST · gRPC streaming · OIDC (Koder ID)
Storage  ·  KDB (metadata) · MinIO (blobs) · KDB time-series (usage)
Runtime  ·  Firecracker microVMs · containerd · Koder AI Gateway

How It Compares

Koder Zoo is a private-first, tenant-scoped model hub for the Koder AI platform. Here is how it stacks up against common alternatives.

Capability                        Koder Zoo   Hugging Face   S3 + Git LFS   MLflow
Versioned model weights               ✓
Dataset hub with previews             ✓
Sandboxed interactive Spaces          ✓
On-demand streaming inference         ✓
Signed leaderboards                   ✓
Signed provenance on every push       ✓
Tenant-scoped quotas & billing        ✓
Private-first, self-hostable          ✓
SSO via central identity              ✓

Ready to open the habitat?

Start publishing models, datasets and Spaces on the Koder AI platform. Get Started takes you straight to the Koder ID login; no credit card required.

Get Started