Part of the Koder AI platform

Meet Zoo

The habitat for AI models. Publish, version and discover model weights, datasets and interactive demos — with on-demand inference, sandboxed Spaces and signed leaderboards, all in one place.

# Push a model, pull weights, run inference
from koder_zoo import Client

zoo = Client.login()

# Publish a model with weights and a card
zoo.models.push(
  "acme/helios-7b",
  weights="./out/weights.safetensors",
  card="./MODEL_CARD.md",
)

# Pull someone else's model
model = zoo.models.pull("koder/aurora-base")

# Run inference on the hosted endpoint
for chunk in zoo.inference.run(
  "koder/aurora-base",
  prompt="Describe a monstera leaf",
  stream=True,
):
  print(chunk.text, end="")

Everything That Lives in the Zoo

Weights, datasets, cards, demos, inference endpoints and evaluations — six first-class inhabitants sharing the same registry, access control and storage layer.

Models

Versioned model weights with rich metadata, signed provenance and content-addressed storage. Push via the Git LFS protocol or the Python SDK.

safetensors · gguf · onnx

Datasets

Schema-aware datasets with train/val/test splits, row-level preview and streaming download. Reuse the same storage engine as models.

parquet · splits · streaming
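The dataset SDK surface isn't shown on this page, so here is a minimal stdlib sketch of what split-aware, batched row streaming amounts to — the real client streams parquet row groups over HTTP, but the shape of the iteration is the same. All names here are illustrative.

```python
# Sketch: stream one split of a dataset in fixed-size batches,
# never materialising the whole split in memory.
# (Illustrative only; the real SDK streams parquet over HTTP.)
from typing import Dict, Iterator, List


def stream_rows(splits: Dict[str, List[dict]], split: str,
                batch_size: int = 2) -> Iterator[List[dict]]:
    """Yield rows of one split in batches of `batch_size`."""
    rows = splits[split]
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]


splits = {
    "train": [{"text": "a"}, {"text": "b"}, {"text": "c"}],
    "val": [{"text": "d"}],
}
batches = list(stream_rows(splits, "train"))
```

Batching at the transport layer is what lets a notebook iterate a multi-gigabyte split without downloading it first.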

Model Cards

Structured docs covering intended use, limitations, training data, evaluation metrics and ethical considerations — rendered from markdown with frontmatter.

markdown · schema · signed
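"Markdown with frontmatter" means a card file carries a YAML metadata block between `---` fences, followed by the rendered body. A small sketch of that split — the card content below is made up, and the real renderer parses the YAML rather than just separating it:

```python
# Sketch: separate a model card's YAML frontmatter from its markdown body.
# Card content is illustrative.
CARD = """---
license: apache-2.0
tags: [text-generation, en]
---
# Helios 7B

Intended use: research on long-context summarisation.
"""


def split_frontmatter(card: str):
    """Return (frontmatter, body) for a '---'-delimited card."""
    _, meta, body = card.split("---", 2)
    return meta.strip(), body.strip()


meta, body = split_frontmatter(CARD)
```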

Spaces

Interactive demos running in sandboxed Firecracker microVMs — one permanent URL per Space, cold-started on request and frozen while idle.

Firecracker · sandboxed · per-URL
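The lifecycle described above — one permanent URL, frozen while idle, cold-started by the next request — reduces to a small state machine. A sketch with illustrative states and a hypothetical URL; the actual scheduler manages Firecracker snapshots, not Python objects:

```python
# Sketch of the Space lifecycle: frozen while idle, woken on request.
# States and URL are illustrative, not the scheduler's real model.
class Space:
    def __init__(self, url: str):
        self.url = url
        self.state = "frozen"

    def request(self) -> str:
        # a hit on the permanent URL cold-starts a frozen Space
        if self.state == "frozen":
            self.state = "running"
        return self.state

    def idle(self) -> None:
        # the scheduler freezes the microVM after an idle period
        self.state = "frozen"


space = Space("https://spaces.example/acme/demo")
first = space.request()
space.idle()
```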

Inference Endpoints

On-demand, streaming and batched inference for any hosted model. Integrated with gateway quotas and usage-based billing out of the box.

streaming · batched · quotas

Leaderboards

Task-scoped benchmarks computed by Koder Eval and published as signed attestations. Rank models per metric across common tasks.

MMLU · HumanEval · attested
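"Rank models per metric" is a filter-and-sort over attested scores. A sketch with made-up numbers — these are not real benchmark results, and the real leaderboard reads signed attestations rather than a list of dicts:

```python
# Sketch: rank models for one metric, best score first.
# Scores are illustrative, not real benchmark results.
scores = [
    {"model": "acme/helios-7b", "metric": "MMLU", "score": 0.61},
    {"model": "koder/aurora-base", "metric": "MMLU", "score": 0.67},
    {"model": "acme/helios-7b", "metric": "HumanEval", "score": 0.38},
]


def rank(scores, metric):
    """Entries for one metric, sorted descending by score."""
    rows = [s for s in scores if s["metric"] == metric]
    return sorted(rows, key=lambda s: s["score"], reverse=True)


top = rank(scores, "MMLU")[0]["model"]
```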

Built for Serious AI Workloads

A registry that scales from a single notebook experiment to tenant-wide production models — without changing tools.

Git LFS Protocol

Push and pull weights with the familiar Git LFS protocol. Any existing tooling that speaks LFS works unchanged.

Content-Addressed

Every blob is addressed by its hash, deduplicated across versions and tenants. Rollbacks are instant.
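Content addressing is simple to state in code: a blob's storage key is the hash of its bytes, so pushing identical content twice stores it once and a version rollback is just a pointer change. A stdlib sketch of that idea — it mirrors the dedup behaviour, not the registry's actual storage code:

```python
# Sketch of content-addressed storage: blobs keyed by SHA-256,
# so identical content is stored exactly once.
import hashlib

store: dict = {}


def put(blob: bytes) -> str:
    """Store a blob under its hash; re-pushing the same bytes is a no-op."""
    digest = hashlib.sha256(blob).hexdigest()
    store.setdefault(digest, blob)
    return digest


a = put(b"weights-v1")
b = put(b"weights-v1")  # same content, same address, no new storage
```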

SSO by Koder ID

Access control is delegated to Koder ID. Public, private and org-scoped repositories with fine-grained roles.

Signed Provenance

Every push carries a signed attestation — who uploaded what, when, from which runtime. Tamper-evident by design.
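Tamper evidence means the manifest and its signature must agree: change any field and verification fails. The platform's attestations are presumably asymmetric signatures; the HMAC below is a stand-in chosen so the sketch is self-contained, and the key and manifest fields are illustrative.

```python
# Tamper-evidence sketch: sign a push manifest, then verify it.
# HMAC stands in for the platform's real signature scheme.
import hashlib
import hmac
import json

KEY = b"illustrative-signing-key"


def sign(manifest: dict) -> str:
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(KEY, payload, hashlib.sha256).hexdigest()


def verify(manifest: dict, signature: str) -> bool:
    # constant-time comparison avoids timing leaks
    return hmac.compare_digest(sign(manifest), signature)


manifest = {"repo": "acme/helios-7b", "version": "v0.3.0",
            "uploader": "dev@acme.example"}
sig = sign(manifest)
ok = verify(manifest, sig)
tampered = verify({**manifest, "version": "v9.9.9"}, sig)
```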

S3-Compatible

Blobs live in a MinIO cluster you can back up, replicate and scale independently from metadata.

Quotas & Billing

Inference usage flows straight into Koder Billing. Per-tenant quotas enforced by the gateway in front of every endpoint.
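At its core, gateway-side quota enforcement is an admit-or-reject check per tenant before the request is routed, with admitted requests metered for billing. A sketch of that check — limits, tenant names, and the fixed window are illustrative, and the real gateway enforces this per time window, not over a process lifetime:

```python
# Sketch: per-tenant quota check of the kind a gateway applies
# before routing an inference request. Numbers are illustrative.
from collections import Counter

LIMITS = {"acme": 2}  # requests allowed per window (illustrative)
used = Counter()      # requests already admitted this window


def admit(tenant: str) -> bool:
    """True if the tenant is under quota; meter the request if so."""
    if used[tenant] >= LIMITS.get(tenant, 0):
        return False
    used[tenant] += 1  # this count is what flows into billing
    return True


results = [admit("acme"), admit("acme"), admit("acme")]
```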

Sandboxed Spaces

Each Space runs in its own Firecracker microVM, with strict CPU, memory and egress limits.

Evaluation-Aware

Leaderboard results come from Koder Eval as signed attestations, not self-reported numbers.

Push, Pull, Inference — in Three Lines

Everything in Koder Zoo is scriptable. The SDK and the Git LFS protocol give you two paths to the same store.

Publish a model with one call

Upload weights, the model card and a manifest in a single operation. Koder Zoo hashes the blobs, deduplicates against existing versions and signs the manifest.

  • Safetensors, GGUF and ONNX supported natively
  • Card rendered automatically from markdown
  • Signed manifests with platform attestations
  • Automatic content deduplication across versions
# Publish a model
from koder_zoo import Client

zoo = Client.login()

zoo.models.push(
  repo="acme/helios-7b",
  version="v0.3.0",
  weights=[
    "out/weights-00001-of-00002.safetensors",
    "out/weights-00002-of-00002.safetensors",
  ],
  card="MODEL_CARD.md",
  tags=["text-generation", "en", "apache-2.0"],
)

Stream inference with quotas

Run inference on any hosted endpoint. Requests go through the Koder AI gateway, which enforces quotas and bills the calling tenant automatically.

  • Streaming and batched inference
  • Per-tenant quotas and rate limits
  • Usage metering sent to Koder Billing
  • Same API across every hosted model
# Stream tokens from a hosted model
for chunk in zoo.inference.stream(
  model="koder/aurora-base",
  prompt="Write a haiku about monstera leaves",
  max_tokens=128,
  temperature=0.7,
):
  print(chunk.text, end="", flush=True)

Layered Architecture

Every layer is independently scalable. Metadata in KDB, blobs in MinIO, Spaces in Firecracker microVMs, inference routed by the Koder AI gateway.

Console  ·  Flutter Web · Catalog · Model Cards · Playground
API Layer  ·  Registry · Datasets · Spaces Scheduler · Inference Router
Protocol  ·  Git LFS · REST · gRPC streaming · OIDC (Koder ID)
Storage  ·  KDB (metadata) · MinIO (blobs) · KDB time-series (usage)
Runtime  ·  Firecracker microVMs · containerd · Koder AI Gateway

How It Compares

Koder Zoo is a private-first, tenant-scoped model hub for the Koder AI platform. Here is how it stacks up against common alternatives.

Capability                        Koder Zoo   Hugging Face   S3 + Git LFS   MLflow
Versioned model weights               ✓
Dataset hub with previews             ✓
Sandboxed interactive Spaces          ✓
On-demand streaming inference         ✓
Signed leaderboards                   ✓
Signed provenance on every push       ✓
Tenant-scoped quotas & billing        ✓
Private-first, self-hostable          ✓
SSO via central identity              ✓

Ready to open the habitat?

Start publishing models, datasets and Spaces on the Koder AI platform. Get Started takes you straight to the Koder ID login; no credit card required.

Get Started