Nebius Blueprints

Production-ready reference architectures for building agents on Nebius AI Cloud.

Each blueprint is a validated, composable stack you can run in under 5 minutes and adapt to your workload.

Agents Blueprint

Agent failures aren’t model problems, they’re system problems: lack of live grounding, wrong retrieval, missing observability, inference costs that don’t survive scale. The Nebius Agents Blueprint is an open reference architecture that connects proven components. Every component is independently deployable and replaceable.

Build, run, and continuously improve AI agents in production. Open at every layer, no lock-in.

Blueprint recipes

Runnable guides, one capability at a time.

Start here

GitHub repository

Complete end-to-end code implementations.

Clone and run

Architecture

Nebius Token Factory is the agent runtime. Every other component plugs in around it.

Agent Inference and Runtime — Nebius Token Factory

Dedicated endpoints, autoscaling, OpenAI-compatible API. 60+ open models.

Orchestration — LangChain Deep Agents

Multi-step workflows, persistent state, MCP-compatible tool connections.

Observability — LangSmith

Full execution traceability — every prompt, tool call, and retrieval step recorded.

Retrieval — Pinecone

Structured knowledge instead of raw documents. Every chunk source-traceable.

Grounding — Tavily by Nebius

Real-time web retrieval with source reliability filtering.

Simulation — Snowglobe by Guardrails AI

Synthetic testing before launch. Produces a labeled eval dataset, fine-tuning data, and a QA regression suite from the same runs.

Blueprint Recipes

Runnable recipes from first agent to production. Start at 01 or jump to the capability you need. All recipes include Token Factory as the inference layer.

Level

Title

Time

Beginner

Your First Agent on Nebius

8 min

Intermediate

Domain Knowledge with Pinecone Nexus

13 min

Intermediate

Real-Time Data with Tavily

12 min

Intermediate

Stronger Agents with LangChain and LangGraph

13 min

Intermediate

Short-Term Memory with LangChain

12 min

Advanced

Long-Term Memory with LangChain and Postgres

13 min

Advanced

Observability with LangSmith

14 min

Advanced

Adding Guardrails with LangChain

14 min

Advanced

Making Actions with MCP and Stripe

18 min

Advanced

Testing Before Production with Snowglobe

14 min

Case study: From prototype to production

What does it take to turn an AI agent into a production-ready system?

To find out, we built a compliance audit agent and applied Blueprint Recipes across grounding, orchestration, evaluation, and governance. We then benchmarked each stage to measure the impact on quality and cost.

The production agent achieved 20% higher precision while reducing cost by 72% compared to the prototype, with identical recall.

Read the full case study

Resources

Agents Blueprint Recipes

runnable guides from first agent to production

Agents Blueprint github repository

Complete end-to-end code implementations including the compliance agent. Clone and run.

Token Factory Cookbook

Broader reference library of Token Factory use cases and partner integrations

Token Factory docs

API reference, dedicated endpoints, autoscaling