AI Cloud: build on GPUs your way

VMs, Kubernetes, and Slurm for AI training and inference on high-performance GPU infrastructure.

Quickstarts

Get hands-on with Nebius AI Cloud and run your first GPU workload in minutes.

Get started with Nebius AI Cloud

Step-by-step guide to launch and run AI workloads on Nebius GPU VMs. Ideal for experiments, model testing, and rapid iteration.

Understand the AI Cloud architecture

Overview of GPU VMs, Managed Kubernetes clusters, Slurm with Soperator, networking, and storage. Learn how the Nebius AI Cloud platform is structured for AI training and inference.

Nebius AI Cloud Console overview

Learn how to provision GPU infrastructure, manage Kubernetes clusters, launch Slurm jobs, and monitor workloads from the console.

Resources and guides

Deep dives, tutorials, and reference architectures to help you build and scale real AI systems on GPU cloud infrastructure.

Data preparation techniques

Best practices and practical pipelines for preparing large datasets for LLM training and distributed AI workloads on GPU clusters.
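To give a flavor of one common preparation step the guide covers, here is a minimal, self-contained sketch (function names, record format, and shard logic are illustrative, not taken from the guide): it drops exact-duplicate text records by content hash, then assigns survivors to shards for distributed training workers.

```python
import hashlib


def dedup_and_shard(records, num_shards):
    """Remove exact-duplicate texts by SHA-256 content hash, then
    distribute surviving records across shards for training workers."""
    seen = set()
    shards = [[] for _ in range(num_shards)]
    for rec in records:
        digest = hashlib.sha256(rec["text"].encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate: skip it
        seen.add(digest)
        # deriving the shard index from the hash gives a stable,
        # roughly balanced assignment across workers
        shards[int(digest, 16) % num_shards].append(rec)
    return shards


if __name__ == "__main__":
    data = [{"text": "hello"}, {"text": "world"}, {"text": "hello"}]
    result = dedup_and_shard(data, num_shards=2)
    print(sum(len(s) for s in result))  # 2 records survive deduplication
```

Real pipelines add near-duplicate detection (e.g. MinHash), filtering, and tokenization on top of a skeleton like this, typically parallelized across a GPU cluster's CPU nodes.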

Inference guide with vLLM

Deploy high-throughput LLM inference on Nebius GPU cloud using vLLM and Kubernetes. Includes configuration patterns for scalable model serving.
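As a rough sketch of the pattern the guide describes, the Deployment below runs vLLM's OpenAI-compatible server on cluster GPUs. The resource name, model, image tag, and GPU count are illustrative assumptions, not Nebius-specific settings; the full guide covers scaling and configuration in detail.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-serving                 # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-serving
  template:
    metadata:
      labels:
        app: vllm-serving
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest            # upstream vLLM image; pin a tag in production
          args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]  # example model
          ports:
            - containerPort: 8000                   # vLLM's OpenAI-compatible API port
          resources:
            limits:
              nvidia.com/gpu: 1                     # one GPU per replica; scale out via replicas
```

Throughput scales by raising `replicas` behind a Service, or by serving larger models with tensor parallelism across multiple GPUs per pod.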

K8s on Nebius

Deploy GPU-enabled Kubernetes clusters with pre-installed NVIDIA drivers and optimized networking for AI training and inference workloads.
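Because drivers and the device plugin come pre-installed, a workload only needs to request GPUs through standard Kubernetes resource limits. A minimal smoke-test sketch (pod name and image tag are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test               # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # example CUDA base image
      command: ["nvidia-smi"]                       # lists visible GPUs, then exits
      resources:
        limits:
          nvidia.com/gpu: 1          # schedules the pod onto a GPU node
```

The `nvidia.com/gpu` limit is what steers the scheduler to a GPU node; training jobs use the same mechanism with more GPUs per container.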

Soperator

Managed Slurm clusters for large-scale distributed AI training. Launch and manage high-performance GPU training jobs in minutes.
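Jobs are submitted with standard Slurm tooling. A generic batch-script sketch (job name, node and GPU counts, and the training script are placeholders, not cluster-specific values):

```shell
#!/bin/bash
#SBATCH --job-name=train-llm        # illustrative job name
#SBATCH --nodes=2                   # two GPU nodes
#SBATCH --ntasks-per-node=1         # one launcher process per node
#SBATCH --gres=gpu:8                # 8 GPUs per node
#SBATCH --output=train_%j.log       # %j expands to the job ID

# launch one training process per node; train.py is a placeholder
srun python train.py
```

Submit with `sbatch train.sh` and monitor with `squeue`; Soperator provisions and maintains the underlying Slurm cluster for you.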

SkyPilot: fine-tuning and orchestration

Orchestrate multi-node LLM fine-tuning and distributed training across GPU VMs and Kubernetes clusters using SkyPilot.
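A SkyPilot task is declared in YAML and launched with `sky launch`. A minimal multi-node sketch (accelerator type, node count, dependencies, and `train.py` are illustrative assumptions):

```yaml
# task.yaml -- illustrative SkyPilot task definition
resources:
  accelerators: H100:8              # 8 GPUs per node; type is an example
num_nodes: 2                        # multi-node fine-tuning

setup: |
  pip install torch transformers    # example dependencies

run: |
  # SkyPilot exports SKYPILOT_NUM_NODES and SKYPILOT_NODE_RANK
  # to each node; train.py is a placeholder training script
  torchrun --nnodes=$SKYPILOT_NUM_NODES \
           --node_rank=$SKYPILOT_NODE_RANK \
           train.py
```

Running `sky launch task.yaml` provisions the nodes, runs `setup` on each, then executes `run` across the cluster.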

More guides for SkyPilot

Deep dives, tutorials, and reference implementations to help you build production-grade LLM training workflows.

Orchestrating LLM fine-tuning on K8s with SkyPilot and MLflow

Manage distributed fine-tuning workloads with experiment tracking on Kubernetes GPU clusters.

Using SkyPilot and Kubernetes for multi-node fine-tuning of Llama 3.1

Step-by-step guide to running distributed LLM fine-tuning on Nebius AI Cloud GPU infrastructure.

Reference implementation examples: healthcare

Reference implementations for healthcare and life sciences AI workloads running on Nebius GPU cloud.

Running NVIDIA NIM and NVIDIA Blueprint in Nebius AI Cloud

Deploy validated healthcare AI models using NVIDIA NIM on high-performance GPU infrastructure.

Running Boltz-2 inference at scale in Nebius AI Cloud

Scale life sciences inference workloads on GPU clusters optimized for performance and throughput.

How to run Boltz-2 at scale on Kubernetes

Deploy a Kubernetes cluster, configure shared storage, and run reproducible, multi-node Boltz-2 inference for protein structure prediction.

Explore more ways to learn and get support

Get help and connect with other builders.

Subscribe to our newsletter

Get builder updates: new releases, cookbooks, events, and more.