Question 1

How does billing work?

Accepted Answer

Dedicated GPUs are billed in 10-minute increments, and you pay only while your instance is running. Serverless is billed per request with no idle costs. Neither mode has a monthly minimum.
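As a rough illustration of increment-based billing, the sketch below estimates the charge for one dedicated session. The function name is hypothetical, and the rounding rule is an assumption: the answer above does not say whether a partially used increment is rounded up, so this sketch assumes it is.

```python
import math

INCREMENT_MINUTES = 10  # billing granularity stated in the answer above


def dedicated_cost(minutes_running: float, price_per_hour: float) -> float:
    """Estimate the charge for one dedicated GPU session.

    Assumption (not confirmed by the source): a partially used
    10-minute increment is billed as a full increment.
    """
    increments = math.ceil(minutes_running / INCREMENT_MINUTES)
    billed_hours = increments * INCREMENT_MINUTES / 60
    return round(billed_hours * price_per_hour, 4)


# 25 minutes on an RTX 4090 at $0.92/hour is billed as 3 increments (30 minutes).
print(dedicated_cost(25, 0.92))  # 0.46
```

If billing turns out to be prorated within an increment, replace `math.ceil` with the exact division.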

Question 2

Do I need a credit card to start?

Accepted Answer

You can try the free demo without an account. To deploy your own GPU, sign up and add prepaid credits starting at $5.

Question 3

What if I forget to stop my GPU instance?

Accepted Answer

Deployments don't auto-stop when idle; you stop them manually from the dashboard. As a safety net, a deployment does auto-stop when your credit balance reaches $0, so you are never charged beyond your prepaid balance.
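To get a feel for how long a forgotten instance can run before the balance-based auto-stop kicks in, here is a small sketch. The function name is hypothetical, and the calculation ignores the billing increment and any delay in the auto-stop itself, so treat the result as an approximation.

```python
def hours_until_auto_stop(balance_usd: float, price_per_hour: float) -> float:
    """Approximate runtime before a prepaid balance reaches $0.

    Assumption: the instance runs continuously at a fixed hourly price
    and nothing else draws on the same balance.
    """
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return balance_usd / price_per_hour


# $5 of prepaid credit on an L4 at $0.53/hour
print(f"{hours_until_auto_stop(5.00, 0.53):.1f} hours")  # 9.4 hours
```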

Question 4

Can I use both dedicated GPU and serverless?

Accepted Answer

Yes. Many teams use serverless for development and burst traffic, then switch to dedicated GPUs for sustained production workloads.

Question 5

What's included in the GPU price?

Accepted Answer

Everything: GPU, vCPU, RAM, disk, and network. No hidden fees, no egress charges, no surprise bills.

Question 6

Which mode should I choose?

Accepted Answer

Use serverless for low-volume, bursty, or development workloads (pay only when you generate). Use dedicated GPU for sustained production traffic where you need consistent latency and throughput.
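One way to make this choice concrete is a break-even estimate: the request rate at which an always-on dedicated GPU costs the same as paying per request. The sketch below uses prices listed on this page (RTX 4090 at $0.92/hour, Qwen3 8B at $0.015 per request). The function names are hypothetical, and it assumes the GPU runs continuously and can actually serve that many requests per hour, which the page does not state.

```python
def break_even_requests_per_hour(gpu_price_per_hour: float,
                                 price_per_request: float) -> float:
    """Requests per hour at which dedicated and serverless cost the same."""
    return gpu_price_per_hour / price_per_request


def cheaper_mode(requests_per_hour: float,
                 gpu_price_per_hour: float,
                 price_per_request: float) -> str:
    """Compare hourly cost only; latency and throughput are not modeled."""
    serverless_cost = requests_per_hour * price_per_request
    return "dedicated" if serverless_cost > gpu_price_per_hour else "serverless"


# RTX 4090 ($0.92/hour) versus Qwen3 8B serverless ($0.015/request)
print(round(break_even_requests_per_hour(0.92, 0.015), 1))  # 61.3
print(cheaper_mode(20, 0.92, 0.015))   # serverless
print(cheaper_mode(200, 0.92, 0.015))  # dedicated
```

Cost is only one factor; the answer above also points to consistent latency and throughput as reasons to choose dedicated.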

Dedicated GPU Instances

GPU	VRAM	vCPU	RAM	Price/hr	Best For
CPU Only	—	3	16 GB	$0.15/hour	Small LLMs (1B-3B), basic inference, API servers, testing
RTX 4090	24 GB	6	41 GB	$0.92/hour	7B-13B LLMs, SDXL Images, Flux, Most Popular GPU
L4	24 GB	12	50 GB	$0.53/hour	7B-13B LLMs, SDXL Images, Power Efficient
RTX A6000	48 GB	9	50 GB	$0.66/hour	30B+ LLMs (quantized 70B), Video Gen, 48GB VRAM
A100 80GB	80 GB	8	117 GB	$1.85/hour	70B+ LLMs, Complex Workflows, Fine-tuning
H100	80 GB	16	188 GB	$4.32/hour	Lowest Latency Production, Training, Research

Serverless Text Models

Model	Description	Per Request
Qwen3 4B	Quick Response	$0.010
Qwen3 8B	General Purpose	$0.015
DeepSeek R1 8B	Reasoning	$0.015
GLM-Z1 9B	Bilingual Reasoning	$0.015
DeepSeek R1 14B	Advanced Reasoning	$0.030
Qwen3 32B	Premium Quality	$0.050
GLM-Z1 32B	Premium Reasoning	$0.050

Serverless Image Models

Model	Per Request
Flux Schnell	$0.008
Flux Dev	$0.015
Stable Diffusion XL	$0.005
Z-Image Turbo	$0.008
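The per-request prices above can be combined into a simple usage estimate. The sketch below is an illustration only: the dictionary copies the prices from the two serverless tables, while the function name and the sample request counts are made up for the example.

```python
# Per-request prices copied from the serverless tables above (USD).
SERVERLESS_PRICES = {
    "Qwen3 4B": 0.010,
    "Qwen3 8B": 0.015,
    "DeepSeek R1 8B": 0.015,
    "GLM-Z1 9B": 0.015,
    "DeepSeek R1 14B": 0.030,
    "Qwen3 32B": 0.050,
    "GLM-Z1 32B": 0.050,
    "Flux Schnell": 0.008,
    "Flux Dev": 0.015,
    "Stable Diffusion XL": 0.005,
    "Z-Image Turbo": 0.008,
}


def serverless_cost(usage: dict) -> float:
    """Total cost for a mapping of model name -> number of requests."""
    total = 0.0
    for model, request_count in usage.items():
        # A KeyError here means the model is not in the price table.
        total += SERVERLESS_PRICES[model] * request_count
    return round(total, 2)


# Hypothetical monthly usage: 2,000 text requests and 500 images.
print(serverless_cost({"Qwen3 8B": 2000, "Stable Diffusion XL": 500}))  # 32.5
```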
