Qwen3 is Alibaba Cloud's latest language model family, supporting 119 languages with a 128K context window. It features dual thinking and non-thinking modes, so you can choose the reasoning depth per task. The 8B variant has over 18 million Ollama pulls.
Deploy Qwen3 in minutes
Starting at $0.51/hr on dedicated GPU
| Model | GPU | VRAM | Price |
|---|---|---|---|
| Qwen3 4B (Small) | L4 | 24 GB | $0.51/hr |
| Qwen3 8B (Recommended) | L4 | 24 GB | $0.51/hr |
| Qwen3 14B (Medium, Recommended) | L4 | 24 GB | $0.51/hr |
| Qwen3 32B (Large) | RTX A6000 | 48 GB | $0.64/hr |
| Qwen3 30B-A3B (MoE) | L4 | 24 GB | $0.51/hr |
Prices include a 30% service fee. Billed per minute while running.
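To make per-minute billing concrete, here is a minimal sketch that estimates the charge for a session from the hourly rates listed above. The `estimate_cost` function and the `HOURLY_RATES` mapping are hypothetical names for illustration, and rounding to the nearest cent is an assumption; the actual invoice may round differently.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly rates copied from the pricing table (USD, service fee already included).
HOURLY_RATES = {
    "L4": Decimal("0.51"),
    "RTX A6000": Decimal("0.64"),
}


def estimate_cost(gpu: str, minutes: int) -> Decimal:
    """Estimate the charge for `minutes` whole minutes of runtime on `gpu`.

    Assumption: the total is rounded to the nearest cent. The platform's
    real rounding rule is not stated on this page.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Multiply before dividing so whole-hour totals stay exact.
    total = HOURLY_RATES[gpu] * minutes / Decimal(60)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # A 90-minute session on each GPU type.
    for gpu in HOURLY_RATES:
        print(gpu, estimate_cost(gpu, 90))
```

Under these assumptions, a short experiment costs only a fraction of the hourly rate, since you pay for the minutes the instance is running rather than a full hour.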
Pick your GPU and have it running in minutes. No infrastructure setup required.