Qwen3.5 is the latest model family from Alibaba Cloud, surpassing Qwen3-235B on benchmarks with much smaller models. It offers a 256K context window, support for 201 languages, and both thinking and non-thinking modes. The 35B-A3B MoE variant activates only 3B parameters per token for fast inference.
Deploy Qwen3.5 in minutes
Starting at $0.53/hr on dedicated GPU
| Model | GPU | VRAM | Price | Action |
|---|---|---|---|---|
| Qwen3.5 4B (Small) | L4 | 24 GB | $0.53/hr | Deploy |
| Qwen3.5 9B (Recommended) | L4 | 24 GB | $0.53/hr | Deploy |
| Qwen3.5 27B (Large) | RTX A6000 | 48 GB | $0.66/hr | Deploy |
| Qwen3.5 35B-A3B (MoE) | RTX A6000 | 48 GB | $0.66/hr | Deploy |
Prices include a 30% service fee. Billed per minute while the instance is running.
Pick your GPU and have it running in minutes. No infrastructure setup required.
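Once deployed, a common way to query such a model is through an OpenAI-compatible chat completions API, which many serving stacks (e.g. vLLM-style servers) expose. The sketch below is a minimal, hypothetical example of building a request body, including a flag to toggle thinking vs non-thinking mode; the endpoint URL, model name, and the `chat_template_kwargs`/`enable_thinking` parameter are assumptions, so check your deployment's actual documentation for the real names.

```python
import json

# Hypothetical endpoint for your deployment -- replace with the real URL.
ENDPOINT = "https://your-deployment.example.com/v1/chat/completions"

def build_chat_request(prompt, model="qwen3.5-9b", thinking=False):
    """Build a JSON body for an OpenAI-compatible chat completion call.

    `thinking` maps to an assumed `enable_thinking` template flag for
    switching between thinking and non-thinking modes; the actual
    parameter name depends on the serving stack.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "chat_template_kwargs": {"enable_thinking": thinking},
    }

body = build_chat_request("Summarize MoE inference in one sentence.")
print(json.dumps(body, indent=2))
```

You would POST this body to the endpoint with an HTTP client of your choice (e.g. `requests.post(ENDPOINT, json=body)`), adding whatever authentication header your deployment requires.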