Skip to main content

Deploy Qwen3.5

Text & Chat

Qwen3.5 is the latest from Alibaba Cloud, surpassing Qwen3-235B on benchmarks with much smaller models. 256K context, 201 languages, thinking + non-thinking modes. The 35B-A3B MoE variant uses only 3B active params for fast inference.

Deploy Qwen3.5 in minutes

Starting at $0.53/hr on dedicated GPU

Available Variants (4)

ModelGPUVRAMPriceAction
Qwen3.5 4B
Small (4B)
L424 GB$0.53/hrDeploy
Qwen3.5 9B
9B (Recommended)
L424 GB$0.53/hrDeploy
Qwen3.5 27B
Large (27B)
RTX A600048 GB$0.66/hrDeploy
Qwen3.5 35B-A3B MoE
MoE (35B-A3B)
RTX A600048 GB$0.66/hrDeploy

Prices include 30% service fee. Billed per minute while running.

Includes OpenWebUI chat interface and OpenAI-compatible API endpoint.

Use Cases

  • Advanced reasoning and coding
  • Multilingual chatbots (201 languages)
  • Long document analysis (256K context)
  • Agentic applications

Related Models

Ready to deploy Qwen3.5?

Pick your GPU and have it running in minutes. No infrastructure setup required.