Looking for detailed model specs, GPU recommendations, and pricing? Visit our Model Catalog.
Browse our curated selection of AI models. All models come pre-configured with optimal settings and are ready to deploy in minutes.
Chat, code generation, and general language tasks. All text models include OpenWebUI for easy interaction.
- Top-tier reasoning with step-by-step thinking. Most popular open-source family. (L4–A100 GPU)
- Hybrid thinking/non-thinking mode. Massive 671B MoE architecture. (H100 GPU)
- Dual thinking/non-thinking modes. 119 languages, 128K context. (L4–A6000 GPU)
- Faster and cheaper than 32B, with 256K context. MoE with 3B active params. (L4 GPU)
- Specialized for math and logic reasoning. (A6000 GPU)
- OpenAI open-weight models. Native function calling, visible chain-of-thought. (L4–A100 GPU)
- Google's efficient model family. Best quality-to-size ratio. (L4–A6000 GPU)
- Microsoft's best small model. Top reasoning for its size. (L4 GPU)
- Meta's flagship model. Strong general performance. (A100 GPU)
- Fast and efficient. Nemo variant has 128K context for documents. (L4 GPU)
- Best for Chinese + English bilingual tasks. 128K context. (L4 GPU)
- Legal and financial analysis with transparent reasoning. (A6000 GPU)
Create stunning images from text prompts. All image models use ComfyUI for powerful workflow editing.
- #1 open-source image model. Superior text rendering, realistic humans. (A6000 GPU)
- Ultra-fast distilled model. Sub-second generation. (A6000 GPU)
- Excellent text rendering and composition. Schnell for speed, Dev for quality. (L4–A6000 GPU)
- 32B next-gen model with multi-image editing. Best open-weight quality. (RTX 4090–H100 GPU)
- Fastest Flux model. Sub-second generation, 4-step distilled. (L4–RTX 4090 GPU)
- 17B-parameter model with excellent prompt following. (A6000 GPU)
- Most fine-tunes and LoRAs available. Widely adopted ecosystem. (L4–A6000 GPU)
- Precise text and semantic editing. Bilingual prompts. (A6000 GPU)
Generate videos from text or images. Video models require more GPU power and time per generation.
- Best value video generation. Fast MoE architecture; ~14 GB VRAM. (A6000 GPU)
- Higher-quality text-to-video and image-to-video. ~40 GB VRAM for HD. (A100 GPU)
- Tencent's flagship. 8.3B params, consumer-GPU friendly; ~14 GB VRAM. (A6000 GPU)
- Fast 8-step generation with 4K and audio support. By Lightricks. (A6000–A100 GPU)
- Previous generation. Consider Wan 2.2 for better quality. (A6000 GPU)
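The VRAM figures above suggest a simple rule of thumb: pick the cheapest GPU whose memory covers the model with some headroom. A minimal sketch using the VRAM and hourly rates from this page's pricing table; the 20% headroom factor is an assumption for illustration, not a platform requirement, and raw fit ignores generation speed (e.g. the catalog recommends the A100 for HD video partly for throughput):

```python
# GPUs from the pricing table on this page: (name, VRAM in GB, USD/hr).
GPUS = [
    ("L4", 24, 0.53),
    ("A6000", 48, 0.66),
    ("RTX 4090", 24, 0.79),
    ("A100", 80, 1.85),
    ("H100", 80, 3.54),
]

def cheapest_gpu(vram_needed_gb: float, headroom: float = 1.2):
    """Cheapest GPU whose VRAM covers the model plus ~20% headroom.

    The headroom factor is a hypothetical buffer for activations and
    framework overhead, not an official sizing rule.
    """
    candidates = [g for g in GPUS if g[1] >= vram_needed_gb * headroom]
    return min(candidates, key=lambda g: g[2])[0] if candidates else None

print(cheapest_gpu(14))  # ~14 GB video model fits on the 24 GB L4
print(cheapest_gpu(50))  # ~50 GB needs an 80 GB card
```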
Generate natural speech from text. Models include voice cloning, multi-language support, and emotion control. Kokoro and Chatterbox use a Gradio interface; ComfyUI Audio Suite uses a visual workflow.
- Fast, high-quality TTS. 6 languages, 30+ voices. Apache 2.0 license. (L4 GPU)
- Fast TTS with voice cloning and paralinguistic tags. MIT license. (L4 GPU)
- High-quality TTS with emotion control and voice cloning. MIT license. (L4 GPU)
- Visual TTS workflow with F5-TTS, Chatterbox, Kokoro, and Qwen3-TTS engines. (L4 GPU)
| GPU | VRAM | Price | Best For |
|---|---|---|---|
| NVIDIA L4 | 24 GB | $0.53/hr | Small-medium models, fast inference |
| NVIDIA RTX 4090 | 24 GB | $0.79/hr | FLUX.2, fast single-image generation |
| NVIDIA RTX A6000 | 48 GB | $0.66/hr | Large image models, Flux, video |
| NVIDIA A100 | 80 GB | $1.85/hr | 70B+ text models, high-quality video |
| NVIDIA H100 | 80 GB | $3.54/hr | Maximum performance, fastest inference |
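Since billing is hourly, the cost of a single job is just the hourly rate times the fraction of an hour it runs. A quick sketch using the rates from the table above (the job durations in the example are hypothetical):

```python
# Hourly GPU rates from the pricing table above (USD/hr).
RATES = {
    "L4": 0.53,
    "RTX 4090": 0.79,
    "A6000": 0.66,
    "A100": 1.85,
    "H100": 3.54,
}

def job_cost(gpu: str, minutes: float) -> float:
    """Estimated cost of one job: hourly rate x fraction of an hour."""
    return round(RATES[gpu] * minutes / 60, 4)

# Example: a hypothetical 5-minute video generation on an A100
# costs about $0.15; a full hour on an L4 costs $0.53.
print(job_cost("A100", 5))
print(job_cost("L4", 60))
```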
Check out our Quick Deploy Guide to get started, or learn about ComfyUI and OpenWebUI interfaces.
Deploy a Model