
Model Catalog


Browse our curated selection of AI models. All models come pre-configured with optimal settings and are ready to deploy in minutes.

Text Generation Models

Chat, code generation, and general language tasks. All text models include OpenWebUI for easy interaction.

DeepSeek R1 (8B, 14B, 32B, 70B)

Top-tier reasoning with step-by-step thinking. Most popular open-source family.

$0.53-1.85/hr

L4-A100 GPU

DeepSeek V3.1 671B MoE

Hybrid thinking/non-thinking mode. Massive 671B MoE architecture.

$3.54/hr

H100 GPU

Qwen3 (4B, 8B, 14B, 32B)

Dual thinking/non-thinking modes. 119 languages, 128K context.

$0.53-0.66/hr

L4-A6000 GPU

Qwen3 30B-A3B MoE

Faster and cheaper than 32B with 256K context. MoE with 3B active params.

$0.53/hr

L4 GPU

QwQ 32B

Specialized for math and logic reasoning.

$0.66/hr

A6000 GPU

GPT-OSS (20B, 120B)

OpenAI open-weight models. Native function calling, chain-of-thought visible.

$0.53-1.85/hr

L4-A100 GPU

Gemma 3 (4B, 12B, 27B)

Google's efficient model family. Best quality-to-size ratio.

$0.53-0.66/hr

L4-A6000 GPU

Phi-4 14B

Microsoft's best small model. Top reasoning for its size.

$0.53/hr

L4 GPU

LLaMA 3.3 70B

Meta's flagship model. Strong general performance.

$1.85/hr

A100 GPU

Mistral (7B, Nemo 12B)

Fast and efficient. Nemo variant has 128K context for documents.

$0.53/hr

L4 GPU

GLM-4 9B

Best for Chinese + English bilingual tasks. 128K context.

$0.53/hr

L4 GPU

Magistral 24B

Legal and financial analysis with transparent reasoning.

$0.66/hr

A6000 GPU

Image Generation Models

Create stunning images from text prompts. All image models use ComfyUI for powerful workflow editing.

Qwen-Image-2512

#1 open-source image model. Superior text rendering, realistic humans.

$0.66/hr

A6000 GPU

Z Image Turbo

Ultra-fast distilled model. Sub-second generation.

$0.66/hr

A6000 GPU

Flux (Dev, Schnell, Krea)

Excellent text rendering and composition. Schnell for speed, Dev for quality.

$0.53-0.66/hr

L4-A6000 GPU

FLUX.2 (Dev FP8, Dev Full, Q4 GGUF)

32B next-gen model with multi-image editing. Best open-weight quality.

$0.79-3.54/hr

RTX 4090-H100 GPU

FLUX.2 Klein (9B FP8, 4B, 9B Base)

Fastest Flux model. Sub-second generation, 4-step distilled.

$0.53-0.79/hr

L4-RTX 4090 GPU

HiDream I1 (Dev, Full, Fast FP8)

17B parameter model with excellent prompt following.

$0.66/hr

A6000 GPU

Stable Diffusion (XL, 3.5, 1.5)

Most fine-tunes and LoRAs available. Widely adopted ecosystem.

$0.53-0.66/hr

L4-A6000 GPU

Qwen Image (Edit, Gen)

Precise text and semantic editing. Bilingual prompts.

$0.66/hr

A6000 GPU

Video Generation Models

Generate videos from text or images. Video models require more GPU power and time per generation.

Wan 2.2 5B (T2V, I2V)

Best value video generation. Fast MoE architecture, ~14GB VRAM.

$0.66/hr

A6000 GPU

Wan 2.2 14B (T2V, I2V)

Higher quality text-to-video and image-to-video. ~40GB VRAM for HD.

$1.85/hr

A100 GPU

HunyuanVideo 1.5 (T2V, I2V)

Tencent's flagship. 8.3B params, consumer GPU friendly. ~14GB VRAM.

$0.66/hr

A6000 GPU

LTX-2 19B (Distilled, Dev FP8, Dev FP4)

Fast 8-step generation with 4K and audio support. By Lightricks.

$0.66-1.85/hr

A6000-A100 GPU

Wan 2.1 (T2V, I2V) Legacy

Previous generation. Consider Wan 2.2 for better quality.

$0.66/hr

A6000 GPU

Audio / Text-to-Speech Models

Generate natural speech from text. Models include voice cloning, multi-language support, and emotion control. Kokoro and Chatterbox use a Gradio interface; ComfyUI Audio Suite uses a visual workflow.

Kokoro 82M

Fast, high-quality TTS. 6 languages, 30+ voices. Apache 2.0 license.

$0.53/hr

L4 GPU

Chatterbox Turbo 350M

Fast TTS with voice cloning and paralinguistic tags. MIT license.

$0.53/hr

L4 GPU

Chatterbox Standard 500M

High-quality TTS with emotion control and voice cloning. MIT license.

$0.53/hr

L4 GPU

ComfyUI Audio Suite

Visual TTS workflow with F5-TTS, Chatterbox, Kokoro, and Qwen3-TTS engines.

$0.53/hr

L4 GPU

GPU Reference

| GPU | VRAM | Price | Best For |
| --- | --- | --- | --- |
| NVIDIA L4 | 24 GB | $0.53/hr | Small-to-medium models, fast inference |
| NVIDIA RTX 4090 | 24 GB | $0.79/hr | FLUX.2, fast single-image generation on consumer hardware |
| NVIDIA RTX A6000 | 48 GB | $0.66/hr | Large image models, Flux, video |
| NVIDIA A100 | 80 GB | $1.85/hr | 70B+ text models, high-quality video |
| NVIDIA H100 | 80 GB | $3.54/hr | Maximum performance, fastest inference |
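Since all models bill by the hour at the GPU rates above, total cost is simply rate × hours. The sketch below is a hypothetical helper (not part of any official SDK) that uses the hourly rates from the table to estimate the cost of a session:

```python
# Hourly GPU rates in USD, copied from the GPU Reference table above.
GPU_RATES = {
    "L4": 0.53,
    "RTX 4090": 0.79,
    "A6000": 0.66,
    "A100": 1.85,
    "H100": 3.54,
}

def estimate_cost(gpu: str, hours: float) -> float:
    """Estimate the USD cost of running the given GPU for `hours` hours."""
    return round(GPU_RATES[gpu] * hours, 2)

# Example: an 8-hour session on an A100 (e.g. LLaMA 3.3 70B)
print(estimate_cost("A100", 8))  # 14.8
```

For instance, a full day of image generation on an A6000 works out to about $15.84, while the same day on an H100 is about $84.96.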

Ready to deploy?

Check out our Quick Deploy Guide to get started, or learn about ComfyUI and OpenWebUI interfaces.

Deploy a Model