Looking for detailed model specs, GPU recommendations, and pricing? Visit our Model Catalog.
Browse our curated selection of AI models. All models come pre-configured with optimal settings and are ready to deploy in minutes.
Chat, code generation, and general language tasks. All text models include OpenWebUI for easy interaction.
- Top-tier reasoning with step-by-step thinking. Most popular open-source family. (L4–A100 GPU)
- Hybrid thinking/non-thinking mode. Massive 671B MoE architecture. (H100 GPU)
- Dual thinking/non-thinking modes. 119 languages, 128K context. (L4–A6000 GPU)
- Faster and cheaper than 32B, with 256K context. MoE with 3B active params. (L4 GPU)
- Specialized for math and logic reasoning. (A6000 GPU)
- OpenAI open-weight models. Native function calling, visible chain-of-thought. (L4–A100 GPU)
- Google's efficient model family. Best quality-to-size ratio. (L4–A6000 GPU)
- Microsoft's best small model. Top reasoning for its size. (L4 GPU)
- Meta's flagship model. Strong general performance. (A100 GPU)
- Fast and efficient. Nemo variant has 128K context for documents. (L4 GPU)
- Best for Chinese + English bilingual tasks. 128K context. (L4 GPU)
- Legal and financial analysis with transparent reasoning. (A6000 GPU)
Create stunning images from text prompts. All image models use ComfyUI for powerful workflow editing.
- #1 open-source image model. Superior text rendering, realistic humans. (A6000 GPU)
- Ultra-fast distilled model. Sub-second generation. (A6000 GPU)
- Excellent text rendering and composition. Schnell for speed, Dev for quality. (L4–A6000 GPU)
- 32B next-gen model with multi-image editing. Best open-weight quality. (RTX 4090–H100 GPU)
- Fastest Flux model. Sub-second generation, 4-step distilled. (L4–RTX 4090 GPU)
- 17B-parameter model with excellent prompt following. (A6000 GPU)
- Most fine-tunes and LoRAs available. Widely adopted ecosystem. (L4–A6000 GPU)
- Precise text and semantic editing. Bilingual prompts. (A6000 GPU)
Generate videos from text or images. Video models require more GPU power and time per generation.
- Best value video generation. Fast MoE architecture; ~14 GB VRAM. (A6000 GPU)
- Higher-quality text-to-video and image-to-video. ~40 GB VRAM for HD. (A100 GPU)
- Tencent's flagship. 8.3B params, consumer-GPU friendly; ~14 GB VRAM. (A6000 GPU)
- Fast 8-step generation with 4K and audio support. By Lightricks. (A6000–A100 GPU)
- Previous generation. Consider Wan 2.2 for better quality. (A6000 GPU)
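The VRAM figures above suggest a simple rule of thumb: pick the cheapest GPU whose memory covers the model with some headroom. A minimal sketch using the VRAM and hourly rates from this page's pricing table; the 20% headroom factor is an assumption for illustration, not a platform requirement, and raw fit ignores generation speed (e.g. the catalog recommends the A100 for HD video partly for throughput):

```python
# GPUs from the pricing table on this page: (name, VRAM in GB, USD/hr).
GPUS = [
    ("L4", 24, 0.53),
    ("A6000", 48, 0.66),
    ("RTX 4090", 24, 0.79),
    ("A100", 80, 1.85),
    ("H100", 80, 3.54),
]

def cheapest_gpu(vram_needed_gb: float, headroom: float = 1.2):
    """Cheapest GPU whose VRAM covers the model plus ~20% headroom.

    The headroom factor is a hypothetical buffer for activations and
    framework overhead, not an official sizing rule.
    """
    candidates = [g for g in GPUS if g[1] >= vram_needed_gb * headroom]
    return min(candidates, key=lambda g: g[2])[0] if candidates else None

print(cheapest_gpu(14))  # ~14 GB video model fits on the 24 GB L4
print(cheapest_gpu(50))  # ~50 GB needs an 80 GB card
```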
Generate natural speech from text. Models include voice cloning, multi-language support, and emotion control. Kokoro and Chatterbox use a Gradio interface; ComfyUI Audio Suite uses a visual workflow.
- Fast, high-quality TTS. 6 languages, 30+ voices. Apache 2.0 license. (L4 GPU)
- Fast TTS with voice cloning and paralinguistic tags. MIT license. (L4 GPU)
- High-quality TTS with emotion control and voice cloning. MIT license. (L4 GPU)
- Visual TTS workflow with F5-TTS, Chatterbox, Kokoro, and Qwen3-TTS engines. (L4 GPU)
| GPU | VRAM | Price | Best For |
|---|---|---|---|
| NVIDIA L4 | 24 GB | $0.53/hr | Small-medium models, fast inference |
| NVIDIA RTX 4090 | 24 GB | $0.79/hr | FLUX.2, fast single-image generation |
| NVIDIA RTX A6000 | 48 GB | $0.66/hr | Large image models, Flux, video |
| NVIDIA A100 | 80 GB | $1.85/hr | 70B+ text models, high-quality video |
| NVIDIA H100 | 80 GB | $3.54/hr | Maximum performance, fastest inference |
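Since billing is hourly, the cost of a single job is just the hourly rate times the fraction of an hour it runs. A quick sketch using the rates from the table above (the job durations in the example are hypothetical):

```python
# Hourly GPU rates from the pricing table above (USD/hr).
RATES = {
    "L4": 0.53,
    "RTX 4090": 0.79,
    "A6000": 0.66,
    "A100": 1.85,
    "H100": 3.54,
}

def job_cost(gpu: str, minutes: float) -> float:
    """Estimated cost of one job: hourly rate x fraction of an hour."""
    return round(RATES[gpu] * minutes / 60, 4)

# Example: a hypothetical 5-minute video generation on an A100
# costs about $0.15; a full hour on an L4 costs $0.53.
print(job_cost("A100", 5))
print(job_cost("L4", 60))
```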
Check out our Quick Deploy Guide to get started, or learn about ComfyUI and OpenWebUI interfaces.
Deploy a Model