Qwen-Image gets native support in ComfyUI
Qwen-Image is a 20B-parameter MMDiT (Multimodal Diffusion Transformer) image generation model designed for complex text rendering and fine-grained editing. It is open-sourced under the Apache‑2.0 license. The model recently gained native support in ComfyUI, so it can be tried directly from the workflow templates.
Model highlights
Based on the project page, the model excels in text-centric scenarios and editing consistency, while offering broad generation and understanding capabilities:
- Complex text rendering: preserves typographic details and layout consistency across languages (e.g., Chinese and English); suited to images with headings, slogans, and structured layouts
- Precise image editing: supports style transfer, object insertion/removal, detail enhancement, text editing within images, and even human pose adjustment
- General generation ability: adapts smoothly to many styles, from photorealistic to impressionist, anime, and minimalist design
- Image understanding tasks: object detection, semantic segmentation, depth and edge (Canny) estimation, novel‑view synthesis, and super‑resolution
- Ecosystem and extensibility: project updates mention support for various LoRAs (e.g., MajicBeauty) and provide multi‑GPU inference and queue‑management examples for scalable, high‑concurrency scenarios
Versions currently available in ComfyUI
- Qwen-Image_bf16 (≈ 40.9 GB)
- Qwen-Image_fp8 (≈ 20.4 GB)
- Unofficial distilled variants (fewer inference steps)
Model resources are available here: Hugging Face - Comfy-Org/Qwen-Image_ComfyUI | ModelScope - Comfy-Org/Qwen-Image_ComfyUI
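The two official file sizes roughly follow from the parameter count and the storage width of each format. The sketch below is a back-of-the-envelope check, not a description of how the checkpoints were produced. It assumes 2 bytes per parameter for bf16, 1 byte for fp8, decimal gigabytes, and files that contain only the model weights. The function and constant names are illustrative.

```python
# Rough checkpoint size from parameter count and precision.
# Assumptions (not from the model card): bf16 = 2 bytes/param,
# fp8 = 1 byte/param, 1 GB = 1e9 bytes, no metadata overhead.

BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}

# Sizes quoted in the list above.
LISTED_SIZES_GB = {"bf16": 40.9, "fp8": 20.4}


def estimated_size_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight-file size in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype, listed in LISTED_SIZES_GB.items():
        estimate = estimated_size_gb(20e9, dtype)
        print(f"{dtype}: estimated {estimate:.1f} GB, listed {listed} GB")
```

The estimates land slightly below the listed sizes. One plausible reason is that "20B" is a rounded parameter count.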
Performance
Below are measurements taken by the ComfyUI Wiki while preparing official documentation, using an RTX 4090D 24 GB:
Qwen-Image_fp8
- VRAM usage: 86%
- Generation time: 94 s (first run), 71 s (second run)
Qwen-Image_bf16
- VRAM usage: 96%
- Generation time: 295 s (first run), 131 s (second run)
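To make the figures above easier to compare, the short sketch below converts the VRAM percentages to gigabytes and computes the ratio between first and second runs. It assumes the percentages are relative to the card's full 24 GB. The dictionary layout and function name are illustrative.

```python
# Derive absolute VRAM use and run-to-run ratios from the measurements above.
# Assumption: VRAM percentages are relative to the card's full 24 GB.

CARD_VRAM_GB = 24

MEASUREMENTS = {
    "Qwen-Image_fp8": {"vram_pct": 86, "first_s": 94, "second_s": 71},
    "Qwen-Image_bf16": {"vram_pct": 96, "first_s": 295, "second_s": 131},
}


def summarize(m: dict) -> dict:
    """Return VRAM in GB and how many times longer the first run took."""
    return {
        "vram_gb": CARD_VRAM_GB * m["vram_pct"] / 100,
        "first_over_second": m["first_s"] / m["second_s"],
    }


if __name__ == "__main__":
    for name, m in MEASUREMENTS.items():
        s = summarize(m)
        print(
            f"{name}: ~{s['vram_gb']:.1f} GB VRAM, "
            f"first run {s['first_over_second']:.2f}x the second"
        )
    ratio = (
        MEASUREMENTS["Qwen-Image_bf16"]["second_s"]
        / MEASUREMENTS["Qwen-Image_fp8"]["second_s"]
    )
    print(f"bf16 second run takes {ratio:.2f}x as long as fp8")
```

Under that assumption, fp8 occupies a little under 21 GB and bf16 about 23 GB of the card. The longer first runs are presumably due to one-time model loading, though the measurements do not say so.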
Sources and further reading
- Project page (features, news, deployment): Qwen-Image GitHub
- Technical report (arXiv): Qwen-Image Technical Report
- Model resources (community mirrors): Hugging Face - Comfy-Org/Qwen-Image_ComfyUI | ModelScope - Comfy-Org/Qwen-Image_ComfyUI
- Additional reading (tutorial): ComfyUI Docs · Qwen-Image native workflow