Qwen-Image gets native support in ComfyUI

Qwen-Image is a 20B-parameter MMDiT (Multimodal Diffusion Transformer) image generation model designed for complex text rendering and fine-grained editing. It is open-sourced under the Apache‑2.0 license. The model recently gained native support in ComfyUI, so it can be run directly from ComfyUI's workflow templates.

Model highlights

According to the project page, the model is strongest in text-centric scenarios and editing consistency, while also offering broad generation and understanding capabilities:

Complex text rendering: preserves typographic details and layout consistency across languages (e.g., Chinese and English); suited to images with headings, slogans, and structured layouts
Precise image editing: supports style transfer, object insertion/removal, detail enhancement, text editing within images, and even human pose adjustment
General generation ability: adapts smoothly to many styles, from photorealistic to impressionist, anime, and minimalist design
Image understanding tasks: object detection, semantic segmentation, depth and edge (Canny) estimation, novel‑view synthesis, and super‑resolution
Ecosystem and extensibility: project updates indicate support for various LoRAs (e.g., MajicBeauty) and include multi‑GPU inference and queue‑management examples for scalable, high‑concurrency scenarios

Versions currently available in ComfyUI

Qwen-Image_bf16 (≈ 40.9 GB)
Qwen-Image_fp8 (≈ 20.4 GB)
Unofficial distilled variants (fewer inference steps)
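
The listed file sizes line up with simple arithmetic on the parameter count. The sketch below is a back-of-the-envelope estimate, not an official breakdown: it assumes exactly 20e9 parameters (the "20B" figure is rounded), 2 bytes per parameter for bf16, 1 byte per parameter for fp8, and decimal gigabytes. The function name `estimated_size_gb` is illustrative only.

```python
# Rough checkpoint-size estimate: parameter count times bytes per parameter.
# Assumption: "20B" is treated as exactly 20e9 parameters (it is a rounded figure).
PARAMS = 20e9

# Storage width of each precision, in bytes per parameter.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def estimated_size_gb(precision: str, params: float = PARAMS) -> float:
    """Return the approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in ("bf16", "fp8"):
    print(f"{precision}: ~{estimated_size_gb(precision):.0f} GB")
```

The estimates (about 40 GB and 20 GB) sit slightly below the listed 40.9 GB and 20.4 GB, which is what one would expect if the true parameter count is a little above the rounded 20B figure.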

Model resources are available here: Hugging Face - Comfy-Org/Qwen-Image_ComfyUI ｜ ModelScope - Comfy-Org/Qwen-Image_ComfyUI

Performance

The measurements below were taken by ComfyUI Wiki while preparing the official documentation, on an RTX 4090D with 24 GB of VRAM:

Qwen-Image_fp8

VRAM usage: 86%
Generation time: 94 s (first run), 71 s (second)

Qwen-Image_bf16

VRAM usage: 96%
Generation time: 295 s (first run), 131 s (second)
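
To put these numbers side by side, the following sketch derives a few comparisons from the figures above. It assumes the VRAM percentages refer to the card's full 24 GB, and it treats the gap between the first and second run as one-off overhead (presumably model loading, though the source does not say). The dictionary layout and the `summarize` helper are illustrative, not part of any ComfyUI API.

```python
# Derived comparisons from the reported measurements (RTX 4090D, 24 GB).
TOTAL_VRAM_GB = 24  # assumption: percentages are relative to the full 24 GB

runs = {
    "fp8": {"vram_pct": 86, "first_s": 94, "second_s": 71},
    "bf16": {"vram_pct": 96, "first_s": 295, "second_s": 131},
}


def summarize(run: dict) -> dict:
    """Convert the VRAM percentage to GB and compute the first-run overhead."""
    return {
        "vram_gb": round(TOTAL_VRAM_GB * run["vram_pct"] / 100, 2),
        "first_run_overhead_s": run["first_s"] - run["second_s"],
    }


summary = {name: summarize(run) for name, run in runs.items()}

# How much faster fp8 is than bf16 on the second (warm) run.
warm_speedup = runs["bf16"]["second_s"] / runs["fp8"]["second_s"]

for name, stats in summary.items():
    print(name, stats)
print(f"fp8 warm-run speedup over bf16: {warm_speedup:.2f}x")
```

On these figures, fp8 uses roughly 20.6 GB against roughly 23 GB for bf16, and its second run is a little under twice as fast, so fp8 is the more comfortable fit on a 24 GB card.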

Sources and further reading

Project page (features, news, deployment): Qwen-Image GitHub
Technical report (arXiv): Qwen-Image Technical Report
Model resources (community mirrors): Hugging Face - Comfy-Org/Qwen-Image_ComfyUI ｜ ModelScope - Comfy-Org/Qwen-Image_ComfyUI
Additional reading (tutorial): ComfyUI Docs · Qwen-Image native workflow