
Wan2.2 ComfyUI Workflow Complete Usage Guide: Official + Community Versions (Kijai, GGUF)

Wan2.2

Tutorial Overview

This tutorial covers the various ways to run the Wan2.2 video generation model in ComfyUI. Wan2.2 is a new-generation multimodal generation model from Alibaba Cloud. It adopts an innovative MoE (Mixture of Experts) architecture, and its core features include film-level aesthetic control, large-scale complex motion generation, and precise semantic compliance.

Versions and Content Covered in This Tutorial

Completed Versions:

  • ✅ ComfyUI Official Native Version - Complete workflow provided by the ComfyOrg team
  • ✅ Wan2.2 5B Hybrid Version - Lightweight model supporting text-to-video and image-to-video
  • ✅ Wan2.2 14B Text-to-Video Version - High-quality text-to-video generation
  • ✅ Wan2.2 14B Image-to-Video Version - Static image to dynamic video
  • ✅ Wan2.2 14B First-Last Frame Video Generation - Video generation based on start and end frames

Versions in Preparation:

  • 🔄 Kijai WanVideoWrapper Version
  • 🔄 GGUF Quantized Version - Optimized version for low-configuration devices
  • 🔄 Lightx2v 4steps LoRA - Fast generation optimization solution

About Wan2.2 Video Generation Model

Wan2.2 adopts an innovative MoE (Mixture of Experts) architecture composed of a high-noise expert model and a low-noise expert model. The experts divide the denoising process by timestep - the high-noise expert handles the early, noisy steps and the low-noise expert the later refinement steps - which yields higher-quality video content.
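The two-expert split can be pictured with a small sketch. This is purely illustrative: the expert names and the 0.5 switch point are assumptions for the example, not Wan2.2's actual boundary timestep.

```python
# Illustrative sketch of Wan2.2's two-expert MoE denoiser: the high-noise
# expert handles early (noisy) timesteps, the low-noise expert the later
# refinement steps. The 0.5 boundary is an assumption, not the model's
# actual switch point.
def pick_expert(progress: float, boundary: float = 0.5) -> str:
    """progress: 1.0 = pure noise (start of sampling), 0.0 = clean."""
    return "high_noise_expert" if progress >= boundary else "low_noise_expert"

if __name__ == "__main__":
    for p in (1.0, 0.75, 0.5, 0.25, 0.0):
        print(f"progress {p:.2f} -> {pick_expert(p)}")
```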

Core Advantages:

  • Film-Level Aesthetic Control: Professional lens language, supporting multi-dimensional visual control of lighting, color, composition, etc.
  • Large-Scale Complex Motion: Smoothly reproduces various complex motions, strengthening motion controllability and naturalness
  • Precise Semantic Compliance: Complex scene understanding, multi-object generation, better restoration of creative intent
  • Efficient Compression Technology: 5B version high compression ratio VAE, memory optimization, supporting hybrid training

The Wan2.2 series models are released under the Apache 2.0 open-source license, which permits commercial use. Apache 2.0 allows you to freely use, modify, and distribute these models, including for commercial purposes, as long as you retain the original copyright notice and license text.

Wan2.2 Open Source Model Version Overview

| Model Type | Model Name | Parameters | Main Function | Model Repository |
| --- | --- | --- | --- | --- |
| Hybrid Model | Wan2.2-TI2V-5B | 5B | Supports text-to-video and image-to-video in a single hybrid model, covering both core tasks | 🤗 Wan2.2-TI2V-5B |
| Image-to-Video | Wan2.2-I2V-A14B | 14B | Converts static images to dynamic videos, maintaining content consistency and smooth motion | 🤗 Wan2.2-I2V-A14B |
| Text-to-Video | Wan2.2-T2V-A14B | 14B | Generates high-quality videos from text descriptions, with film-level aesthetic control and precise semantic compliance | 🤗 Wan2.2-T2V-A14B |

Wan2.2 Prompt Guide - Detailed prompt writing guide provided by Wan

ComfyUI Official Resources

ComfyOrg Official Live Broadcast Replay

ComfyOrg's YouTube channel has detailed explanations of using Wan2.2 in ComfyUI:

ComfyUI Wan2.2 Live Broadcast Replay
ComfyUI Wan2.2 In-depth
ComfyUI Wan2.2 In-depth #2

Wan2.2 ComfyUI Official Native Version Workflow Usage Guide

Version Description

The ComfyUI official native version is provided by the ComfyOrg team, using 🤗 Comfy-Org/Wan_2.2_ComfyUI_Repackaged repackaged model files to ensure optimal compatibility with ComfyUI.

Wan2.2 template

1. Wan2.2 TI2V 5B Hybrid Version Workflow

💡

The Wan2.2 5B version combined with ComfyUI’s native offloading function can adapt well to 8GB VRAM, making it an ideal choice for beginner users.

Workflow Acquisition Method

Please update your ComfyUI to the latest version, and find "Wan2.2 5B video generation" through the menu Workflow -> Browse Templates -> Video to load the workflow

Download JSON Format Workflow

Model File Download

Diffusion Model

VAE

Text Encoder

ComfyUI/
├───📂 models/
│   ├───📂 diffusion_models/
│   │   └───wan2.2_ti2v_5B_fp16.safetensors
│   ├───📂 text_encoders/
│   │   └───umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   └───📂 vae/
│       └───wan2.2_vae.safetensors
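A quick way to confirm the files landed in the right folders is a small path check. This helper is not part of ComfyUI; it simply mirrors the directory layout shown above.

```python
from pathlib import Path

# Expected locations for the Wan2.2 5B workflow, mirroring the tree above.
# Standalone convenience check, not a ComfyUI API.
REQUIRED_5B = {
    "diffusion_models": ["wan2.2_ti2v_5B_fp16.safetensors"],
    "text_encoders": ["umt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "vae": ["wan2.2_vae.safetensors"],
}

def missing_models(comfy_root: str) -> list[str]:
    """Return the relative paths of any required model files not found."""
    models = Path(comfy_root) / "models"
    return [f"{sub}/{name}"
            for sub, names in REQUIRED_5B.items()
            for name in names
            if not (models / sub / name).is_file()]
```

Run it with your ComfyUI root (e.g. `missing_models("/path/to/ComfyUI")`); an empty list means all three files are in place.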

Detailed Operation Steps

Step Diagram

  1. Ensure the Load Diffusion Model node loads the wan2.2_ti2v_5B_fp16.safetensors model
  2. Ensure the Load CLIP node loads the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
  3. Ensure the Load VAE node loads the wan2.2_vae.safetensors model
  4. (Optional) For image-to-video, use the shortcut Ctrl+B to enable the Load Image node, then upload an image
  5. (Optional) In the Wan22ImageToVideoLatent node you can adjust the output dimensions and the total frame count
  6. (Optional) If you need to modify the prompts (positive and negative), edit them in the CLIP Text Encoder node numbered 5
  7. Click the Run button, or use the shortcut Ctrl(cmd) + Enter to execute video generation

2. Wan2.2 14B T2V Text-to-Video Workflow

Workflow Acquisition Method

Please update your ComfyUI to the latest version, and find "Wan2.2 14B T2V" through the menu Workflow -> Browse Templates -> Video

Or update your ComfyUI to the latest version, then download the workflow below and drag it into ComfyUI to load the workflow

Model File Download

Diffusion Model

VAE

Text Encoder

ComfyUI/
├───📂 models/
│   ├───📂 diffusion_models/
│   │   ├───wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
│   │   └───wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
│   ├───📂 text_encoders/
│   │   └───umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   └───📂 vae/
│       └───wan_2.1_vae.safetensors

Detailed Operation Steps

Step Diagram

  1. Ensure the first Load Diffusion Model node loads the wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors model
  2. Ensure the second Load Diffusion Model node loads the wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors model
  3. Ensure the Load CLIP node loads the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
  4. Ensure the Load VAE node loads the wan_2.1_vae.safetensors model
  5. (Optional) In the EmptyHunyuanLatentVideo node you can adjust the output dimensions and the total frame count
  6. If you need to modify prompts (positive and negative), please modify them in the CLIP Text Encoder node numbered 6
  7. Click the Run button, or use the shortcut Ctrl(cmd) + Enter to execute video generation
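If you prefer scripting to clicking Run, a workflow exported via ComfyUI's "Export (API)" option can be queued through the built-in HTTP API. A minimal sketch follows; the node id "6" matches the CLIP Text Encoder node referenced in the steps above, and the default server address assumes a locally running instance.

```python
import json
import urllib.request

def set_prompt_text(workflow: dict, node_id: str, text: str) -> dict:
    """Return a copy of an API-format workflow with one node's text replaced."""
    wf = json.loads(json.dumps(workflow))  # cheap deep copy via JSON round-trip
    wf[node_id]["inputs"]["text"] = text
    return wf

def queue_prompt(workflow: dict, server: str = "127.0.0.1:8188") -> dict:
    """POST the workflow graph to ComfyUI's /prompt endpoint."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"http://{server}/prompt", data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Typical use: load the exported JSON, then `queue_prompt(set_prompt_text(wf, "6", "a fox running through snow"))`.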

3. Wan2.2 14B I2V Image-to-Video Workflow

Workflow Acquisition Method

Please update your ComfyUI to the latest version, and find "Wan2.2 14B I2V" through the menu Workflow -> Browse Templates -> Video to load the workflow

Or update your ComfyUI to the latest version, then download the workflow below and drag it into ComfyUI to load the workflow

You can use the following image as input Input Image

Model File Download

Diffusion Model

VAE

Text Encoder

ComfyUI/
├───📂 models/
│   ├───📂 diffusion_models/
│   │   ├───wan2.2_i2v_low_noise_14B_fp16.safetensors
│   │   └───wan2.2_i2v_high_noise_14B_fp16.safetensors
│   ├───📂 text_encoders/
│   │   └───umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   └───📂 vae/
│       └───wan_2.1_vae.safetensors

Detailed Operation Steps

Step Diagram

  1. Ensure the first Load Diffusion Model node loads the wan2.2_i2v_high_noise_14B_fp16.safetensors model
  2. Ensure the second Load Diffusion Model node loads the wan2.2_i2v_low_noise_14B_fp16.safetensors model
  3. Ensure the Load CLIP node loads the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
  4. Ensure the Load VAE node loads the wan_2.1_vae.safetensors model
  5. Upload the image as the starting frame in the Load Image node
  6. If you need to modify prompts (positive and negative), please modify them in the CLIP Text Encoder node numbered 6
  7. (Optional) In the EmptyHunyuanLatentVideo node you can adjust the output dimensions and the total frame count
  8. Click the Run button, or use the shortcut Ctrl(cmd) + Enter to execute video generation
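When adjusting the total frame count, Wan-family workflows expect lengths of the form 4n + 1 (e.g. the common default of 81 frames) because the VAE compresses time by roughly 4x, and the 14B models target 16 fps playback. Treat both figures as assumptions to verify against the node defaults; a small helper to reason about them:

```python
# Helpers relating frame count to clip duration for Wan-style video models.
# Assumptions: valid frame counts are 4n + 1 (temporal VAE compression of
# ~4x) and 16 fps playback for the 14B models; verify against node defaults.
def nearest_valid_length(frames: int) -> int:
    """Snap a desired frame count to the nearest 4n + 1 value."""
    return max(0, round((frames - 1) / 4)) * 4 + 1

def duration_seconds(frames: int, fps: int = 16) -> float:
    """Approximate clip duration at the given playback rate."""
    return frames / fps
```

For example, asking for 80 frames snaps to 81, which plays for about 5 seconds at 16 fps.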

4. Wan2.2 14B FLF2V First-Last Frame Video Generation Workflow

The first-last frame workflow uses exactly the same model files and locations as the I2V section

Workflow and Material Acquisition

Download the video or JSON format workflow below and open it in ComfyUI

Download the materials below as input

Input Materials (start and end frames)

Detailed Operation Steps

Step Diagram

  1. Upload the image to use as the starting frame in the first Load Image node
  2. Upload the image to use as the ending frame in the second Load Image node
  3. Modify the size settings on the WanFirstLastFrameToVideo node
    • The workflow defaults to a relatively small size to prevent low VRAM users from consuming too many resources
    • If you have sufficient VRAM, you can try around 720P size
  4. Write appropriate prompts according to your first-last frames
  5. Click the Run button, or use the shortcut Ctrl(cmd) + Enter to execute video generation
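When scaling the size up toward 720p, it helps to keep width and height on the grid the node accepts. A multiple of 16 is an assumption here (8x spatial VAE downscaling times a patch size of 2); check the width/height step on the WanFirstLastFrameToVideo node itself.

```python
# Snap a target dimension to the nearest accepted multiple. The multiple of
# 16 is an assumption (8x VAE downscale x patch size 2); verify against the
# width/height step shown on the WanFirstLastFrameToVideo node.
def snap_dim(target: int, multiple: int = 16) -> int:
    return max(multiple, round(target / multiple) * multiple)

if __name__ == "__main__":
    print(snap_dim(1280), snap_dim(720))  # a 720p-ish size on the 16 grid
```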

Wan2.2 Kijai WanVideoWrapper ComfyUI Workflow

⚠️

This content is being prepared and will be updated in the near future.

This part of the tutorial will introduce the convenient method using Kijai/ComfyUI-WanVideoWrapper.

Related model repository: https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled

Wan2.2 GGUF Quantized Version ComfyUI Workflow

⚠️

This content is being prepared and will be updated in the near future.

The GGUF version is suitable for users with limited VRAM, providing the following resources:

Related Custom Nodes: City96/ComfyUI-GGUF

Lightx2v 4steps LoRA Usage Instructions

⚠️

This content is being prepared and will be updated in the near future.

Lightx2v provides a fast generation optimization solution: