Wan2.2 Fun Control ComfyUI Workflow Complete Usage Guide, Official + Community Versions (Kijai, GGUF)
This tutorial covers the different ways to run the Wan2.2 Fun Control video control generation model in ComfyUI. Wan2.2 Fun Control is a new-generation video generation and control model launched by Alibaba Cloud. It introduces a Control Codes mechanism and combines it with deep learning and multimodal conditional input to generate high-quality videos that follow preset control conditions.
Versions and Content Covered in This Tutorial
Completed Versions:
- ✅ ComfyUI Official Native Version - Complete workflow provided in the official ComfyOrg documentation
- ✅ Wan2.2 Fun Control 14B Video Control Version - High-quality multimodal control video generation
Versions in Preparation:
- 🔄 Kijai WanVideoWrapper Version - Community-developed convenient wrapper
- 🔄 GGUF Quantized Version - Optimized version for low-configuration devices
Model Technical Features
Wan2.2 Fun Control is based on the Wan2.2 architecture and has been specifically optimized for video control generation, with the following core features:
Core Advantages:
- Multimodal Control: Supports multiple control conditions, including Canny (line drawing), Depth (depth), OpenPose (human pose), MLSD (geometric edges), etc., while also supporting trajectory control
- High-Quality Video Generation: Based on the Wan2.2 architecture, outputting cinema-level quality videos
- Multilingual Support: Supports multilingual prompt input including Chinese and English
- Multi-Resolution Support: Supports generating videos at resolutions such as 512×512, 768×768, 1024×1024, adapting to different scenario requirements
Open Source License Description
The Wan2.2 Fun Control series models are released under the Apache 2.0 open source license, which supports commercial use. The Apache 2.0 license allows you to freely use, modify, and distribute these models, including for commercial purposes, as long as you retain the original copyright notice and license text.
Wan2.2 Fun Control Open Source Model Version Overview
Model Type | Model Name | Parameters | Main Function | Model Repository |
---|---|---|---|---|
Video Control | Wan2.2-Fun-A14B-Control | 14B | Supports different control conditions such as Canny, Depth, Pose, MLSD, etc., while also supporting trajectory control | 🤗 Wan2.2-Fun-A14B-Control |
Related Code Repositories
- VideoX-Fun GitHub Repository - Official complete implementation code
- Wan2.2 Fun Control Official Documentation - Detailed model description and usage guide
Wan2.2 Fun Control ComfyUI Official Native Version Workflow Usage Guide
Version Description
The ComfyUI official native version is provided by the ComfyOrg team, using repackaged model files to ensure optimal compatibility with ComfyUI. This version supports both standard mode and Lightx2v 4-step LoRA acceleration mode.
Performance Comparison Test
Below are the test results on an RTX 4090D GPU (24GB VRAM) at 640*640 resolution with a length of 81 frames:
Model Type | Resolution | VRAM Usage | First Generation Time | Second Generation Time |
---|---|---|---|---|
fp8_scaled | 640×640 | 83% | ≈ 524 seconds | ≈ 520 seconds |
fp8_scaled + 4-step LoRA acceleration | 640×640 | 89% | ≈ 138 seconds | ≈ 79 seconds |
The 4-step LoRA makes generation much faster, which gives first-time users of the workflow a better experience, but it may reduce motion in the generated video. The accelerated LoRA version is enabled by default. To use the other set of nodes (without the LoRA) instead, select them and press Ctrl+B to enable them.
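To put the table's numbers in perspective, the short sketch below derives the speedup of the 4-step LoRA mode and the approximate VRAM footprint. It only reuses the figures from the table; the one assumption is that the VRAM percentages refer to the card's full 24 GB.

```python
# Figures copied from the performance table above (640x640, 81 frames).
TIMINGS_S = {
    "fp8_scaled": {"first": 524, "second": 520},
    "fp8_scaled + 4-step LoRA": {"first": 138, "second": 79},
}
VRAM_USAGE = {"fp8_scaled": 0.83, "fp8_scaled + 4-step LoRA": 0.89}
TOTAL_VRAM_GB = 24  # assumption: percentages refer to the card's full 24 GB
FRAMES = 81


def speedup(run: str) -> float:
    """How many times faster the LoRA setup is for the given run."""
    return TIMINGS_S["fp8_scaled"][run] / TIMINGS_S["fp8_scaled + 4-step LoRA"][run]


def vram_gb(config: str) -> float:
    """Approximate VRAM in GB implied by the usage percentage."""
    return VRAM_USAGE[config] * TOTAL_VRAM_GB


if __name__ == "__main__":
    for run in ("first", "second"):
        print(f"{run} run: {speedup(run):.1f}x faster with the 4-step LoRA")
    for config in TIMINGS_S:
        per_frame = TIMINGS_S[config]["second"] / FRAMES
        print(f"{config}: ~{vram_gb(config):.1f} GB VRAM, {per_frame:.2f} s per frame")
```

With the table's numbers this works out to roughly 3.8 times faster on the first run and 6.6 times faster on the second.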
1. Wan2.2 Fun Control Video Control Generation ComfyUI Workflow
Workflow Acquisition Method
Download the video or JSON file below and drag it into ComfyUI to load the corresponding workflow
Download JSON Format Workflow
Please download the images and videos below, which we will use as input.
Here we use a preprocessed video that can be directly used for control video generation
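Before running a downloaded workflow, it can help to see which model files it expects. The following is a minimal sketch, assuming the JSON uses the usual ComfyUI export layout with a top-level "nodes" list whose entries carry "type" and "widgets_values". The file name in the usage line is hypothetical, and nodes nested inside subgraphs are not covered.

```python
import json
from pathlib import Path


def load_workflow(path):
    """Read a workflow JSON file exported from ComfyUI."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def referenced_model_files(workflow: dict) -> dict:
    """Map each node type to the .safetensors names found in its widget values."""
    found = {}
    for node in workflow.get("nodes", []):
        values = node.get("widgets_values") or []
        if isinstance(values, dict):  # some nodes store widget values as a mapping
            values = list(values.values())
        files = [v for v in values
                 if isinstance(v, str) and v.endswith(".safetensors")]
        if files:
            found.setdefault(node.get("type", "?"), []).extend(files)
    return found


if __name__ == "__main__":
    # Hypothetical file name; use the JSON you downloaded.
    wf_path = Path("wan2.2_fun_control_workflow.json")
    if wf_path.is_file():
        for node_type, files in referenced_model_files(load_workflow(wf_path)).items():
            print(node_type, "->", ", ".join(files))
```

The printed names can then be compared against the files listed in the model download section.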
Model File Download
You can find the following models in Wan_2.2_ComfyUI_Repackaged
Diffusion Model
- wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors
- wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors
Wan2.2-Lightning LoRA (Optional, for acceleration)
- wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
- wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
VAE
- wan_2.1_vae.safetensors
Text Encoder
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
ComfyUI/
├───📂 models/
│   ├───📂 diffusion_models/
│   │   ├─── wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors
│   │   └─── wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors
│   ├───📂 loras/
│   │   ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│   │   └─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│   ├───📂 text_encoders/
│   │   └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   └───📂 vae/
│       └─── wan_2.1_vae.safetensors
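A quick way to confirm that every file landed in the right folder is a small check script. This is a sketch that mirrors the directory layout above; the ComfyUI root path passed in the usage line is an assumption and should point at your own installation.

```python
from pathlib import Path

# Expected files per models/ subfolder, taken from the layout above.
EXPECTED = {
    "diffusion_models": [
        "wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors",
        "wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors",
    ],
    "loras": [
        "wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors",
        "wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors",
    ],
    "text_encoders": ["umt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "vae": ["wan_2.1_vae.safetensors"],
}


def missing_models(comfyui_root) -> list:
    """Return the expected model paths (relative to the root) that do not exist."""
    root = Path(comfyui_root)
    missing = []
    for subfolder, names in EXPECTED.items():
        for name in names:
            path = root / "models" / subfolder / name
            if not path.is_file():
                missing.append(path.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Assumed install location; replace with your ComfyUI directory.
    for rel_path in missing_models("ComfyUI"):
        print("missing:", rel_path)
```

An empty result means all six files are where the workflow expects them. The two LoRA files are only needed for the accelerated mode.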
Detailed Operation Steps
This workflow uses LoRA, so the diffusion model and the LoRA must match: pair the high noise model with the high noise LoRA, and the low noise model with the low noise LoRA.
- High noise model and LoRA loading
  - Ensure the Load Diffusion Model node loads the wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors model
  - Ensure the LoraLoaderModelOnly node loads the wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors LoRA
- Low noise model and LoRA loading
  - Ensure the Load Diffusion Model node loads the wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors model
  - Ensure the LoraLoaderModelOnly node loads the wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors LoRA
- Ensure the Load CLIP node loads the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
- Ensure the Load VAE node loads the wan_2.1_vae.safetensors model
- Upload the starting frame in the Load Image node
- In the second Load video node, load the control (pose) video. The provided video has been preprocessed and can be used directly
- Because the provided video is already a preprocessed pose video, the corresponding video preprocessing nodes need to be disabled. Select them and press Ctrl + B to disable them
- Modify the prompt; both Chinese and English are supported
- In Wan22FunControlToVideo, set the video size. The default is 640*640 so that the workflow does not take excessively long for low-VRAM users
- Click the Run button, or use the shortcut Ctrl(cmd) + Enter, to start video generation
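The pairing rule for the high noise and low noise files can be expressed as a simple filename check. The sketch below is illustrative only: it relies on the "high_noise" / "low_noise" naming used by the files in this tutorial, and the helper names are hypothetical rather than part of ComfyUI.

```python
from typing import Optional


def noise_level(filename: str) -> Optional[str]:
    """Return "high" or "low" based on the file name, or None if neither appears."""
    if "high_noise" in filename:
        return "high"
    if "low_noise" in filename:
        return "low"
    return None


def is_matching_pair(diffusion_model: str, lora: str) -> bool:
    """True when the diffusion model and LoRA target the same noise level."""
    level = noise_level(diffusion_model)
    return level is not None and level == noise_level(lora)


if __name__ == "__main__":
    pairs = [
        ("wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors",
         "wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors"),
        ("wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors",
         "wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors"),  # mismatched on purpose
    ]
    for model, lora in pairs:
        status = "ok" if is_matching_pair(model, lora) else "MISMATCH"
        print(f"{status}: {model} + {lora}")
```

Running it flags the second pair, where a low noise model is combined with a high noise LoRA.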
Additional Notes
Among ComfyUI's built-in nodes, the only preprocessor available is Canny. For other types of image preprocessing, you can use a custom node pack such as ComfyUI-comfyui_controlnet_aux.
Wan2.2 Fun Control Kijai WanVideoWrapper ComfyUI Workflow
This content is being prepared and will be updated soon.
This part of the tutorial will introduce the convenient method using Kijai/ComfyUI-WanVideoWrapper.
Related model repository: https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled
Wan2.2 Fun Control GGUF Quantized Version ComfyUI Workflow
This content is being prepared and will be updated soon.
The GGUF version is suitable for users with limited VRAM, providing the following resources:
QuantStack/Wan2.2-Fun-A14B-Control-GGUF
Related Custom Nodes: City96/ComfyUI-GGUF