Alibaba’s Wan2.1 Video Generation Model Officially Open-Sourced

On February 25, 2025, Alibaba announced that it had officially open-sourced Wan2.1, its latest video generation model. According to the announcement, the model outperforms existing open-source models, and its lightweight version significantly lowers the barrier to entry by requiring only about 8GB (8.19GB) of video memory.

Key Highlights

Wan2.1 has achieved significant technological breakthroughs in multiple areas:

1. Exceptional Performance and Low Resource Requirements

  • Ranked first on the VBench leaderboard with a total score of 86.22%, surpassing models like Sora (84.28%) and Luma (83.61%)
  • The lightweight T2V-1.3B version requires only 8.19GB of video memory, making it possible to run on consumer-grade graphics cards
  • Supports the generation of 8K resolution videos with details reaching cinematic standards

2. Comprehensive Functionality Support

  • Supports multiple tasks such as text-to-video (T2V), image-to-video (I2V), and video editing
  • The first video model to support bilingual (Chinese and English) text effect generation, including dynamic subtitles and artistic fonts
  • Adds video-to-audio (V2A) functionality, achieving synchronized audio and video generation

3. Innovative Technical Architecture

  • Trained using the linear noise trajectory Flow Matching paradigm
  • The Wan-VAE encoder can handle videos of any length at 1080P resolution
  • The 3D causal convolution module enhances physical simulation capabilities
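To make the "linear noise trajectory" idea concrete, here is a minimal sketch of a Flow Matching training target in plain Python. It assumes the common convention where t=0 is pure Gaussian noise and t=1 is the data sample; the article does not spell out Wan2.1's exact formulation, so the function names (`linear_path`, `target_velocity`, `flow_matching_loss`) and the toy list-of-floats "latent" are illustrative assumptions, not the model's actual training code.

```python
import random


def linear_path(x0, x1, t):
    """Point on the straight line between noise x0 (t=0) and data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of that straight path, d(x_t)/dt = x1 - x0 (constant in t)."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x1, rng):
    """Mean squared error between predicted and target velocity for one sample.

    predict_velocity is a hypothetical stand-in for the network: it takes
    (x_t, t) and returns a list the same length as x_t.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise sample
    t = rng.random()                        # timestep drawn from [0, 1)
    xt = linear_path(x0, x1, t)
    v_target = target_velocity(x0, x1)
    v_pred = predict_velocity(xt, t)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(x1)
```

Because the path is a straight line, the regression target does not depend on t, which is what makes the trajectory "linear": the network only has to learn the direction from noise to data at each intermediate point.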

Version Selection and Hardware Requirements

Wan2.1 offers two versions to cater to different scenarios:

  1. Speed Edition (1.3B)

    • Requires only 8.19GB of video memory
    • Suitable for individual developers
    • Generates a 5-second 480P video in approximately 4 minutes on an RTX 4090
  2. Professional Edition (14B)

    • Supports 720P professional-level rendering
    • Suitable for film and television industry applications
    • Offers a richer set of special effects interfaces
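As a rough way to check a machine against the 8.19GB figure quoted for the 1.3B version, the sketch below reads each GPU's total memory from `nvidia-smi` and compares it to that threshold. Two assumptions are worth flagging: the article's "GB" is treated here as GiB (1024 MiB), and total memory is used as a proxy for what is actually free at run time. The helper names are hypothetical, and no threshold is included for the 14B version because the article does not state one.

```python
import subprocess

# Figure quoted in the article for the 1.3B (Speed Edition) model.
SPEED_EDITION_VRAM_GB = 8.19


def parse_vram_mib(nvidia_smi_output):
    """Parse one total-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def meets_requirement(total_mib, required_gb=SPEED_EDITION_VRAM_GB):
    """True if the card's total memory reaches the required figure.

    Assumption: the article's "GB" is interpreted as GiB (1024 MiB).
    """
    return total_mib / 1024 >= required_gb


def query_gpus():
    """Return total memory (MiB) for each GPU; needs an NVIDIA driver installed."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram_mib(result.stdout)
```

On a machine with an NVIDIA GPU, `[meets_requirement(m) for m in query_gpus()]` gives one boolean per card. Under the GiB assumption, a card reporting exactly 8192 MiB falls just short of 8.19, so treat borderline results with caution.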

Open-Source Resource Acquisition

All models are now available for download on the Hugging Face and ModelScope platforms.
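For readers who want to fetch the weights from Hugging Face programmatically, here is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. The repository IDs under the `Wan-AI` organization are not listed in this article, so treat them as assumptions to verify on the hub before downloading; `repo_for` and `download` are hypothetical helper names.

```python
# Assumed repository IDs for the two text-to-video versions; confirm them
# on Hugging Face before use, since the article does not list them.
REPO_IDS = {
    "1.3B": "Wan-AI/Wan2.1-T2V-1.3B",
    "14B": "Wan-AI/Wan2.1-T2V-14B",
}


def repo_for(version):
    """Map a version label from the article ("1.3B" or "14B") to a repo ID."""
    try:
        return REPO_IDS[version]
    except KeyError:
        raise ValueError(f"unknown version {version!r}; expected one of {sorted(REPO_IDS)}")


def download(version, local_dir=None):
    """Download every file in the repository and return the local path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    repo_id = repo_for(version)
    target = local_dir or "./" + repo_id.split("/")[-1]
    return snapshot_download(repo_id=repo_id, local_dir=target)
```

Calling `download("1.3B")` would place the files in `./Wan2.1-T2V-1.3B`. The 14B checkpoint is considerably larger, so check available disk space first.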

Application Scenarios

Wan2.1 can be applied to a broad range of scenarios, primarily including:

Personal Creation

  • Short video content generation
  • Artistic creation assistance
  • Image animation

Professional Production

  • Film and television special effects production
  • Advertising creative design
  • Educational resource production

Industrial Applications

  • Product demonstration animation
  • Architectural visualization
  • Industrial process visualization

Future Prospects

The open-sourcing of Wan2.1 brings new opportunities to AI video creation. Its low hardware requirements in particular allow more individual developers and small teams to take part in AI video generation. This should not only help the technology spread but also drive innovation across the industry.