PUSA V1.0: Low-Cost, High-Performance Video Generation Model Released

On July 16, 2025, PUSA V1.0 was officially released. Built on Wan2.1-T2V-14B, the model introduces Vectorized Timestep Adaptation (VTA). Compared with Wan-I2V-14B, it needs only about 1/2500 of the training data, 1/200 of the training cost, and 1/5 of the inference steps, yet it surpasses Wan-I2V-14B in performance.

What is PUSA V1.0?

PUSA V1.0 is an open-source AI model for video generation built around Vectorized Timestep Adaptation (VTA). Traditional video diffusion models use a single scalar timestep shared by every frame. PUSA instead gives each frame its own timestep, so the noise level can be controlled frame by frame. This yields higher generation quality and lets one model handle a wider range of tasks.
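
To make the contrast concrete, here is a minimal Python sketch of a shared scalar timestep versus per-frame timesteps. It is illustrative only and is not taken from the PUSA code: the function names, the toy frames stored as lists, and the linear blend between clean data and Gaussian noise are all assumptions.

```python
import random

def add_noise_vectorized(frames, timesteps, rng):
    """Noise frame i with its own level timesteps[i] in [0, 1]."""
    if len(frames) != len(timesteps):
        raise ValueError("need exactly one timestep per frame")
    noisy = []
    for frame, t in zip(frames, timesteps):
        # Assumed schedule: linear blend of clean data (t=0) and pure noise (t=1).
        noisy.append([(1.0 - t) * x + t * rng.gauss(0.0, 1.0) for x in frame])
    return noisy

def add_noise_scalar(frames, t, rng):
    """Conventional case: the same timestep t is applied to every frame."""
    return add_noise_vectorized(frames, [t] * len(frames), rng)

rng = random.Random(0)
video = [[0.5, -0.2, 0.1] for _ in range(4)]  # 4 toy "frames", 3 values each

# Scalar timestep: all four frames are equally noisy.
same_level = add_noise_scalar(video, 0.8, rng)

# Vector timestep: frame 0 stays clean, the remaining frames are noised.
mixed = add_noise_vectorized(video, [0.0, 0.8, 0.8, 0.8], rng)
```

With the vector version, frame 0 comes back unchanged while the other frames are perturbed, which is the kind of frame-level control a scalar timestep cannot express.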

Key Features and Innovations

Vectorized Timestep Adaptation (VTA): Breaks the limitation of scalar timesteps, enabling flexible frame-level control.
Highly Efficient: Fine-tuned on only 3,860 video samples for about $500 in training cost, and generates with about 1/5 of the inference steps.
Multi-Task Support: Supports image-to-video (I2V), keyframe generation, video completion, video extension, text-to-video (T2V), video transitions, and more.
Non-Destructive Fine-Tuning: Adds new features via LoRA fine-tuning while retaining all original model capabilities, ensuring strong compatibility.
Open Source: Model weights, training data, and both the inference and training code are fully open for community and industry research and application.
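
The non-destructive claim follows from how LoRA works in general: the base weight stays frozen and a low-rank update is added on top, so switching the update off recovers the original model. The sketch below shows that standard formulation on a toy matrix in plain Python. The matrix values, shapes, and the helper names `matmul` and `lora_weight` are made up for illustration, and the ranks and layers that PUSA actually adapts are not covered here.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(base, lora_a, lora_b, alpha, scale=1.0):
    """Return W + scale * (alpha / r) * B @ A without modifying the base W."""
    rank = len(lora_a)
    delta = matmul(lora_b, lora_a)
    k = scale * alpha / rank
    return [[w + k * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]

base = [[1.0, 0.0, 2.0, -1.0],
        [0.5, 1.5, 0.0, 0.0],
        [0.0, -2.0, 1.0, 3.0]]                      # frozen 3x4 base weight
lora_a = [[0.1, 0.2, 0.0, -0.1],
          [0.0, 0.3, 0.1, 0.0]]                     # 2x4 factor, rank r = 2
lora_b_init = [[0.0, 0.0] for _ in range(3)]        # 3x2 factor, zero at start
lora_b_trained = [[0.5, -0.5], [0.0, 1.0], [0.2, 0.0]]

at_init = lora_weight(base, lora_a, lora_b_init, alpha=2.0)
adapted = lora_weight(base, lora_a, lora_b_trained, alpha=2.0)
disabled = lora_weight(base, lora_a, lora_b_trained, alpha=2.0, scale=0.0)
```

Here `at_init` and `disabled` both equal `base`, while `adapted` differs. The original behaviour is therefore always one scale factor away, which is what makes this style of fine-tuning non-destructive.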

Comparison with Wan-I2V

PUSA V1.0 surpasses Wan-I2V-14B while using far less training data and compute. Wan-I2V supports only image-to-video, whereas PUSA V1.0 handles multiple tasks in one model and scores higher on the VBench-I2V evaluation (87.32% vs. 86.86%).
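
As a back-of-the-envelope check, the figures quoted in this article (3,860 samples, about $500, the 1/2500 and 1/200 ratios, and the two VBench-I2V scores) can be combined in a few lines of Python. The baseline numbers it prints are only what those ratios imply, assuming the ratios are measured against Wan-I2V-14B. They are not reported measurements.

```python
# Figures quoted in the article.
pusa_samples = 3_860
pusa_cost_usd = 500
data_ratio = 2_500      # PUSA uses about 1/2500 of the data
cost_ratio = 200        # PUSA uses about 1/200 of the training cost

# Baseline scale implied by the ratios (a derived estimate, not a reported value).
implied_baseline_samples = pusa_samples * data_ratio
implied_baseline_cost_usd = pusa_cost_usd * cost_ratio

# VBench-I2V scores in percent.
pusa_score, wan_score = 87.32, 86.86
gap_points = round(pusa_score - wan_score, 2)

print(f"implied baseline samples: {implied_baseline_samples:,}")   # 9,650,000
print(f"implied baseline cost: ${implied_baseline_cost_usd:,}")    # $100,000
print(f"VBench-I2V gap: {gap_points} percentage points")           # 0.46
```

The score gap is small in absolute terms, so the headline result is less the margin itself than reaching it at a fraction of the data and cost.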

Application Scenarios

AI Creative Video Generation: Quickly generate high-quality short videos from an image or text.
Video Completion and Extension: Complete or extend existing videos, including keyframe completion.
Multi-Frame Keyframe Interpolation: Generate smooth video transitions from multiple keyframes.
Education, Entertainment, Advertising: Provides efficient video generation tools for creators, educators, and advertisers.
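
One way to picture how a single model can cover these scenarios is to describe each task by which frames are already given and which must be generated. The Python sketch below builds a per-frame timestep list on that basis. It is a hypothetical illustration, not PUSA's API: the function names are invented, and pinning given frames to exactly 0.0 is an assumption made to keep the example simple.

```python
def frame_timesteps(num_frames, t, conditioned):
    """One timestep per frame: 0.0 for given frames, t for frames to generate."""
    fixed = set(conditioned)
    bad = sorted(i for i in fixed if not 0 <= i < num_frames)
    if bad:
        raise IndexError(f"frame indices out of range: {bad}")
    return [0.0 if i in fixed else t for i in range(num_frames)]

def text_to_video(num_frames, t):
    # No frames are given, so every frame is generated.
    return frame_timesteps(num_frames, t, [])

def image_to_video(num_frames, t):
    # The first frame is the input image.
    return frame_timesteps(num_frames, t, [0])

def start_end_frames(num_frames, t):
    # First and last frames are given; the middle is filled in.
    return frame_timesteps(num_frames, t, [0, num_frames - 1])

def video_extension(num_frames, t, num_given):
    # The first num_given frames come from an existing clip.
    return frame_timesteps(num_frames, t, range(num_given))

print(image_to_video(5, 0.9))      # [0.0, 0.9, 0.9, 0.9, 0.9]
print(start_end_frames(5, 0.9))    # [0.0, 0.9, 0.9, 0.9, 0.0]
print(video_extension(5, 0.9, 3))  # [0.0, 0.0, 0.0, 0.9, 0.9]
```

Multi-keyframe interpolation fits the same pattern: pass the keyframe positions as `conditioned` and the frames between them are the ones that get generated.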

Visual Demos

The animated examples for this section come from PUSA V0.5. V1.0 further improves multi-task capabilities and generation quality.

The release of PUSA V1.0 makes video generation technology more accessible and efficient. Its innovative VTA method not only improves quality but also greatly lowers the barrier for development and application.
