
InfiniteTalk Open Source Release - Audio-Driven Video Generation with Unlimited Length Support

InfiniteTalk Demo

The MeiGen-AI team has recently open-sourced the InfiniteTalk model, an innovative project that enables audio-driven video generation with unlimited length support. This technology not only achieves precise lip-sync but also maintains stable body movements and facial expressions, marking a significant breakthrough in digital human technology.

Key Features

InfiniteTalk employs a sparse-frame video dubbing framework. Compared to traditional methods that focus solely on lip-sync, this technology offers several notable advantages:

  • Precise Lip-Sync: Accurate mouth shape matching with audio
  • Unlimited Length Generation: Support for ultra-long video content generation
  • Full-Body Motion Sync: Synchronization of head, body, and facial expressions in addition to lips
  • Stable Identity Preservation: Maintaining character identity consistency during long-duration generation
  • Multi-Scenario Support: Compatible with both image-to-video and video-to-video conversion

Core Functionality

Audio-Driven Video Generation

InfiniteTalk can generate video content synchronized with input audio files. Whether it’s speech or singing, it produces natural lip-sync effects.

Unlimited Length Support

This technology breaks through traditional video generation length limitations, theoretically enabling the creation of videos of any length. It’s particularly suitable for producing long-duration digital human explanation videos.

Multi-Resolution Support

The model supports both 480P and 720P resolutions, allowing users to choose the appropriate output quality based on their needs.

Technical Architecture

InfiniteTalk is built upon the Wan2.1 model and uses an innovative sparse-frame processing approach for efficient video generation. The model employs a context-window mechanism, with a default window of 81 frames; this windowing is the key technique that enables unlimited-length generation.
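To make the windowing idea concrete, here is a minimal illustrative sketch (not the actual InfiniteTalk API; the function name, overlap value, and scheduling logic are assumptions) of how a fixed 81-frame context window can cover a video of arbitrary length: the window slides over the timeline, and consecutive chunks share a few frames so motion and identity stay continuous across chunk boundaries.

```python
# Illustrative sketch only -- not InfiniteTalk's real code.
# Assumed: a fixed 81-frame window and a hypothetical 16-frame overlap
# carried forward as context between consecutive chunks.

def chunk_schedule(total_frames: int, window: int = 81, overlap: int = 16):
    """Return (start, end) frame-index pairs covering `total_frames`.

    Consecutive chunks overlap by `overlap` frames, so the tail of one
    generated chunk can condition the head of the next.
    """
    if total_frames <= window:
        return [(0, total_frames)]
    chunks = []
    start = 0
    step = window - overlap  # new frames produced per chunk after the first
    while start + window < total_frames:
        chunks.append((start, start + window))
        start += step
    # Final chunk is aligned to the end so every frame is covered.
    chunks.append((total_frames - window, total_frames))
    return chunks

# A 10-second clip at 25 fps (250 frames) is covered by four 81-frame windows:
print(chunk_schedule(250))  # [(0, 81), (65, 146), (130, 211), (169, 250)]
```

Because each chunk is a fixed size regardless of total length, memory use stays constant while the schedule extends to any duration, which is why this style of windowing supports "unlimited" generation in principle.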

Open Source Information

The InfiniteTalk project has been open-sourced on GitHub under the Apache 2.0 license. The project includes complete model weights, code implementation, and documentation, providing researchers and developers with a comprehensive solution.