Boogu-Image-0.1-Edit: Open-Source Unified Image Editing under Apache 2.0
Boogu-Image-0.1-Edit is an Apache 2.0-licensed image editing model from the Boogu-Image family. It offers instruction-based image editing built on a unified multimodal understanding and generation architecture.
Overview
Boogu-Image-0.1 is a competitive open-source family of unified image generation and editing models developed by the Boogu project. The family includes three main variants, all released under the Apache 2.0 license: Base (text-to-image), Turbo (4-step distilled fast generation), and Edit (image-to-image editing).
The Edit variant specifically focuses on instruction-based image editing: users provide a reference image along with a natural language instruction describing the desired edit, and the model generates the edited result while preserving the original image's structure and content.
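The two inputs described above (a reference image and a natural language instruction) can be pictured as a small request object. This is a minimal sketch, not part of the Boogu codebase: `EditRequest` and `SUPPORTED_SUFFIXES` are hypothetical names, and the list of accepted file types is an assumption made for illustration.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumption for illustration only: the source does not list accepted formats.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


@dataclass(frozen=True)
class EditRequest:
    """Hypothetical container for one instruction-based edit."""

    image_path: Path   # the reference image to be edited
    instruction: str   # natural language description of the desired edit

    def __post_init__(self) -> None:
        # Reject empty or whitespace-only instructions early.
        if not self.instruction.strip():
            raise ValueError("instruction must not be empty")
        # Reject file types outside the assumed set above.
        if self.image_path.suffix.lower() not in SUPPORTED_SUFFIXES:
            raise ValueError(f"unsupported image type: {self.image_path.suffix}")


# Example: an object-replacement edit expressed as plain text.
request = EditRequest(Path("cat.png"), "Replace the red ball with a blue cube")
```

Validating the pair up front keeps malformed requests away from the model call, whichever interface (Diffusers or ComfyUI) ends up running the edit.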
Boogu-Image-0.1 achieved competitive results in the Boogu Arena (an LM Arena-style preference evaluation), ranking favorably against both closed-source systems and leading open-source alternatives across 1K+ diverse test prompts.
Key Features
| Feature | Description |
|---|---|
| Task | Instruction-based image-to-image editing |
| Architecture | Unified MLLM understanding + diffusion generation |
| License | Apache 2.0 (fully open-source) |
| Library | Diffusers (custom BooguImagePipeline) |
| Languages | Optimized for English and Chinese |
| ComfyUI | Native support |
Model Architecture
Boogu-Image-0.1 employs a unified multimodal understanding and generation architecture that integrates:
- A multimodal large language model (MLLM) for understanding user instructions and image content
- A diffusion transformer for high-quality image generation
- A VAE for latent space encoding/decoding
This unified approach allows the model to achieve precise instruction following while maintaining high image quality. The Edit variant specifically leverages the MLLM's understanding of spatial relationships, object attributes, and editing instructions to produce coherent modifications.
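The interplay of the three components can be illustrated with a toy data-flow sketch. Everything here is a stand-in rather than Boogu's code: `ToyVAE`, `toy_mllm_condition`, `toy_denoise_step`, and `toy_edit` are hypothetical names, and the arithmetic is a placeholder chosen only so the script runs with the standard library. The sketch assumes the common latent-diffusion editing pattern (encode the reference, denoise a noisy latent under conditioning, decode); the text above does not confirm how Boogu wires its components together.

```python
import random
from dataclasses import dataclass
from typing import List

Latent = List[float]


@dataclass
class ToyVAE:
    """Stand-in for the VAE: maps pixels to a smaller latent and back."""

    factor: int = 2  # hypothetical downsampling factor

    def encode(self, pixels: List[float]) -> Latent:
        # Average each group of `factor` pixels into one latent value.
        f = self.factor
        return [sum(pixels[i:i + f]) / f for i in range(0, len(pixels), f)]

    def decode(self, latent: Latent) -> List[float]:
        # Repeat each latent value `factor` times to restore pixel length.
        return [v for v in latent for _ in range(self.factor)]


def toy_mllm_condition(instruction: str, ref_latent: Latent) -> Latent:
    """Stand-in for the MLLM: fuses instruction and image into a target signal."""
    # Placeholder logic: shift the reference by an amount derived from the text.
    shift = (len(instruction) % 5) / 10.0
    return [v + shift for v in ref_latent]


def toy_denoise_step(latent: Latent, cond: Latent, step_size: float) -> Latent:
    """Stand-in for one diffusion transformer step: move toward the condition."""
    return [x + step_size * (c - x) for x, c in zip(latent, cond)]


def toy_edit(pixels: List[float], instruction: str,
             steps: int = 4, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    vae = ToyVAE()
    ref_latent = vae.encode(pixels)                      # VAE: image -> latent
    cond = toy_mllm_condition(instruction, ref_latent)   # MLLM: conditioning
    latent = [rng.gauss(0.0, 1.0) for _ in ref_latent]   # start from noise
    for _ in range(steps):                               # iterative denoising
        latent = toy_denoise_step(latent, cond, step_size=0.5)
    return vae.decode(latent)                            # VAE: latent -> image


edited = toy_edit([0.0, 0.2, 0.4, 0.6], "make the sky red")
```

The point of the sketch is the ordering of the stages, not the numbers: the instruction and reference image are interpreted once, and the result steers every denoising step before the final decode.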
Capabilities
Boogu-Image-0.1-Edit excels at a variety of image editing tasks:
- Object replacement: Swap objects in an image based on text descriptions
- Background changes: Modify backgrounds while preserving foreground subjects
- Style transfer: Apply artistic styles to existing images
- Local edits: Modify specific regions guided by text instructions
- Bilingual support: Handles both English and Chinese editing instructions
ComfyUI Integration
Boogu-Image-0.1-Edit is natively supported in ComfyUI. Get started quickly with the official Boogu Image Edit workflow.
Make sure ComfyUI is updated to the latest version (see the update guide). The required model weights are available in the Comfy-Org/Boogu-Image repository on Hugging Face.
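One way to fetch those weights is with the `huggingface_hub` client. This is a hedged sketch: `download_comfy_weights` and the `boogu_weights` target folder are hypothetical names, and which files belong in which ComfyUI model folder is not stated here, so the function only downloads a snapshot of the repository and leaves placement to the workflow's own instructions.

```python
from pathlib import Path

# Repository named in the text above.
COMFY_REPO_ID = "Comfy-Org/Boogu-Image"


def download_comfy_weights(target_dir: str = "boogu_weights") -> Path:
    """Download a snapshot of the repository into `target_dir` (hypothetical default)."""
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=COMFY_REPO_ID, local_dir=target_dir)
    return Path(local_path)


# Usage (requires network access and `pip install huggingface_hub`):
#   weights_dir = download_comfy_weights()
#   print(sorted(p.name for p in weights_dir.iterdir()))
```

A full snapshot may be large; `snapshot_download` also accepts an `allow_patterns` argument if only some files are needed.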
Online Demos
You can try Boogu-Image-0.1-Edit directly in your browser:
- Edit Demo: demo-edit.boogu.org
- Base Demo: demo-base.boogu.org
- Turbo Demo: demo-turbo.boogu.org
Availability
- Hugging Face (Edit): Boogu/Boogu-Image-0.1-Edit
- Hugging Face (Base): Boogu/Boogu-Image-0.1-Base
- GitHub: boogu-project/Boogu-Image
- Project Page: boogu.org
- Gallery: boogu-gallery.netlify.app
Summary
Boogu-Image-0.1-Edit brings competitive instruction-based image editing to the open-source community under a permissive Apache 2.0 license. With its unified MLLM architecture, strong bilingual support, and out-of-the-box ComfyUI integration, it represents a significant step forward for open-source image editing tools.