Content customization and composition in diffusion models
Oct 2024
US App. 18/913,107
We present a unified diffusion-based framework for content customization and composition in images. Unlike traditional approaches that require separate specialized models and expensive fine-tuning for tasks such as custom image generation, object insertion, and localized editing, our method performs all of these operations with a single model through novel training and inference strategies. Key innovations include generic content insertion into, and harmonization with, user-provided backgrounds; text- and image-conditioned editing and styling; attention-based blending for seamless integration; and consistent object style control, all without per-object fine-tuning at inference time. This enables scalable, efficient, and flexible image creation and editing.
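To make the idea of attention-based blending concrete, the sketch below shows one hypothetical way an object latent could be composited into a background latent using a soft mask derived from an attention map. The function name, tensor shapes, and temperature parameter are illustrative assumptions for this sketch and are not taken from the patent; the actual claimed method may differ substantially.

```python
import torch


def attention_blend(fg_latent: torch.Tensor,
                    bg_latent: torch.Tensor,
                    attn_map: torch.Tensor,
                    temperature: float = 0.1) -> torch.Tensor:
    """Illustrative sketch (not the claimed method): blend a generated object
    latent into a user-provided background latent with a soft mask derived
    from an attention map.

    Shapes: fg_latent, bg_latent -> (B, C, H, W); attn_map -> (B, 1, H, W).
    """
    # Center the attention map and squash it into a soft mask in [0, 1]:
    # high attention keeps the generated object, low attention keeps the background.
    centered = attn_map - attn_map.mean(dim=(2, 3), keepdim=True)
    mask = torch.sigmoid(centered / temperature)
    return mask * fg_latent + (1.0 - mask) * bg_latent


if __name__ == "__main__":
    # Toy usage with random tensors standing in for diffusion latents.
    fg = torch.randn(1, 4, 64, 64)    # latent of the generated/inserted object
    bg = torch.randn(1, 4, 64, 64)    # latent of the user-provided background
    attn = torch.rand(1, 1, 64, 64)   # per-pixel attention for the object
    blended = attention_blend(fg, bg, attn)
    print(blended.shape)              # torch.Size([1, 4, 64, 64])
```

In a diffusion pipeline, a blend of this kind would typically be applied in latent space at one or more denoising steps so the inserted content is harmonized with the background rather than pasted on top of it.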