METHOD AND SYSTEM FOR PERSONALIZED MULTI-SUBJECT TEXT TO IMAGE GENERATION

    公开(公告)号:US20250166134A1

    公开(公告)日:2025-05-22

    申请号:US18903256

    申请日:2024-10-01

    Abstract: Text-to-image models are used to generate images based on text prompts. Existing text-to-image models create images that are often unclear and exhibit hybrid characteristics of multiple subjects i.e., each subject present in image exhibit characteristic of multiple subjects. Present disclosure provides a method and a system for personalized multi-subject text to image generation. The system first fine-tunes existing text-to-image diffusion model using a plurality of images of target subjects. Then, the system performs image generation based on local text prompts using the fine-tuned text-to-image diffusion model. In particular, the fine-tuned text-to-image diffusion model uses a composite diffusion algorithm for generating subject images. Thereafter, the system model computes a subject aware segmentation loss for generated images which is then used to correct the subject appearance in the generated images. Finally, the system applies a global diffuser to the generated images to create a harmonized image based on a global text prompt.

Patent Agency Ranking