-
Publication No.: US20250028911A1
Publication Date: 2025-01-23
Application No.: US18355573
Application Date: 2023-07-20
Applicant: ADOBE INC.
Inventor: Akshay Ganesh Iyer, Nikunj Goyal, Kanad Shrikar Pardeshi, Pranamya Prashant Kulkarni, Abhilasha Sancheti, Praneetha Vaddamanu, Aparna Garimella, Apoorv Umang Saxena, Vishwa Vinay
IPC: G06F40/40, G06V10/22, G06V10/44, G06V10/764, G06V10/774, G06V10/82
Abstract: One or more aspects of the method, apparatus, and non-transitory computer readable medium include obtaining an image and a detail level, wherein the detail level comprises a value indicating a level of detail for a description of the image. One or more aspects of the method, apparatus, and non-transitory computer readable medium further include identifying a set of regions for the image based on the detail level using a machine learning model, and generating a description for the image based on the set of regions, wherein an amount of detail in the description is based on the detail level.
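Illustrative sketch (not taken from the application): the abstract describes a detail level that controls which regions are identified and how much detail the description contains. The Python below is a minimal stand-in under stated assumptions. The names `select_regions` and `describe`, the `max_level` scale, the proportional cut-off rule, and the precomputed `(label, score)` pairs are all hypothetical; the application uses a machine learning model to identify regions, which is replaced here by a simple ranking.

```python
import math


def select_regions(regions, detail_level, max_level=5):
    """Keep more regions as detail_level rises (hypothetical rule).

    regions: list of (label, score) pairs; the scores stand in for the
    output of a region-identification model, which is not implemented here.
    """
    if not 1 <= detail_level <= max_level:
        raise ValueError("detail_level must be between 1 and max_level")
    # Rank regions from most to least salient.
    ranked = sorted(regions, key=lambda r: r[1], reverse=True)
    # Assumption: the share of regions kept grows linearly with the level.
    k = max(1, math.ceil(len(ranked) * detail_level / max_level))
    return ranked[:k]


def describe(regions, detail_level, max_level=5):
    """Build a template description from the selected regions."""
    chosen = select_regions(regions, detail_level, max_level)
    return "Image shows: " + ", ".join(label for label, _ in chosen) + "."


if __name__ == "__main__":
    demo = [("dog", 0.9), ("ball", 0.6), ("grass", 0.3),
            ("fence", 0.2), ("cloud", 0.1)]
    print(describe(demo, 1))
    print(describe(demo, 3))
```

With the sample data, a detail level of 1 keeps only the top-ranked region and a level of 3 keeps the top three, so the description grows with the detail level.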
-
Publication No.: US20240428468A1
Publication Date: 2024-12-26
Application No.: US18337634
Application Date: 2023-06-20
Applicant: Adobe Inc.
Inventor: Aishwarya Agarwal, Srikrishna Karanam, Joseph Koonthanam Jose, Apoorv Umang Saxena, Koustava Goswami, Balaji Vasan Srinivasan
IPC: G06T11/00, G06N3/0455
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize attention segregation loss and/or attention retention loss at inference time of a diffusion neural network to generate a text-conditioned image. In particular, in some embodiments, the disclosed systems utilize the attention segregation loss to reduce overlap between concepts by comparing attention maps for multiple concepts of a text query corresponding to a denoising step. Further, in some embodiments, the disclosed systems utilize the attention retention loss to improve information retention for concepts across denoising steps by comparing attention maps between different denoising steps. Accordingly, in some embodiments, by utilizing the attention segregation loss and the attention retention loss, the disclosed systems accurately maintain multiple concepts from a text query when generating a text-conditioned image.
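Illustrative sketch (not taken from the application): the abstract describes one loss that compares attention maps of different concepts at the same denoising step and another that compares a concept's attention maps across steps. The Python below shows one possible way to express both ideas with a soft intersection-over-union measure on flattened, non-negative attention maps. The function names, the choice of soft IoU, and the averaging are assumptions for illustration; the formulation claimed in the application may differ.

```python
def overlap(a, b):
    """Soft IoU between two non-negative attention maps given as flat lists."""
    inter = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return inter / union if union > 0 else 0.0


def segregation_loss(maps):
    """Mean pairwise overlap between concept maps at one denoising step.

    Lower values mean the concepts attend to more distinct image areas.
    """
    pairs = [(i, j) for i in range(len(maps)) for j in range(i + 1, len(maps))]
    if not pairs:
        return 0.0
    return sum(overlap(maps[i], maps[j]) for i, j in pairs) / len(pairs)


def retention_loss(prev_maps, curr_maps):
    """One minus the mean overlap of each concept's map with its earlier map.

    Lower values mean each concept keeps attending to the same area
    from one denoising step to the next.
    """
    if not curr_maps:
        return 0.0
    kept = sum(overlap(p, c) for p, c in zip(prev_maps, curr_maps))
    return 1.0 - kept / len(curr_maps)


if __name__ == "__main__":
    cat = [1.0, 1.0, 0.0, 0.0]   # hypothetical 2x2 map, flattened
    dog = [0.0, 0.0, 1.0, 1.0]
    print(segregation_loss([cat, dog]))
    print(retention_loss([cat, dog], [cat, dog]))
```

In a diffusion pipeline, losses of this kind would typically be evaluated on cross-attention maps at inference time and used to adjust the latent before the next denoising step; that update loop is outside this sketch.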
-