Publication No.: US20240282107A1
Publication Date: 2024-08-22
Application No.: US18452424
Filing Date: 2023-08-18
Applicant: Apple Inc.
Inventor: Jia Huang, Robert J. Monarch, Alex Jungho Kim, Jungsuk Kwac, Parmeshwar Khurd, Kailash Thiyagarajan, Xiaoyuan Goodman Gu
IPC: G06V20/30, G06F16/53, G06V10/774
CPC classification number: G06V20/30, G06F16/53, G06V10/774, G06V10/82
Abstract: The present technology pertains to a multi-modal transformer model that is designed and trained to perform cross-modal tasks such as image-text matching, wherein the model is further refined with data for the particular downstream use case of the model. More specifically, the present technology can refine the underlying model with labeled examples derived from a dataset of text-image pairs that ultimately achieved a desired interaction in the proper context. For example, in the use case of advertising applications in an App store, the present technology can refine the underlying model with examples of images used to advertise applications in the App store where the respective invitational content was clicked or converted.
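Illustrative sketch (not part of the patent record): the abstract describes deriving labeled examples from text-image pairs that achieved a desired interaction, such as a click or conversion. The Python below shows one possible way such labels might be assembled from interaction logs. The `Impression` record, its field names, and the `build_labeled_pairs` helper are hypothetical, and treating non-interacted pairs as negatives is an assumption; the abstract does not specify how negative examples are obtained or how the model consumes the labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impression:
    """One logged display of an image alongside some text (hypothetical schema)."""
    text: str        # e.g. a search query or app description (assumed field)
    image_id: str    # identifier of the advertised image (assumed field)
    clicked: bool
    converted: bool


def build_labeled_pairs(impressions):
    """Return (text, image_id, label) tuples.

    A pair is labeled 1 if any of its impressions was clicked or converted,
    otherwise 0. Using 0 for non-interacted pairs is an assumption made for
    this sketch only.
    """
    labels = {}
    for imp in impressions:
        key = (imp.text, imp.image_id)
        positive = int(imp.clicked or imp.converted)
        # Keep the strongest signal seen so far for this pair.
        labels[key] = max(labels.get(key, 0), positive)
    return [(text, image_id, label) for (text, image_id), label in labels.items()]
```

Tuples of this shape could then serve as fine-tuning data for an image-text matching model; that training step is outside the scope of this sketch.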