-
Publication No.: US12235850B2
Publication date: 2025-02-25
Application No.: US17588022
Filing date: 2022-01-28
Applicant: Salesforce, Inc.
Inventor: Luyu Yang, Mingfei Gao, Zeyuan Chen, Ran Xu, Chetan Ramaiah
IPC: G06F16/2455, G06F16/242, G06N20/00
Abstract: Embodiments described herein provide an online domain adaptation framework based on cross-domain bootstrapping, in which the target domain streaming data is deleted immediately after being adapted. At each online query, data diversity is increased across domains by bootstrapping the source domain to form diverse combinations with the current target query. To take full advantage of the valuable discrepancies among the diverse combinations, a set of independent learners is trained to preserve the differences. The knowledge of the learners is then integrated by exchanging their predicted pseudo-labels on the current target query to co-supervise the learning on the target domain, without sharing weights, so that the learners' divergence is maintained.
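The data flow described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names (`bootstrap`, `build_training_sets`, `exchange_pseudo_labels`), the ring-style exchange order, and the use of sampling with replacement as the bootstrap are all assumptions made here for illustration.

```python
import random


def bootstrap(source, rng):
    """Resample the source set with replacement (assumed bootstrap scheme)."""
    return [rng.choice(source) for _ in source]


def build_training_sets(source, target_query, num_learners, seed=0):
    """Pair one bootstrapped source sample with the current target query
    for each learner, so every learner sees a different combination."""
    rng = random.Random(seed)
    return [(bootstrap(source, rng), list(target_query))
            for _ in range(num_learners)]


def exchange_pseudo_labels(predictions):
    """Return, for each learner, the pseudo-labels produced by a peer.

    predictions[i] holds learner i's predicted labels on the target query.
    Learner i is supervised by learner (i + 1) % k (a hypothetical ring
    order). Only labels are exchanged; weights are never shared.
    """
    k = len(predictions)
    return [predictions[(i + 1) % k] for i in range(k)]
```

With two learners, the exchange reduces to a swap: each learner is trained on the other's predictions for the current target query, after which the query can be discarded.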
-
Publication No.: US12112523B2
Publication date: 2024-10-08
Application No.: US17589725
Filing date: 2022-01-31
Applicant: Salesforce, Inc.
Inventor: Shu Zhang, Junnan Li, Ran Xu, Caiming Xiong, Chetan Ramaiah
IPC: G06V10/776, G06F16/56, G06F16/583, G06F40/126, G06F40/166, G06F40/284, G06V10/74, G06V10/80
CPC classification number: G06V10/776, G06F16/56, G06F16/5846, G06F40/126, G06F40/166, G06F40/284, G06V10/761, G06V10/806
Abstract: Embodiments described herein provide a CROss-Modal Distribution Alignment (CROMDA) model for vision-language pretraining, which can be used for downstream retrieval tasks. In the CROMDA model, global cross-modal representations are aligned on each unimodality. Specifically, a uni-modal global similarity between an image/text and the image/text feature queue is computed. A softmax-normalized distribution is then generated based on the computed similarity. The distribution thus takes advantage of the global structure of the queue. CROMDA then aligns the two distributions and learns a modal-invariant global representation. In this way, CROMDA is able to obtain an invariant property in each modality, where images with similar text representations should be similar and vice versa.
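The similarity, softmax, and alignment steps in this abstract might look roughly like the sketch below. It is a sketch under stated assumptions, not the CROMDA implementation: the abstract does not name the similarity measure, the temperature, or the alignment objective, so cosine similarity, a temperature of 0.07, and a symmetric KL divergence are choices made here. The sketch also assumes the i-th entries of the image queue and the text queue come from the same image-text pair. All function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def queue_distribution(feature, queue, temperature=0.07):
    """Softmax over cosine similarities between one feature and a queue
    of features from the same modality (temperature is an assumption)."""
    f = normalize(feature)
    sims = [dot(f, normalize(q)) / temperature for q in queue]
    m = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def alignment_loss(image_feat, text_feat, image_queue, text_queue,
                   temperature=0.07):
    """Symmetric KL between the image-side and text-side distributions
    (the choice of divergence is an assumption)."""
    p_img = queue_distribution(image_feat, image_queue, temperature)
    p_txt = queue_distribution(text_feat, text_queue, temperature)
    return 0.5 * (kl(p_img, p_txt) + kl(p_txt, p_img))
```

The loss is zero when an image and its paired text induce the same distribution over their respective queues, and grows as the two distributions diverge.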
-