Patent search ap:("salesforce.com Page inc.") AND inv:"Shu Zhang"

1.

Published application
SYSTEMS AND METHODS FOR VISION-LANGUAGE DISTRIBUTION ALIGNMENT (Status: pending, published)

Publication No.: US20230162490A1

Publication date: 2023-05-25

Application No.: US17589725

Filing date: 2022-01-31

Applicant: salesforce.com, inc.

Inventor: Shu Zhang, Junnan Li, Ran Xu, Caiming Xiong, Chetan Ramaiah

IPC: G06V10/776, G06V10/74, G06F40/284, G06F40/166, G06F40/126, G06V10/80, G06F16/583, G06F16/56

CPC classification number: G06V10/776, G06V10/761, G06F40/284, G06F40/166, G06F40/126, G06V10/806, G06F16/5846, G06F16/56

Abstract: Embodiments described herein provide a CROss-Modal Distribution Alignment (CROMDA) model for vision-language pretraining, which can be used for downstream retrieval tasks. In the CROMDA model, global cross-modal representations are aligned on each unimodality. Specifically, a uni-modal global similarity between an image/text and the image/text feature queue is computed. A softmax-normalized distribution is then generated based on the computed similarity. The distribution thus takes advantage of the global structure of the queue. CROMDA then aligns the two distributions and learns a modal-invariant global representation. In this way, CROMDA is able to obtain an invariant property in each modality, where images with similar text representations should be similar and vice versa.
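
The steps in this abstract (similarity against a feature queue, softmax normalization, alignment of the two distributions) can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name the similarity measure, temperature, or alignment objective, so cosine similarity, a temperature of 0.1, and a symmetric KL divergence are assumptions made here, and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature value is an assumption."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def queue_distribution(feature, queue, temperature=0.1):
    """Softmax-normalized similarity of one feature to every entry of its own-modality queue."""
    return softmax([cosine(feature, q) for q in queue], temperature)

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q), with a small epsilon to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def alignment_loss(image_feat, text_feat, image_queue, text_queue, temperature=0.1):
    """Symmetric KL between the image-side and text-side queue distributions.

    Assumes the two queues hold paired entries in the same order, so that
    position k in both distributions refers to the same image-text pair.
    """
    p_img = queue_distribution(image_feat, image_queue, temperature)
    p_txt = queue_distribution(text_feat, text_queue, temperature)
    return 0.5 * (kl(p_img, p_txt) + kl(p_txt, p_img))
```

Under these assumptions the loss is zero when an image and its text relate to their queues in the same way, and it grows as the two distributions diverge.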

2.

Granted patent
Template-based key-value extraction for inferring OCR key values within form images (Status: in force)

Publication No.: US11495011B2

Publication date: 2022-11-08

Application No.: US16988536

Filing date: 2020-08-07

Applicant: salesforce.com, inc.

Inventor: Shu Zhang, Chetan Ramaiah, Ran Xu, Caiming Xiong

IPC: G06K9/00, G06V10/75, G06K9/62, G06V10/22, G06V30/414, G06V30/10

Abstract: The system has a form analysis module that receives an image of a form into which values have been filled for the possible fields of information on the form, such as first name, address, age, and the like. By using a library of form templates, the form analysis module allows both flexibility of form processing and simplicity for the user. That is, the techniques used by the form analysis module allow the processing of any form image for which the library has a form template example. The form image need not precisely match any form template, but rather may be scaled or shifted relative to a corresponding template. Additionally, the user need only provide the form image itself, without providing any additional exemplars, metadata for training, or the like.
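
To make the "scaled or shifted relative to a corresponding template" idea concrete, the following Python sketch shows one possible way to handle such an offset. The abstract does not say how the alignment is computed, so everything here is an assumption: the sketch supposes that a few anchor points (for example, the positions of printed key labels) have already been matched between template and form image, fits a per-axis scale and shift by least squares, and maps a template value box into image coordinates. The function names `fit_scale_shift` and `map_box` are hypothetical.

```python
def fit_scale_shift(template_coords, image_coords):
    """Least-squares fit of image = scale * template + shift along one axis."""
    n = len(template_coords)
    mean_t = sum(template_coords) / n
    mean_i = sum(image_coords) / n
    var_t = sum((t - mean_t) ** 2 for t in template_coords)
    if var_t == 0:
        raise ValueError("anchor points must differ along this axis")
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(template_coords, image_coords))
    scale = cov / var_t
    shift = mean_i - scale * mean_t
    return scale, shift

def map_box(box, x_fit, y_fit):
    """Map a template box (x0, y0, x1, y1) into form-image coordinates."""
    (sx, tx), (sy, ty) = x_fit, y_fit
    x0, y0, x1, y1 = box
    return (sx * x0 + tx, sy * y0 + ty, sx * x1 + tx, sy * y1 + ty)

# Hypothetical matched anchors: (x, y) in the template and in the form image.
template_anchors = [(10, 20), (110, 20), (10, 220)]
image_anchors = [(25, 37), (225, 37), (25, 437)]

x_fit = fit_scale_shift([p[0] for p in template_anchors],
                        [p[0] for p in image_anchors])
y_fit = fit_scale_shift([p[1] for p in template_anchors],
                        [p[1] for p in image_anchors])

# Region of the template where a field value is expected (hypothetical).
value_box_in_image = map_box((50, 60, 90, 80), x_fit, y_fit)
```

The mapped box could then be used to select the OCR text that falls inside it as the value for the corresponding key; that last step is likewise an illustration rather than something the abstract specifies.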

3.

Patent application
SYSTEMS AND METHODS FOR HIERARCHICAL MULTI-LABEL CONTRASTIVE LEARNING (Status: in force)

Publication No.: US20220300761A1

Publication date: 2022-09-22

Application No.: US17328779

Filing date: 2021-05-24

Applicant: salesforce.com, inc.

Inventor: Shu Zhang, Chetan Ramaiah, Caiming Xiong, Ran Xu

IPC: G06K9/62, G06N3/08

Abstract: Embodiments described herein provide a hierarchical multi-label framework to learn an embedding function that may capture the hierarchical relationship between classes at different levels in the hierarchy. Specifically, a supervised contrastive learning framework may be extended to the hierarchical multi-label setting. Each data point has multiple dependent labels, and the relationship between labels is represented as a hierarchy of labels. The relationship between the different levels of labels may then be learned by a contrastive learning framework.
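
One way to picture the extension of supervised contrastive learning to a label hierarchy is the Python sketch below. It is an illustration under stated assumptions, not the method claimed in the application: it treats two samples as positives at level k when their label paths agree on the first k levels, computes a standard supervised contrastive loss per level, and sums the levels with hand-chosen weights. The per-level weighting, the temperature, the assumption of unit-normalized embeddings, and the function names are all hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def supcon_loss(embeddings, positive_mask, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have positives.

    embeddings are assumed to be unit-normalized; positive_mask[i][j] is True
    when sample j counts as a positive for anchor i.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        logits = {j: dot(embeddings[i], embeddings[j]) / temperature
                  for j in range(n) if j != i}
        positives = [j for j in logits if positive_mask[i][j]]
        if not positives:
            continue  # no other sample to contrast against, or no positive
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def hierarchical_loss(embeddings, label_paths, level_weights, temperature=0.1):
    """Weighted sum of per-level losses; positives at level k share the first k labels."""
    n = len(embeddings)
    loss = 0.0
    for level, weight in enumerate(level_weights, start=1):
        mask = [[label_paths[i][:level] == label_paths[j][:level] for j in range(n)]
                for i in range(n)]
        loss += weight * supcon_loss(embeddings, mask, temperature)
    return loss

# Hypothetical two-level label paths (coarse class, fine class).
label_paths = [("animal", "cat"), ("animal", "cat"),
               ("vehicle", "car"), ("vehicle", "car")]
clustered = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
scrambled = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

loss_clustered = hierarchical_loss(clustered, label_paths, level_weights=[1.0, 2.0])
loss_scrambled = hierarchical_loss(scrambled, label_paths, level_weights=[1.0, 2.0])
```

With these toy inputs, embeddings that cluster according to the label hierarchy receive a lower loss than embeddings that contradict it, which is the behavior the abstract's framework is meant to encourage.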
