Patent search ap:("salesforce.com Page inc.") AND inv:"Mingfei Gao"

1.

Invention publication
SYSTEMS AND METHODS FOR OPEN VOCABULARY OBJECT DETECTION [Under examination, published]

Publication (announcement) No.: US20230154213A1

Publication (announcement) date: 2023-05-18

Application No.: US17587161

Application date: 2022-01-28

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Chen Xing

IPC: G06V20/62, G06T9/00, G06V10/22, G06V10/774, G06V10/77, G06T1/60, G06F40/126

CPC classification number: G06V20/635, G06F40/126, G06T1/60, G06T9/00, G06V10/225, G06V10/7715, G06V10/7747

Abstract: Embodiments described herein provide methods and systems for open vocabulary object detection of images. Given a pre-trained vision-language model and an image-caption pair, an activation map may be computed in the image that corresponds to an object of interest mentioned in the caption. The activation map is then converted into a pseudo bounding-box label for the corresponding object category. The open vocabulary detector is then directly supervised by these pseudo box-labels, which enables training object detectors with no human-provided bounding-box annotations.
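
The conversion from an activation map to a pseudo bounding box can be illustrated with a minimal sketch. The abstract does not say how the conversion is done; the rule used here (keep cells at or above a fraction of the peak activation, then take the tightest enclosing box) is an assumption, and `activation_to_box`, `rel_threshold`, and the toy map are hypothetical names and values chosen for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive cell indices


def activation_to_box(act: List[List[float]], rel_threshold: float = 0.5) -> Optional[Box]:
    """Tightest box around cells whose activation is at least
    rel_threshold * peak. This thresholding rule is an assumption,
    not something stated in the abstract."""
    peak = max((v for row in act for v in row), default=0.0)
    if peak <= 0.0:
        return None  # nothing activated, so no pseudo label is produced
    cutoff = rel_threshold * peak
    xs, ys = [], []
    for y, row in enumerate(act):
        for x, v in enumerate(row):
            if v >= cutoff:
                xs.append(x)
                ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x5 activation map for the caption word "dog" (made-up values).
toy_map = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.9, 0.2, 0.0],
    [0.0, 0.7, 1.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
]
pseudo_label = {"category": "dog", "box": activation_to_box(toy_map)}
```

In this sketch the resulting `pseudo_label` plays the role of the pseudo box-label that would supervise a detector in place of a human annotation.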

2.

Invention publication
SYSTEMS AND METHODS FOR ONLINE ADAPTATION FOR CROSS-DOMAIN STREAMING DATA [Under examination, published]

Publication (announcement) No.: US20230153307A1

Publication (announcement) date: 2023-05-18

Application No.: US17588022

Application date: 2022-01-28

Applicant: salesforce.com, inc.

Inventor: Luyu Yang, Mingfei Gao, Zeyuan Chen, Ran Xu, Chetan Ramaiah

IPC: G06F16/2455, G06F16/242, G06N20/00

CPC classification number: G06F16/24568, G06F16/2425, G06N20/00

Abstract: Embodiments described herein provide an online domain adaptation framework based on cross-domain bootstrapping, in which the target domain streaming data is deleted immediately after being adapted. At each online query, the data diversity is increased across domains by bootstrapping the source domain to form diverse combinations with the current target query. To fully take advantage of the valuable discrepancies among the diverse combinations, a set of independent learners is trained to preserve the differences. The knowledge of the learners is then integrated by exchanging their predicted pseudo-labels on the current target query to co-supervise the learning on the target domain, but without sharing weights, so as to maintain the learners' divergence.
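
The two ideas in this abstract, bootstrapping the source domain per learner and exchanging pseudo-labels between learners, can be sketched roughly as follows. This is an illustration under assumptions: sampling with replacement at the full source size and a cyclic exchange (learner i is supervised by learner i+1) are choices made here, and `bootstrap_source` and `exchange_pseudo_labels` are hypothetical helper names.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def bootstrap_source(source: Sequence[T], num_learners: int, seed: int = 0) -> List[List[T]]:
    """Draw one resample (with replacement, same size as the source)
    per learner, so each learner sees a different source combination."""
    rng = random.Random(seed)
    return [[rng.choice(source) for _ in source] for _ in range(num_learners)]


def exchange_pseudo_labels(predictions: List[str]) -> List[str]:
    """Return the pseudo-label each learner trains on for the current
    target query: learner i receives the prediction of learner i+1
    (cyclically). The cyclic scheme is an assumption of this sketch."""
    return predictions[1:] + predictions[:1]


source_samples = ["s1", "s2", "s3", "s4"]
per_learner_sources = bootstrap_source(source_samples, num_learners=2)

# Each learner predicts a label for the current target query, then they swap.
own_predictions = ["cat", "dog"]
supervision = exchange_pseudo_labels(own_predictions)
```

Only labels are exchanged in this sketch; no parameters are shared between learners, which mirrors the abstract's point about keeping the learners divergent.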

3.

Invention application
SYSTEMS AND METHODS FOR FIELD EXTRACTION FROM UNLABELED DATA [In force]

Publication (announcement) No.: US20220374631A1

Publication (announcement) date: 2022-11-24

Application No.: US17484618

Application date: 2021-09-24

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Zeyuan Chen, Ran Xu

IPC: G06K9/00, G06N3/08

Abstract: Embodiments described herein provide a field extraction system that does not require field-level annotations for training. Specifically, the training process is bootstrapped by mining pseudo-labels from unlabeled forms using simple rules. Then, a transformer-based structure is used to model interactions between text tokens in the input form and predict a field tag for each token accordingly. The pseudo-labels are used to supervise the transformer training. As the pseudo-labels are noisy, a refinement module that contains a sequence of branches is used to refine the pseudo-labels. Each of the refinement branches conducts field tagging and generates refined labels. At each stage, a branch is optimized by the labels ensembled from all previous branches to reduce label noise.
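
The "labels ensembled from all previous branches" step can be illustrated with a small sketch. The abstract does not specify the ensembling rule; a per-token majority vote is assumed here, and `ensemble_labels` and the example tags are hypothetical.

```python
from collections import Counter
from typing import List


def ensemble_labels(branch_predictions: List[List[str]]) -> List[str]:
    """Per-token majority vote over the tag sequences produced so far.
    Ties go to the earliest sequence that used one of the tied tags.
    Majority voting is an assumption of this sketch."""
    ensembled = []
    for token_tags in zip(*branch_predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        ensembled.append(next(t for t in token_tags if counts[t] == best))
    return ensembled


# Tokens of a toy form: ["Invoice", "No", "12345", "Total"]
rule_pseudo_labels = ["O", "O", "invoice_number", "O"]   # mined by simple rules
branch_1 = ["O", "O", "invoice_number", "total"]
branch_2 = ["O", "O", "O", "total"]

# Labels that would supervise the next branch in the sequence.
targets_for_branch_3 = ensemble_labels([rule_pseudo_labels, branch_1, branch_2])
```

Here a single noisy vote (the rule-mined "O" on "Total", or branch 2's "O" on "12345") is outvoted, which is the noise-reduction effect the abstract describes.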

4.

Invention application
PROCESSING FORMS USING ARTIFICIAL INTELLIGENCE MODELS [In force]

Publication (announcement) No.: US20230133690A1

Publication (announcement) date: 2023-05-04

Application No.: US17453070

Application date: 2021-11-01

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Ran Xu

IPC: G06K9/00, G06F40/174, G06F40/205, G06N20/00

Abstract: An application server may receive an input document including a set of input text fields and an input key phrase querying a value for a key-value pair that corresponds to one or more of the set of input text fields. The application server may extract, using an optical character recognition model, a set of character strings and a set of two-dimensional locations of the set of character strings on a layout of the input document. After extraction, the application server may input the extracted set of character strings and the set of two-dimensional locations into a machine learned model that is trained to compute a probability that a character string corresponds to the value for the key-value pair. The application server may then identify the value for the key-value pair corresponding to the input key phrase and may output the identified value.

5.

Invention grant
Two-stage online detection of action start in untrimmed videos [In force]

Publication (announcement) No.: US10902289B2

Publication (announcement) date: 2021-01-26

Application No.: US16394992

Application date: 2019-04-25

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Richard Socher, Caiming Xiong

IPC: G06K9/62, G06K9/00

Abstract: Embodiments described herein provide a two-stage online detection of action start system including a classification module and a localization module. The classification module generates a set of action scores corresponding to a first video frame from the video, based on the first video frame and video frames before the first video frame in the video. Each action score indicates a respective probability that the first video frame contains a respective action class. The localization module is coupled to the classification module for receiving the set of action scores from the classification module and generating an action-agnostic start probability that the first video frame contains an action start. A fusion component is coupled to the classification module and the localization module for generating, based on the set of action scores and the action-agnostic start probability, a set of action-specific start probabilities, each action-specific start probability corresponding to a start of an action belonging to the respective action class.
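
The fusion step can be sketched in a few lines. The abstract only states that action-specific start probabilities are generated from the action scores and the action-agnostic start probability; the simple product used below is an assumption, and `fuse_start_probabilities` and the example scores are hypothetical.

```python
from typing import Dict


def fuse_start_probabilities(action_scores: Dict[str, float], start_prob: float) -> Dict[str, float]:
    """Combine per-class action scores for one frame with the frame's
    action-agnostic start probability. Multiplying the two is an
    assumed fusion rule, used here only for illustration."""
    return {name: score * start_prob for name, score in action_scores.items()}


# Classification-module output for the current frame (made-up values).
frame_scores = {"running": 0.5, "jumping": 0.25}
# Localization-module output: probability that some action starts at this frame.
agnostic_start = 0.5

action_specific = fuse_start_probabilities(frame_scores, agnostic_start)
```

With this rule a class receives a high start probability only when the frame both looks like that class and looks like the beginning of an action.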

6.

Invention application
Two-Stage Online Detection of Action Start In Untrimmed Videos [Under examination, published]

Publication (announcement) No.: US20200302236A1

Publication (announcement) date: 2020-09-24

Application No.: US16394992

Application date: 2019-04-25

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Richard Socher, Caiming Xiong

IPC: G06K9/62, G06K9/00

Abstract: Embodiments described herein provide a two-stage online detection of action start system including a classification module and a localization module. The classification module generates a set of action scores corresponding to a first video frame from the video, based on the first video frame and video frames before the first video frame in the video. Each action score indicates a respective probability that the first video frame contains a respective action class. The localization module is coupled to the classification module for receiving the set of action scores from the classification module and generating an action-agnostic start probability that the first video frame contains an action start. A fusion component is coupled to the classification module and the localization module for generating, based on the set of action scores and the action-agnostic start probability, a set of action-specific start probabilities, each action-specific start probability corresponding to a start of an action belonging to the respective action class.

7.

Invention grant
Weakly supervised natural language localization networks for video proposal prediction based on a text query [In force]

Publication (announcement) No.: US11687588B2

Publication (announcement) date: 2023-06-27

Application No.: US16531343

Application date: 2019-08-05

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Richard Socher, Caiming Xiong

IPC: G06F16/735, G06F16/73, G06V10/82, G06F16/74, G06V20/40, G06F17/10, G06N3/08, G06F40/47, G06F18/21, G06V10/44

CPC classification number: G06F16/735, G06F16/73, G06F17/10, G06F18/2185, G06F40/47, G06N3/08, G06V10/82, G06V20/41, G06V20/49, G06V10/454, G06V20/44, G06V20/46

Abstract: Systems and methods are provided for weakly supervised natural language localization (WSNLL), for example, as implemented in a neural network or model. The WSNLL network is trained with long, untrimmed videos, i.e., videos that have not been temporally segmented or annotated. The WSNLL network or model defines or generates a video-sentence pair, which corresponds to a pairing of an untrimmed video with an input text sentence. According to some embodiments, the WSNLL network or model is implemented with a two-branch architecture, where one branch performs segment-sentence alignment and the other conducts segment selection. These methods and systems are specifically used to predict how a video proposal matches a text query using respective visual and text features.

8.

Invention application
IMAGE ANALYSIS BASED DOCUMENT PROCESSING FOR INFERENCE OF KEY-VALUE PAIRS IN NON-FIXED DIGITAL DOCUMENTS [In force]

Publication (announcement) No.: US20220215195A1

Publication (announcement) date: 2022-07-07

Application No.: US17140987

Application date: 2021-01-04

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Zeyuan Chen, Le Xue, Ran Xu, Caiming Xiong

IPC: G06K9/00, G06F40/289, G06F40/186

Abstract: An online system extracts information from non-fixed form documents. The online system receives an image of a form document and obtains a set of phrases and locations of the set of phrases on the form image. For at least one field, the online system determines key scores for the set of phrases. The online system identifies a set of candidate values for the field from the set of identified phrases and identifies a set of neighbors for each candidate value from the set of identified phrases. The online system determines neighbor scores, where a neighbor score for a candidate value and a respective neighbor is determined based on the key score for the neighbor and a spatial relationship of the neighbor to the candidate value. The online system selects a candidate value and a respective neighbor based on the neighbor score as the value and key for the field.
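
The neighbor-scoring idea in this abstract can be illustrated with a minimal sketch. The abstract says a neighbor score depends on the neighbor's key score and its spatial relationship to the candidate value, but gives no formula; dividing the key score by one plus the Euclidean distance between phrase centers is an assumption, and `Phrase`, `neighbor_score`, `pick_key_value`, and all example values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Phrase:
    text: str
    x: float          # phrase center on the form image (arbitrary units)
    y: float
    key_score: float  # how strongly the phrase looks like the field's key


def neighbor_score(candidate: Phrase, neighbor: Phrase) -> float:
    """Assumed scoring rule: key score discounted by distance to the candidate."""
    distance = math.hypot(candidate.x - neighbor.x, candidate.y - neighbor.y)
    return neighbor.key_score / (1.0 + distance)


def pick_key_value(candidates: List[Phrase], phrases: List[Phrase]) -> Optional[Tuple[str, str]]:
    """Return the (key, value) texts of the best-scoring candidate/neighbor pair."""
    best_score, best_pair = -1.0, None
    for cand in candidates:
        for nb in phrases:
            if nb is cand:
                continue  # a phrase is not its own neighbor
            score = neighbor_score(cand, nb)
            if score > best_score:
                best_score, best_pair = score, (nb.text, cand.text)
    return best_pair


# Toy form with the field "invoice date" (made-up positions and scores).
key = Phrase("Invoice Date", 0, 0, 0.9)
date_a = Phrase("2021-01-04", 3, 0, 0.0)
total = Phrase("Total", 0, 4, 0.1)
date_b = Phrase("2020-12-01", 3, 4, 0.0)

result = pick_key_value([date_a, date_b], [key, date_a, total, date_b])
```

In this toy layout the date next to "Invoice Date" wins over the date next to "Total", because its nearest high-scoring key is both closer and more key-like.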

9.

Invention application
Two-Stage Online Detection of Action Start In Untrimmed Videos [Under examination, published]

Publication (announcement) No.: US20200302178A1

Publication (announcement) date: 2020-09-24

Application No.: US16394964

Application date: 2019-04-25

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Richard Socher, Caiming Xiong

IPC: G06K9/00, G06K9/62, G06N3/04

Abstract: Embodiments described herein provide a two-stage online detection of action start system including a classification module and a localization module. The classification module generates a set of action scores corresponding to a first video frame from the video, based on the first video frame and video frames before the first video frame in the video. Each action score indicates a respective probability that the first video frame contains a respective action class. The localization module is coupled to the classification module for receiving the set of action scores from the classification module and generating an action-agnostic start probability that the first video frame contains an action start. A fusion component is coupled to the classification module and the localization module for generating, based on the set of action scores and the action-agnostic start probability, a set of action-specific start probabilities, each action-specific start probability corresponding to a start of an action belonging to the respective action class.

10.

Invention grant
Image analysis based document processing for inference of key-value pairs in non-fixed digital documents [In force]

Publication (announcement) No.: US11699297B2

Publication (announcement) date: 2023-07-11

Application No.: US17140987

Application date: 2021-01-04

Applicant: salesforce.com, inc.

Inventor: Mingfei Gao, Zeyuan Chen, Le Xue, Ran Xu, Caiming Xiong

IPC: G06V30/413, G06F40/186, G06F40/289, G06V30/412, G06F40/295, G06V30/10, G06V10/40

CPC classification number: G06V30/413, G06F40/186, G06F40/289, G06V30/412, G06F40/295, G06V10/40, G06V30/10

Abstract: An online system extracts information from non-fixed form documents. The online system receives an image of a form document and obtains a set of phrases and locations of the set of phrases on the form image. For at least one field, the online system determines key scores for the set of phrases. The online system identifies a set of candidate values for the field from the set of identified phrases and identifies a set of neighbors for each candidate value from the set of identified phrases. The online system determines neighbor scores, where a neighbor score for a candidate value and a respective neighbor is determined based on the key score for the neighbor and a spatial relationship of the neighbor to the candidate value. The online system selects a candidate value and a respective neighbor based on the neighbor score as the value and key for the field.
