Patent search ap:("Salesforce Page Inc.") AND inv:"Ran Xu"

1.

Granted invention patent
Systems and methods for online adaptation for cross-domain streaming data (in force)

Publication (announcement) No.: US12235850B2

Publication (announcement) date: 2025-02-25

Application No.: US17588022

Application date: 2022-01-28

Applicant: Salesforce, Inc.

Inventor: Luyu Yang, Mingfei Gao, Zeyuan Chen, Ran Xu, Chetan Ramaiah

IPC: G06F16/2455, G06F16/242, G06N20/00

Abstract: Embodiments described herein provide an online domain adaptation framework based on cross-domain bootstrapping, in which the target domain streaming data is deleted immediately after being adapted. At each online query, the data diversity is increased across domains by bootstrapping the source domain to form diverse combinations with the current target query. To take full advantage of the valuable discrepancies among the diverse combinations, a set of independent learners is trained to preserve the differences. The knowledge of the learners is then integrated by exchanging their predicted pseudo-labels on the current target query to co-supervise the learning on the target domain, but without sharing weights, so as to maintain the learners' divergence.
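The data flow in this abstract (bootstrap the source, pseudo-label the target query, exchange labels between learners) can be illustrated with a minimal sketch. Everything named here is hypothetical: `CentroidLearner` is a toy 1-D nearest-centroid classifier standing in for whatever learners the patent uses, and the ring-style exchange between peers is an assumption made only for illustration.

```python
import random

def bootstrap(source, rng):
    """Resample the labeled source set with replacement (same size)."""
    return [rng.choice(source) for _ in source]

class CentroidLearner:
    """Toy 1-D nearest-centroid classifier standing in for a real learner."""

    def __init__(self):
        self.centroids = {}

    def fit(self, samples):
        # samples is a list of (feature, label) pairs
        grouped = {}
        for x, y in samples:
            grouped.setdefault(y, []).append(x)
        self.centroids = {y: sum(xs) / len(xs) for y, xs in grouped.items()}

    def predict(self, x):
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))

def adapt_on_query(learners, source, target_query, rng):
    """One online step: bootstrap, predict, exchange pseudo-labels, refit."""
    # Each learner sees its own bootstrapped copy of the source domain.
    resampled = [bootstrap(source, rng) for _ in learners]
    for learner, data in zip(learners, resampled):
        learner.fit(data)
    # Each learner pseudo-labels the current target query.
    pseudo = [[learner.predict(x) for x in target_query] for learner in learners]
    # Exchange (assumed ring order): learner i is co-supervised by its peer's
    # pseudo-labels. Only labels are exchanged; parameters are never shared.
    for i, (learner, data) in enumerate(zip(learners, resampled)):
        peer_labels = pseudo[(i + 1) % len(learners)]
        learner.fit(data + list(zip(target_query, peer_labels)))
    # No raw target sample is kept by the learners, so the caller can
    # discard target_query after this call.
    return pseudo
```

A call such as `adapt_on_query([CentroidLearner(), CentroidLearner()], source, query, random.Random(0))` runs one adaptation step for a single target query.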

2.

Invention application
SYSTEMS AND METHODS FOR LANGUAGE AGENT OPTIMIZATION (in force)

Publication (announcement) No.: US20250045567A1

Publication (announcement) date: 2025-02-06

Application No.: US18498257

Application date: 2023-10-31

Applicant: Salesforce, Inc.

Inventor: Weiran Yao, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, Jianguo Zhang, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese

IPC: G06N3/0455, G06N3/092

Abstract: Embodiments described herein provide for optimizing a language model (LM) agent. In at least one embodiment, an LM agent comprises an “actor” LM and a “retrospective” LM, which provides reflections on attempts by the actor LM. The reflections are used to update subsequent prompts to the actor LM. Optimizing the LM agent comprises fine-tuning parameters of the retrospective LM while keeping parameters of the actor LM frozen. A gradient may be determined by a change in reward from the environment based on actions taken by the actor LM with and without a reflection of the retrospective LM. Using this gradient, parameters of the retrospective LM may be updated via backpropagation.
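One way to picture a reward-difference gradient is a policy-gradient step on a toy model. The sketch below is an assumption, not the patented method: the "retrospective model" is reduced to a softmax over a few candidate reflections, and `update_retrospective`, `lr`, and the REINFORCE-style weighting are all hypothetical choices. The actor is never touched, which mirrors its frozen parameters.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_retrospective(logits, chosen, reward_with, reward_without, lr=0.1):
    """Return updated logits after one reward-difference-weighted step.

    logits: toy parameters of the retrospective model (one per reflection)
    chosen: index of the reflection that was actually shown to the actor
    """
    probs = softmax(logits)
    # Change in environment reward attributable to the reflection.
    delta = reward_with - reward_without
    # Gradient of log p(chosen) with respect to logit j is 1[j == chosen] - p_j.
    return [
        theta + lr * delta * ((1.0 if j == chosen else 0.0) - probs[j])
        for j, theta in enumerate(logits)
    ]
```

With a positive reward difference the chosen reflection becomes more likely on the next attempt; with a negative one it becomes less likely.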

3.

Invention publication
PROCESSING FORMS USING ARTIFICIAL INTELLIGENCE MODELS (under examination, published)

Publication (announcement) No.: US20240338961A1

Publication (announcement) date: 2024-10-10

Application No.: US18746820

Application date: 2024-06-18

Applicant: Salesforce, Inc.

Inventor: Mingfei Gao, Ran Xu

IPC: G06V30/412, G06F40/174, G06F40/205, G06N20/00, G06V30/19

CPC classification number: G06V30/412, G06F40/174, G06F40/205, G06N20/00, G06V30/19007

Abstract: An application server may receive an input document including a set of input text fields and an input key phrase querying a value for a key-value pair that corresponds to one or more of the set of input text fields. The application server may extract, using an optical character recognition model, a set of character strings and a set of two-dimensional locations of the set of character strings on a layout of the input document. After extraction, the application server may input the extracted set of character strings and the set of two-dimensional locations into a machine learned model that is trained to compute a probability that a character string corresponds to the value for the key-value pair. The application server may then identify the value for the key-value pair corresponding to the input key phrase and may output the identified value.
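The final selection step described here (score every extracted string, then return the most probable value) can be sketched as follows. The names `OcrString`, `extract_value`, and `toy_score` are hypothetical; in particular `toy_score` is a layout-distance heuristic used only as a stand-in for the trained model, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class OcrString:
    text: str
    x: float  # horizontal position on the page layout
    y: float  # vertical position on the page layout

Scorer = Callable[[OcrString, str, List[OcrString]], float]

def extract_value(strings: List[OcrString], key_phrase: str,
                  score: Scorer) -> Tuple[str, float]:
    """Return the string most likely to be the value for key_phrase."""
    probs = [score(s, key_phrase, strings) for s in strings]
    best = max(range(len(strings)), key=lambda i: probs[i])
    return strings[best].text, probs[best]

def toy_score(candidate: OcrString, key_phrase: str,
              strings: List[OcrString]) -> float:
    """Stand-in scorer: strings closer to the key phrase score higher."""
    anchor = next((s for s in strings if s.text == key_phrase), None)
    if anchor is None or candidate is anchor:
        return 0.0
    distance = abs(candidate.x - anchor.x) + abs(candidate.y - anchor.y)
    return 1.0 / (1.0 + distance)
```

Passing the scorer as a parameter keeps the selection logic independent of how the probabilities are produced, so a learned model could replace `toy_score` without changing `extract_value`.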

4.

Granted invention patent
Systems and methods for vision-language distribution alignment (in force)

Publication (announcement) No.: US12112523B2

Publication (announcement) date: 2024-10-08

Application No.: US17589725

Application date: 2022-01-31

Applicant: Salesforce, Inc.

Inventor: Shu Zhang, Junnan Li, Ran Xu, Caiming Xiong, Chetan Ramaiah

IPC: G06V10/776, G06F16/56, G06F16/583, G06F40/126, G06F40/166, G06F40/284, G06V10/74, G06V10/80

CPC classification number: G06V10/776, G06F16/56, G06F16/5846, G06F40/126, G06F40/166, G06F40/284, G06V10/761, G06V10/806

Abstract: Embodiments described herein provide a CROss-Modal Distribution Alignment (CROMDA) model for vision-language pretraining, which can be used for downstream retrieval tasks. In the CROMDA model, global cross-modal representations are aligned on each unimodality. Specifically, a uni-modal global similarity between an image/text and the image/text feature queue is computed. A softmax-normalized distribution is then generated based on the computed similarity. The distribution thus takes advantage of the global structure of the queue. CROMDA then aligns the two distributions and learns a modal-invariant global representation. In this way, CROMDA is able to obtain an invariant property in each modality, where images with similar text representations should be similar and vice versa.
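The two steps named in the abstract (a softmax-normalized similarity distribution over a feature queue, then alignment of the image-side and text-side distributions) can be sketched in plain Python. The dot-product similarity, the `temperature` parameter, and the symmetric KL divergence used as the alignment measure are assumptions; the abstract does not say which similarity or divergence is used.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores, temperature=1.0):
    """Numerically stable softmax with a (hypothetical) temperature."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def queue_distribution(feature, queue, temperature=1.0):
    """Softmax-normalized similarity of one feature to every queue entry."""
    return softmax([dot(feature, q) for q in queue], temperature)

def symmetric_kl(p, q, eps=1e-12):
    """Assumed alignment loss: average of KL(p||q) and KL(q||p)."""
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)
```

For a paired image and text, the image distribution would come from `queue_distribution(image_feature, image_queue)` and the text distribution from `queue_distribution(text_feature, text_queue)`; the loss is zero when the two distributions coincide and grows as they diverge.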

5.

Granted invention patent
Systems and methods for partially supervised online action detection in untrimmed videos (in force)

Publication (announcement) No.: US12299982B2

Publication (announcement) date: 2025-05-13

Application No.: US16931228

Application date: 2020-07-16

Applicant: Salesforce, Inc.

Inventor: Mingfei Gao, Yingbo Zhou, Ran Xu, Caiming Xiong

IPC: G06V20/40, G06F17/18, G06F18/2113, G06F18/214, G06F18/2431, G06N3/084, G06V10/764, G06V10/82, G06V20/20

Abstract: Embodiments described herein provide systems and methods for a partially supervised training model for online action detection. Specifically, the online action detection framework may include two modules that are trained jointly: a Temporal Proposal Generator (TPG) and an Online Action Recognizer (OAR). In the training phase, OAR performs both online per-frame action recognition and start point detection. At the same time, TPG generates class-wise temporal action proposals that serve as noisy supervision for OAR. TPG is then optimized with the video-level annotations. In this way, the online action detection framework can be trained using only video-category labels, without pre-annotated segment-level boundary labels.

6.

Invention application
SYSTEMS AND METHODS FOR ARTIFICIAL INTELLIGENCE AGENTS (in force)

Publication (announcement) No.: US20250139411A1

Publication (announcement) date: 2025-05-01

Application No.: US18498229

Application date: 2023-10-31

Applicant: Salesforce, Inc.

Inventor: Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese

IPC: G06N3/0455, G06N3/084

Abstract: Embodiments described herein provide a large language model (LLM) based AI agent that adopts Monte-Carlo Tree Search (MCTS) to execute a task. The LLM is prompted with a task description and it responds with its first attempted list of actions. Based on the success or failure of the first attempt, the LLM is prompted with an updated prompt which includes feedback from the first attempt based on a determined reward. The prompt may include a relative “score” for each action taken at each step. A numeric score may be mapped to a set of pre-defined text labels, such as a “high” or “low” value, putting the score in a form better suited to an LLM prompt. In this way, the LLM is iteratively given prompts which are updated with the scores from each action taken in each previous iteration, so that it traverses different paths on the tree in each iteration.
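The mapping from numeric scores to text labels, and the updated prompt built from them, might look like the sketch below. The 0.5 threshold, the prompt wording, and the function names `score_to_label` and `build_feedback_prompt` are all hypothetical; the abstract only states that scores are mapped to pre-defined labels such as "high" or "low".

```python
def score_to_label(score, threshold=0.5):
    """Map a numeric action score to a text label (threshold is assumed)."""
    return "high" if score >= threshold else "low"

def build_feedback_prompt(task, actions, scores, threshold=0.5):
    """Build the next prompt, annotating each earlier action with its label."""
    lines = [f"Task: {task}", "Feedback on your previous attempt:"]
    for step, (action, score) in enumerate(zip(actions, scores), start=1):
        label = score_to_label(score, threshold)
        lines.append(f"Step {step}: {action} (value: {label})")
    lines.append("Propose a new list of actions.")
    return "\n".join(lines)
```

Calling `build_feedback_prompt` once per iteration, with the scores from the latest attempt, produces the iteratively updated prompt the abstract describes.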

7.

Granted invention patent
Systems and methods for field extraction from unlabeled data (in force)

Publication (announcement) No.: US12086698B2

Publication (announcement) date: 2024-09-10

Application No.: US17484618

Application date: 2021-09-24

Applicant: Salesforce, Inc.

Inventor: Mingfei Gao, Zeyuan Chen, Ran Xu

IPC: G06N20/20, G06N3/084, G06N5/01, G06N5/04, G06V30/412, G06V30/413

CPC classification number: G06N20/20, G06N3/084, G06N5/01, G06N5/04, G06V30/412, G06V30/413

Abstract: A field extraction system that does not require field-level annotations for training is provided. Specifically, the training process is bootstrapped by mining pseudo-labels from unlabeled forms using simple rules. Then, a transformer-based structure is used to model interactions between text tokens in the input form and predict a field tag for each token accordingly. The pseudo-labels are used to supervise the transformer training. As the pseudo-labels are noisy, a refinement module that contains a sequence of branches is used to refine the pseudo-labels. Each of the refinement branches conducts field tagging and generates refined labels. At each stage, a branch is optimized by the labels ensembled from all previous branches to reduce label noise.
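The staged refinement (each branch is supervised by labels ensembled from all earlier branches) can be illustrated with a small sketch. Two choices here are assumptions not stated in the abstract: the ensemble is a per-token majority vote, and the initial rule-mined pseudo-labels count as the first member of the ensemble. The function names are hypothetical.

```python
from collections import Counter

def ensemble_labels(branch_outputs):
    """Majority-vote the per-token tags predicted by earlier branches."""
    return [Counter(tags).most_common(1)[0][0] for tags in zip(*branch_outputs)]

def targets_per_stage(pseudo_labels, branch_predictions):
    """Training targets for each branch: the ensemble of everything before it."""
    history = [pseudo_labels]  # assumed: rule-mined labels seed the ensemble
    targets = []
    for prediction in branch_predictions:
        targets.append(ensemble_labels(history))
        history.append(prediction)
    return targets
```

The first branch is trained directly on the noisy pseudo-labels, and each later branch sees a vote over a growing set of predictions, which is one way label noise could be averaged out across stages.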

8.

Invention application
SYSTEMS AND METHODS FOR CONTROLLABLE DATA GENERATION FROM TEXT (in force)

Publication (announcement) No.: US20250068901A1

Publication (announcement) date: 2025-02-27

Application No.: US18423081

Application date: 2024-01-25

Applicant: Salesforce, Inc.

Inventor: Shiyu Wang, Yihao Feng, Tian Lan, Ning Yu, Yu Bai, Ran Xu, Huan Wang, Caiming Xiong, Silvio Savarese

IPC: G06N3/08

Abstract: Embodiments described herein provide a diffusion-based framework that is trained on a dataset with limited text labels to generate a distribution of data samples in the dataset given a specific text description label. Specifically, unlabeled data is first used to train the diffusion model to generate a data distribution of data samples given a specific text description label. Text-labeled data samples are then used to fine-tune the diffusion model to generate the data distribution given a specific text description label, thus enhancing controllability of training.

9.

Invention publication
SYSTEMS AND METHODS FOR ATTENTION MECHANISM IN THREE-DIMENSIONAL OBJECT DETECTION (under examination, published)

Publication (announcement) No.: US20240169746A1

Publication (announcement) date: 2024-05-23

Application No.: US18161661

Application date: 2023-01-30

Applicant: Salesforce, Inc.

Inventor: Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles Duque, Caiming Xiong, Ran Xu

IPC: G06V20/64, G06T3/40, G06V10/46, G06V10/82

CPC classification number: G06V20/64, G06T3/4007, G06V10/46, G06V10/82

Abstract: Embodiments described herein provide a system for three-dimensional (3D) object detection. The system includes an input interface configured to obtain 3D point data describing spatial information of a plurality of points, and a memory storing a neural network based 3D object detection model having an encoder and a decoder. The system also includes processors to perform operations including: encoding, by the encoder, a first set of coordinates into a first set of point features and a set of object features; sampling a second set of point features from the first set of point features; generating, by attention layers at the decoder, a set of attention weights by applying cross-attention over at least the set of object features and the second set of point features; and generating, by the decoder, a predicted bounding box among the plurality of points based at least in part on the set of attention weights.
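The attention-weight step can be sketched with standard scaled dot-product attention, where object features act as queries and the sampled point features act as keys. That specific formulation is an assumption (the abstract says only "cross-attention"), learned projection matrices are omitted, and `cross_attention_weights` is a hypothetical name.

```python
import math

def cross_attention_weights(object_features, point_features):
    """Row i holds the softmax attention of object query i over all points."""
    dim = len(object_features[0])
    weights = []
    for query in object_features:
        # Scaled dot product between one object query and every point key.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in point_features
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights
```

Each row sums to one, so it can be read as how strongly a given object query attends to each sampled point before the decoder predicts a bounding box.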

10.

Invention publication
SYSTEMS AND METHODS FOR OPEN VOCABULARY INSTANCE SEGMENTATION IN UNANNOTATED IMAGES (under examination, published)

Publication (announcement) No.: US20240070868A1

Publication (announcement) date: 2024-02-29

Application No.: US18159318

Application date: 2023-01-25

Applicant: Salesforce, Inc.

Inventor: Ning Yu, Vibashan Vishnukumar Sharmini, Chen Xing, Juan Carlos Niebles Duque, Ran Xu

IPC: G06T7/11, G06V10/26

CPC classification number: G06T7/11, G06V10/273

Abstract: Embodiments described herein provide an open-vocabulary instance segmentation framework that adopts a pre-trained vision-language model to develop a pipeline in detecting novel categories of instances.
