-
Publication No.: US11487939B2
Publication Date: 2022-11-01
Application No.: US16549985
Filing Date: 2019-08-23
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Caiming Xiong , Richard Socher
IPC: G06F40/284 , G06N3/08 , H03M7/42 , H03M7/30 , G06F40/40
Abstract: Embodiments described herein provide a fully unsupervised model for text compression. Specifically, the unsupervised model is configured to identify an optimal deletion path for each input sequence of texts (e.g., a sentence), and words from the input sequence are gradually deleted along the deletion path. To identify the optimal deletion path, the unsupervised model may adopt a pretrained bidirectional language model (BERT) to score each candidate deletion based on the average perplexity of the resulting sentence and perform a simple greedy look-ahead tree search to select the best deletion for each step.
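The deletion-path search above can be sketched as a simple greedy loop: at each step, try deleting every single token, score each resulting sentence, and keep the best. This is a minimal depth-1 sketch; the toy `score_fn` stands in for the BERT average-perplexity scorer and the deeper look-ahead search described in the abstract, and its name is an assumption, not from the patent:

```python
def greedy_deletion_path(tokens, score_fn, min_len=1):
    """Greedily delete one token per step, choosing the deletion whose
    resulting sentence scores highest under score_fn (e.g. negative
    average perplexity from a pretrained language model)."""
    path = [list(tokens)]
    current = list(tokens)
    while len(current) > min_len:
        # Candidate sentences: the current sentence with one token removed.
        candidates = [current[:i] + current[i + 1:] for i in range(len(current))]
        current = max(candidates, key=score_fn)
        path.append(current)
    return path


# Toy scorer: prefer sentences that retain content words (an assumption
# standing in for a language-model perplexity score).
keep = {"cat"}
path = greedy_deletion_path(["the", "big", "cat"],
                            lambda s: sum(1 for t in s if t in keep))
```

Each element of `path` is one step along the deletion path, from the full sentence down to `min_len` tokens.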
-
Publication No.: US12164878B2
Publication Date: 2024-12-10
Application No.: US17581380
Filing Date: 2022-01-21
Applicant: Salesforce.com, Inc.
Inventor: Tong Niu , Kazuma Hashimoto , Yingbo Zhou , Caiming Xiong
IPC: G06F40/51
Abstract: Embodiments described herein provide a cross-lingual sentence alignment framework that is trained only on rich-resource language pairs. To obtain an accurate aligner, a pretrained multi-lingual language model is used, and a classifier is trained on parallel data from rich-resource language pairs. This trained classifier may then be used for cross-lingual transfer with low-resource languages.
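As a rough illustration of the alignment step, once sentences on both sides are embedded by a multilingual encoder, each source sentence can be matched to the target sentence its embedding is most similar to. The cosine-similarity matching below is a simplified stand-in for the trained classifier described in the abstract (the function name and the use of plain cosine similarity are assumptions):

```python
import numpy as np

def align_sentences(src_embs, tgt_embs):
    """For each source embedding, pick the index of the target embedding
    with the highest cosine similarity (a simplified stand-in for the
    trained cross-lingual alignment classifier)."""
    src = src_embs / np.linalg.norm(src_embs, axis=1, keepdims=True)
    tgt = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    sims = src @ tgt.T          # pairwise cosine similarities
    return sims.argmax(axis=1)  # best target index per source sentence
```

Because the encoder is multilingual, the same matching can be applied zero-shot to low-resource language pairs after training only on rich-resource parallel data.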
-
Publication No.: US11829721B2
Publication Date: 2023-11-28
Application No.: US17161214
Filing Date: 2021-01-28
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Semih Yavuz , Yingbo Zhou , Nitish Shirish Keskar , Huan Wang , Caiming Xiong
IPC: G10L15/065 , G06N3/0455 , G06F18/20 , G06F40/20 , G06F40/289 , G06F40/45 , G06F40/284 , G06F40/242 , G06F18/22 , G06F18/214 , G06N7/01
CPC classification number: G06F40/284 , G06F18/214 , G06F18/22 , G06F40/242 , G06N7/01
Abstract: Embodiments described herein provide dynamic blocking, a decoding algorithm which enables large-scale pretrained language models to generate high-quality paraphrases in an unsupervised setting. Specifically, in order to obtain an alternative surface form, when the language model emits a token that is present in the source sequence, the language model is prevented from generating the next token that is the same as the subsequent source token in the source sequence at the next time step. In this way, the language model is forced to generate a paraphrased sequence of the input source sequence, but with mostly different wording.
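The blocking rule above can be sketched as a small helper: given the token just emitted, collect every source token that immediately follows an occurrence of it in the source sequence; those tokens are then masked out at the next decoding step. This is a minimal sketch of the rule as stated in the abstract, not the full decoding loop (the function name is an assumption):

```python
def blocked_tokens(source_tokens, last_emitted):
    """Return the set of source tokens that immediately follow any
    occurrence of last_emitted in the source sequence. If last_emitted
    is not in the source, nothing is blocked."""
    blocked = set()
    for i, tok in enumerate(source_tokens[:-1]):
        if tok == last_emitted:
            blocked.add(source_tokens[i + 1])
    return blocked
```

In a decoder, the returned set would be used to zero out (or heavily penalize) those tokens' logits at the next time step, forcing the model off the source's exact wording.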
-
Publication No.: US20230153542A1
Publication Date: 2023-05-18
Application No.: US17581380
Filing Date: 2022-01-21
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Kazuma Hashimoto , Yingbo Zhou , Caiming Xiong
IPC: G06F40/51
CPC classification number: G06F40/51
Abstract: Embodiments described herein provide a cross-lingual sentence alignment framework that is trained only on rich-resource language pairs. To obtain an accurate aligner, a pretrained multi-lingual language model is used, and a classifier is trained on parallel data from rich-resource language pairs. This trained classifier may then be used for cross-lingual transfer with low-resource languages.
-
Publication No.: US20220129629A1
Publication Date: 2022-04-28
Application No.: US17161214
Filing Date: 2021-01-28
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Semih Yavuz , Yingbo Zhou , Nitish Shirish Keskar , Huan Wang , Caiming Xiong
IPC: G06F40/284 , G06F40/242 , G06K9/62 , G06N7/00
Abstract: Embodiments described herein provide dynamic blocking, a decoding algorithm which enables large-scale pretrained language models to generate high-quality paraphrases in an unsupervised setting. Specifically, in order to obtain an alternative surface form, when the language model emits a token that is present in the source sequence, the language model is prevented from generating the next token that is the same as the subsequent source token in the source sequence at the next time step. In this way, the language model is forced to generate a paraphrased sequence of the input source sequence, but with mostly different wording.
-
Publication No.: US20210374488A1
Publication Date: 2021-12-02
Application No.: US17090553
Filing Date: 2020-11-05
Applicant: salesforce.com, inc.
Inventor: Nazneen Rajani , Tong Niu , Wenpeng Yin
Abstract: Embodiments described herein adopt a k-nearest-neighbor (kNN) mechanism over a model's hidden representations to identify training examples closest to a given test example. Specifically, a training set of sequences and a test sequence are received, each of which is mapped to a respective hidden representation vector using a base model. A set of indices for each sequence index that minimizes a distance between the respective hidden state vector and a test hidden state vector is then determined. A weighted k-nearest-neighbor probability score can then be computed from the set of indices to generate a probability distribution over labels for the test sequence.
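The kNN step above can be sketched as follows: rank training hidden vectors by distance to the test hidden vector, take the k closest, and accumulate their labels with distance-based weights into a normalized distribution. The inverse-distance weighting and function name below are assumptions; the patent only specifies a weighted kNN score:

```python
import numpy as np

def knn_label_distribution(train_vecs, train_labels, test_vec, k=3, num_labels=2):
    """Weighted kNN over hidden representations: return a probability
    distribution over labels for the test example."""
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)  # distance to each training vector
    nearest = np.argsort(dists)[:k]                        # indices of the k nearest neighbors
    weights = 1.0 / (dists[nearest] + 1e-8)                # inverse-distance weights (an assumption)
    probs = np.zeros(num_labels)
    for i, w in zip(nearest, weights):
        probs[train_labels[i]] += w
    return probs / probs.sum()                             # normalize to a distribution
```

Beyond prediction, the `nearest` indices themselves identify which training examples most influenced the test example, which supports the interpretability use case the abstract describes.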
-