-
Publication No.: US11562142B2
Publication date: 2023-01-24
Application No.: US17187608
Filing date: 2021-02-26
Applicant: salesforce.com, inc.
Inventor: Erik Lennart Nijkamp, Caiming Xiong
IPC: G06F40/279, G06F40/58
Abstract: A machine learning based model generates a feature representation of a text sequence, for example, a natural language sentence or phrase. The system trains the machine learning based model by receiving an input text sequence and perturbing the input text sequence by masking a subset of tokens. The machine learning based model is used to predict the masked tokens. A predicted text sequence is generated based on the predictions of the masked tokens. The system processes the predicted text sequence using the machine learning based model to determine whether a token was predicted or an original token. The parameters of the machine learning based model are adjusted to minimize an aggregate loss based on prediction of the correct word for a masked token and a classification of a word as original or replaced.
-
Publication No.: US20220277141A1
Publication date: 2022-09-01
Application No.: US17187608
Filing date: 2021-02-26
Applicant: salesforce.com, inc.
Inventor: Erik Lennart Nijkamp, Caiming Xiong
IPC: G06F40/279, G06F40/58
Abstract: A machine learning based model generates a feature representation of a text sequence, for example, a natural language sentence or phrase. The system trains the machine learning based model by receiving an input text sequence and perturbing the input text sequence by masking a subset of tokens. The machine learning based model is used to predict the masked tokens. A predicted text sequence is generated based on the predictions of the masked tokens. The system processes the predicted text sequence using the machine learning based model to determine whether a token was predicted or an original token. The parameters of the machine learning based model are adjusted to minimize an aggregate loss based on prediction of the correct word for a masked token and a classification of a word as original or replaced.
-
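The training objective described in the abstracts above can be illustrated with a small sketch. This is not the patented implementation: it only computes one aggregate loss value for one sequence, and every name in it (`mask_tokens`, `aggregate_loss`, `token_logits_fn`, `replaced_logit_fn`, `mask_prob`, `weight`) is hypothetical. Two further assumptions are made here: the two callables stand in for two output heads of a single model, and a token is labeled "replaced" only when the prediction differs from the original token.

```python
import math
import random

MASK = "[MASK]"  # hypothetical mask symbol


def mask_tokens(tokens, mask_prob, rng):
    """Perturb the input by masking a random subset of tokens."""
    positions = [i for i in range(len(tokens)) if rng.random() < mask_prob]
    if not positions:  # make sure at least one token is masked
        positions = [rng.randrange(len(tokens))]
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    return masked, positions


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bce_with_logit(logit, label):
    """Numerically stable binary cross-entropy on a raw logit."""
    softplus = max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))
    return softplus - label * logit


def aggregate_loss(tokens, vocab, token_logits_fn, replaced_logit_fn,
                   mask_prob=0.3, weight=1.0, seed=0):
    rng = random.Random(seed)
    masked, positions = mask_tokens(tokens, mask_prob, rng)

    # Step 1: predict each masked token and build the predicted sequence.
    predicted = list(tokens)
    mlm_loss = 0.0
    for i in positions:
        probs = softmax(token_logits_fn(masked, i))
        mlm_loss -= math.log(probs[vocab.index(tokens[i])])
        best = max(range(len(vocab)), key=probs.__getitem__)
        predicted[i] = vocab[best]
    mlm_loss /= len(positions)

    # Step 2: classify every token of the predicted sequence as
    # original (label 0) or replaced (label 1). Assumption: a correct
    # prediction counts as original.
    disc_loss = 0.0
    for i, tok in enumerate(predicted):
        label = 1.0 if tok != tokens[i] else 0.0
        disc_loss += bce_with_logit(replaced_logit_fn(predicted, i), label)
    disc_loss /= len(predicted)

    # Aggregate loss: token prediction term plus weighted classification term.
    return mlm_loss + weight * disc_loss


# Toy usage with an untrained "model" that outputs uniform logits.
vocab = ["the", "cat", "sat", "mat"]
tokens = ["the", "cat", "sat"]
loss = aggregate_loss(
    tokens, vocab,
    token_logits_fn=lambda seq, i: [0.0] * len(vocab),
    replaced_logit_fn=lambda seq, i: 0.0,
)
print(loss)  # uniform toy model: log(4) + log(2)
```

In an actual training loop, the model parameters would be adjusted by gradient descent to reduce this value; the sketch omits that step to stay dependency-free.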