-
Publication (Announcement) No.: US12198685B2
Publication (Announcement) Date: 2025-01-14
Application No.: US18184432
Application Date: 2023-03-15
Applicant: PAYPAL, INC.
Inventor: Sandro Cavallari, Yuzhen Zhuo, Van Hoang Nguyen, Quan Jin Ferdinand Tang, Gautam Vasappanavara
IPC: G06F40/00, G06F40/205, G06F40/253, G06F40/284, G10L15/19, G10L15/26
Abstract: Methods and systems are presented for translating informal utterances into formal texts. Informal utterances may include words in abbreviated form or typographical errors. The informal utterances may be processed by mapping each word in an utterance into a well-defined token. The mapping from the words to the tokens may be based on a context associated with the utterance derived by analyzing the utterance on a character-by-character basis. The token that is mapped for each word can be one of a vocabulary token that corresponds to a formal word in a pre-defined word corpus, an unknown token that corresponds to an unknown word, or a masked token. Formal text may then be generated based on the mapped tokens. Through the processing of informal utterances using the techniques disclosed herein, the informal utterances are both normalized and sanitized.
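Illustrative sketch (not taken from the patent): the abstract's three token types can be shown with a toy word-to-token mapper. The abstract says the mapping depends on a context derived character by character. This sketch does not do that: it swaps in a plain lookup table plus fuzzy string matching. The vocabulary, the abbreviation table, the `<unk>` and `<mask>` spellings, and the rule that words containing digits or `@` are masked are all assumptions made for the example.

```python
import re
from difflib import get_close_matches

# Hypothetical formal word corpus and abbreviation table (illustration only).
VOCAB = sorted({"please", "refund", "my", "payment", "to", "thanks", "you", "are"})
ABBREVIATIONS = {"pls": "please", "thx": "thanks", "u": "you", "r": "are", "pymt": "payment"}

UNK = "<unk>"    # assumed spelling of the unknown token
MASK = "<mask>"  # assumed spelling of the masked token

# Assumption: a word containing a digit or '@' may be sensitive (card number, e-mail).
SENSITIVE = re.compile(r"[\d@]")


def map_word(word: str) -> str:
    """Map one informal word to a vocabulary, unknown, or masked token."""
    w = word.lower().strip(".,!?")
    if SENSITIVE.search(w):
        return MASK
    if w in VOCAB:
        return w
    if w in ABBREVIATIONS:
        return ABBREVIATIONS[w]
    # Stand-in for typo handling: nearest vocabulary word by similarity ratio.
    close = get_close_matches(w, VOCAB, n=1, cutoff=0.8)
    return close[0] if close else UNK


def normalize(utterance: str) -> str:
    """Generate formal text by joining the tokens mapped from each word."""
    return " ".join(map_word(w) for w in utterance.split())


if __name__ == "__main__":
    print(normalize("pls refnd my pymt to 4111111111111111 thx"))
```

In this sketch, abbreviations and typos resolve to vocabulary tokens (normalization), while the digit string is replaced by the masked token (sanitization). A word with no close vocabulary match falls through to the unknown token.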
-
Publication (Announcement) No.: US20230290344A1
Publication (Announcement) Date: 2023-09-14
Application No.: US18184432
Application Date: 2023-03-15
Applicant: PAYPAL, INC.
Inventor: Sandro Cavallari, Yuzhen Zhuo, Van Hoang Nguyen, Quan Jin Ferdinand Tang, Gautam Vasappanavara
IPC: G06F40/284, G06F40/205
CPC classification number: G06F40/284, G06F40/205
Abstract: Methods and systems are presented for translating informal utterances into formal texts. Informal utterances may include words in abbreviated form or typographical errors. The informal utterances may be processed by mapping each word in an utterance into a well-defined token. The mapping from the words to the tokens may be based on a context associated with the utterance derived by analyzing the utterance on a character-by-character basis. The token that is mapped for each word can be one of a vocabulary token that corresponds to a formal word in a pre-defined word corpus, an unknown token that corresponds to an unknown word, or a masked token. Formal text may then be generated based on the mapped tokens. Through the processing of informal utterances using the techniques disclosed herein, the informal utterances are both normalized and sanitized.
-
Publication (Announcement) No.: US11610582B2
Publication (Announcement) Date: 2023-03-21
Application No.: US16831058
Application Date: 2020-03-26
Applicant: PAYPAL, INC.
Inventor: Sandro Cavallari, Yuzhen Zhuo, Van Hoang Nguyen, Quan Jin Ferdinand Tang, Gautam Vasappanavara
IPC: G06F40/00, G10L15/19, G06F40/253, G06F40/284, G10L15/26
Abstract: Methods and systems are presented for translating informal utterances into formal texts. Informal utterances may include words in abbreviated form or typographical errors. The informal utterances may be processed by mapping each word in an utterance into a well-defined token. The mapping from the words to the tokens may be based on a context associated with the utterance derived by analyzing the utterance on a character-by-character basis. The token that is mapped for each word can be one of a vocabulary token that corresponds to a formal word in a pre-defined word corpus, an unknown token that corresponds to an unknown word, or a masked token. Formal text may then be generated based on the mapped tokens. Through the processing of informal utterances using the techniques disclosed herein, the informal utterances are both normalized and sanitized.
-
Publication (Announcement) No.: US20210304741A1
Publication (Announcement) Date: 2021-09-30
Application No.: US16831058
Application Date: 2020-03-26
Applicant: PAYPAL, INC.
Inventor: Sandro Cavallari, Yuzhen Zhuo, Van Hoang Nguyen, Quan Jin Ferdinand Tang, Gautam Vasappanavara
IPC: G10L15/19, G10L15/26, G06F40/284, G06F40/253
Abstract: Methods and systems are presented for translating informal utterances into formal texts. Informal utterances may include words in abbreviated form or typographical errors. The informal utterances may be processed by mapping each word in an utterance into a well-defined token. The mapping from the words to the tokens may be based on a context associated with the utterance derived by analyzing the utterance on a character-by-character basis. The token that is mapped for each word can be one of a vocabulary token that corresponds to a formal word in a pre-defined word corpus, an unknown token that corresponds to an unknown word, or a masked token. Formal text may then be generated based on the mapped tokens. Through the processing of informal utterances using the techniques disclosed herein, the informal utterances are both normalized and sanitized.