Publication (Announcement) No.: US11348571B2
Publication (Announcement) Date: 2022-05-31
Application No.: US16810070
Filing Date: 2020-03-05
Inventor: Shiqiang Ding, Jizhou Huang, Zhongwei Jiang, Wentao Ma
Abstract: The present disclosure provides methods, computing devices, and storage media for generating a training corpus. The method includes: mining pieces of data from user behavior logs associated with a target application, each piece of data including a first behavior log and a second behavior log, where the first behavior log includes a user speech and a corresponding speech recognition result, and the second behavior log belongs to the same user as the first behavior log and is time-dependent on the first behavior log; and determining, based on the first behavior log and the second behavior log, the user speech and the corresponding speech recognition result in each piece of data as a positive feedback sample or a negative feedback sample.
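
The two steps in the abstract (pairing each speech log with a later log from the same user, then labeling the pair) can be sketched roughly as follows. This is an illustrative sketch only, not the claimed method: the `BehaviorLog` fields, the 30-second window used to stand in for "time-dependent", and the `confirm`/`correct` action names used to decide positive versus negative feedback are all hypothetical assumptions that do not appear in the abstract.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class BehaviorLog(NamedTuple):
    """Hypothetical log record; field names are assumptions, not from the patent."""
    user_id: str
    timestamp: float                   # seconds since some epoch
    action: str                        # hypothetical action label
    speech: Optional[str] = None       # e.g. an audio reference, only on speech logs
    recognition: Optional[str] = None  # recognition result, only on speech logs


# Assumption: "time-dependent" is approximated by a fixed time window.
MAX_GAP_SECONDS = 30.0
# Assumption: which follow-up actions count as positive or negative feedback.
POSITIVE_ACTIONS = {"confirm"}
NEGATIVE_ACTIONS = {"correct"}


def mine_pairs(logs: Iterable[BehaviorLog]) -> List[Tuple[BehaviorLog, BehaviorLog]]:
    """Pair each speech log with the next log of the same user inside the window."""
    ordered = sorted(logs, key=lambda log: (log.user_id, log.timestamp))
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        if first.speech is None or first.recognition is None:
            continue  # the first log must carry speech and its recognition result
        if second.user_id != first.user_id:
            continue  # the second log must belong to the same user
        if second.timestamp - first.timestamp > MAX_GAP_SECONDS:
            continue  # too far apart to be treated as related
        pairs.append((first, second))
    return pairs


def label_pair(first: BehaviorLog, second: BehaviorLog) -> Optional[str]:
    """Label a pair from the follow-up action; None means no usable signal."""
    if second.action in POSITIVE_ACTIONS:
        return "positive"
    if second.action in NEGATIVE_ACTIONS:
        return "negative"
    return None


def build_corpus(logs: Iterable[BehaviorLog]) -> List[Tuple[str, str, str]]:
    """Return (speech, recognition result, label) samples for labeled pairs."""
    corpus = []
    for first, second in mine_pairs(logs):
        label = label_pair(first, second)
        if label is not None:
            corpus.append((first.speech, first.recognition, label))
    return corpus
```

Pairs whose follow-up action gives no signal are dropped rather than guessed at, so under these assumptions the corpus contains only samples with an explicit positive or negative indication.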