Publication No.: US20200294489A1
Publication Date: 2020-09-17
Application No.: US16810070
Filing Date: 2020-03-05
Inventors: Shiqiang DING, Jizhou HUANG, Zhongwei JIANG, Wentao MA
Abstract: The present disclosure provides methods, computing devices, and storage media for generating a training corpus. The method includes: mining pieces of data from user behavior logs associated with a target application, each piece of data including a first behavior log and a second behavior log, where the first behavior log includes a user speech and its corresponding speech recognition result, and the second behavior log belongs to the same user as the first behavior log and is temporally associated with it; and determining, based on the first behavior log and the second behavior log, whether the user speech and the corresponding speech recognition result in each piece of data form a positive feedback sample or a negative feedback sample.
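
The two steps in the abstract (pairing each speech log with a later log from the same user, then labeling the pair) can be illustrated with a minimal Python sketch. The abstract does not say how the pairing window or the labeling rule is defined, so everything concrete here is an assumption made for illustration: the `BehaviorLog` fields, the 30-second window, and the hypothetical action names `voice_query`, `confirm_result`, and `text_correction`. In this sketch a follow-up confirmation is treated as positive feedback and a differing typed correction as negative feedback; pairs that match neither rule are left out of the corpus.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class BehaviorLog:
    """One user behavior log entry (all field names are hypothetical)."""
    user_id: str
    timestamp: float                   # seconds since some epoch
    action: str                        # e.g. "voice_query", "confirm_result"
    speech_id: Optional[str] = None    # reference to the recorded user speech
    recognition: Optional[str] = None  # speech recognition result
    text: Optional[str] = None         # typed text, if any


WINDOW_SECONDS = 30.0  # assumed pairing window; not taken from the source


def mine_pairs(logs: List[BehaviorLog],
               window: float = WINDOW_SECONDS) -> List[Tuple[BehaviorLog, BehaviorLog]]:
    """Pair each speech log with the same user's next log inside the window."""
    by_user: Dict[str, List[BehaviorLog]] = {}
    for log in sorted(logs, key=lambda entry: (entry.user_id, entry.timestamp)):
        by_user.setdefault(log.user_id, []).append(log)

    pairs = []
    for user_logs in by_user.values():
        for first, second in zip(user_logs, user_logs[1:]):
            gap = second.timestamp - first.timestamp
            if first.action == "voice_query" and gap <= window:
                pairs.append((first, second))
    return pairs


def label_pair(first: BehaviorLog, second: BehaviorLog) -> Optional[str]:
    """Assumed labeling rule: confirmation is positive, a differing typed
    correction is negative, anything else is left unlabeled."""
    if second.action == "confirm_result":
        return "positive"
    if second.action == "text_correction" and second.text != first.recognition:
        return "negative"
    return None


def build_corpus(logs: List[BehaviorLog]) -> List[Tuple[str, str, str]]:
    """Return (speech_id, recognition, label) for every pair that gets a label."""
    corpus = []
    for first, second in mine_pairs(logs):
        label = label_pair(first, second)
        if label is not None:
            corpus.append((first.speech_id, first.recognition, label))
    return corpus


if __name__ == "__main__":
    sample_logs = [
        BehaviorLog("u1", 0.0, "voice_query", speech_id="s1", recognition="park avenue"),
        BehaviorLog("u1", 5.0, "confirm_result"),
        BehaviorLog("u2", 0.0, "voice_query", speech_id="s2", recognition="see tack"),
        BehaviorLog("u2", 8.0, "text_correction", text="seatac"),
    ]
    for sample in build_corpus(sample_logs):
        print(sample)
```

Keeping the pairing step and the labeling rule in separate functions means either assumption can be changed without touching the other, for example by widening the window or by adding further follow-up actions to the rule.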