Caption assisted calling to maintain connection in challenging network conditions

    Publication No.: US11563784B2

    Publication date: 2023-01-24

    Application No.: US17345703

    Filing date: 2021-06-11

    Abstract: Systems are provided for managing and coordinating STT/TTS systems and the communications between these systems when they are connected in online meetings, and for mitigating connectivity issues that may arise during the online meetings, to provide a seamless and reliable meeting experience with live captions and/or rendered audio. Initially, online meeting communications are transmitted over a lossy, connectionless protocol/channel. Then, in response to detected connectivity problems with one or more systems involved in the online meeting, which can cause jitter or packet loss, for example, an instruction is dynamically generated and processed for causing one or more of the connected systems to transmit and/or process the online meeting content with a more reliable connection/protocol, such as a connection-oriented protocol. Codecs at the systems are used, when needed, to convert speech to text with related speech attribute information and to convert text to speech.
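
    The fallback described in this abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class and field names, the sliding window, the loss and jitter thresholds, and the choice to switch the payload to captions are all assumptions introduced here.

```python
from collections import deque
from dataclasses import dataclass
from statistics import pstdev
from typing import Deque, Optional


@dataclass
class SwitchInstruction:
    """Hypothetical instruction sent to a connected system."""
    transport: str  # e.g. "connection-oriented"
    payload: str    # e.g. "captions" (text plus speech attributes instead of audio)


class ConnectivityMonitor:
    """Tracks recent packets and decides when to leave the lossy channel."""

    def __init__(self, window: int = 50, max_loss: float = 0.10,
                 max_jitter_ms: float = 30.0) -> None:
        # Thresholds are illustrative assumptions, not values from the patent.
        self.max_loss = max_loss
        self.max_jitter_ms = max_jitter_ms
        self.received: Deque[bool] = deque(maxlen=window)
        self.delays_ms: Deque[float] = deque(maxlen=window)

    def record(self, arrived: bool, delay_ms: Optional[float] = None) -> None:
        """Log one expected packet: whether it arrived and, if so, its delay."""
        self.received.append(arrived)
        if arrived and delay_ms is not None:
            self.delays_ms.append(delay_ms)

    def loss_rate(self) -> float:
        if not self.received:
            return 0.0
        return 1.0 - sum(self.received) / len(self.received)

    def jitter_ms(self) -> float:
        # Standard deviation of delay, used here as a simple jitter proxy.
        return pstdev(self.delays_ms) if len(self.delays_ms) > 1 else 0.0

    def check(self) -> Optional[SwitchInstruction]:
        """Return a switch instruction when loss or jitter exceeds its threshold."""
        if (self.loss_rate() > self.max_loss
                or self.jitter_ms() > self.max_jitter_ms):
            return SwitchInstruction(transport="connection-oriented",
                                     payload="captions")
        return None
```

    A caller would invoke `record` for every expected packet and poll `check` periodically; a non-`None` result stands in for the dynamically generated instruction that moves a system onto the more reliable protocol.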

    Adversarial speaker adaptation
    Granted invention patent

    Publication No.: US11107460B2

    Publication date: 2021-08-31

    Application No.: US16460027

    Filing date: 2019-07-02

    Abstract: Embodiments are associated with a speaker-independent acoustic model capable of classifying senones based on input speech frames and on first parameters of the speaker-independent acoustic model, a speaker-dependent acoustic model capable of classifying senones based on input speech frames and on second parameters of the speaker-dependent acoustic model, and a discriminator capable of receiving data from the speaker-dependent acoustic model and data from the speaker-independent acoustic model and outputting a prediction of whether received data was generated by the speaker-dependent acoustic model based on third parameters. The second parameters are initialized based on the first parameters, the second parameters are trained based on input frames of a target speaker to minimize a senone classification loss associated with the second parameters, a portion of the second parameters are trained based on the input frames of the target speaker to maximize a discrimination loss associated with the discriminator, and the third parameters are trained based on the input frames of the target speaker to minimize the discrimination loss.
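
    One way to write the three training objectives in this abstract as equations is sketched below. The notation is an assumption made here for illustration: \theta_{SI}, \theta_{SD} and \theta_{D} stand for the first, second and third parameters; \theta_{SD}^{f} is the portion of the second parameters trained adversarially and \theta_{SD}^{y} the remainder; h_t is the data each acoustic model passes to the discriminator for frame x_t; s_t is a senone label for that frame. The cross-entropy form of both losses, the weight \lambda, the step size \mu, and keeping \theta_{SI} fixed are also assumptions, not details stated in the abstract.

```latex
% Initialization: the speaker-dependent model starts from the speaker-independent one.
\theta_{SD} \leftarrow \theta_{SI}

% Senone classification loss over T frames of the target speaker.
\mathcal{L}_{\mathrm{senone}}(\theta_{SD})
  = -\frac{1}{T}\sum_{t=1}^{T} \log P\!\left(s_t \mid x_t;\, \theta_{SD}\right)

% Discrimination loss: D outputs the probability that its input came from
% the speaker-dependent model.
\mathcal{L}_{\mathrm{disc}}(\theta_{SD}^{f}, \theta_{D})
  = -\frac{1}{T}\sum_{t=1}^{T}\left[
      \log D\!\left(h_t^{SD};\, \theta_{D}\right)
      + \log\!\left(1 - D\!\left(h_t^{SI};\, \theta_{D}\right)\right)
    \right]

% Gradient updates.
\theta_{D} \leftarrow \theta_{D}
  - \mu \,\frac{\partial \mathcal{L}_{\mathrm{disc}}}{\partial \theta_{D}}
  % third parameters: minimize the discrimination loss

\theta_{SD}^{f} \leftarrow \theta_{SD}^{f}
  - \mu \left[
      \frac{\partial \mathcal{L}_{\mathrm{senone}}}{\partial \theta_{SD}^{f}}
      - \lambda \,\frac{\partial \mathcal{L}_{\mathrm{disc}}}{\partial \theta_{SD}^{f}}
    \right]
  % minimize the senone loss, maximize the discrimination loss

\theta_{SD}^{y} \leftarrow \theta_{SD}^{y}
  - \mu \,\frac{\partial \mathcal{L}_{\mathrm{senone}}}{\partial \theta_{SD}^{y}}
  % remaining second parameters: minimize the senone loss only
```

    The sign flip on the \lambda term is what makes the training adversarial: the discriminator learns to tell the two models apart, while the adapted portion of the speaker-dependent model is pushed to make its output indistinguishable from that of the speaker-independent model.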

    Speech recognition error diagnosis
    Invention patent application
    Legal status: in force

    Publication No.: US20160253989A1

    Publication date: 2016-09-01

    Application No.: US14634714

    Filing date: 2015-02-27

    IPC classification: G10L15/01 G10L15/26 G10L15/19

    CPC classification: G10L15/01 G10L15/183

    Abstract: Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is at least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
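
    The detect, analyze twice, then combine flow in this abstract can be sketched in Python as follows. The abstract does not say what the two analyses are, so the out-of-vocabulary check, the length-mismatch check, the category labels, and the use of a reference transcript are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Optional, Set

# An analysis takes (reference, hypothesis) and returns a boolean finding.
Analysis = Callable[[str, str], bool]


def is_erroneous(reference: str, hypothesis: str) -> bool:
    """Error detection: any word-level mismatch counts as at least partially erroneous."""
    return reference.split() != hypothesis.split()


def make_oov_analysis(vocabulary: Set[str]) -> Analysis:
    """Hypothetical first analysis: does the reference contain out-of-vocabulary words?"""
    def analysis(reference: str, hypothesis: str) -> bool:
        return any(word not in vocabulary for word in reference.split())
    return analysis


def length_mismatch_analysis(reference: str, hypothesis: str) -> bool:
    """Hypothetical second analysis: were words dropped or inserted?"""
    return len(reference.split()) != len(hypothesis.split())


# Illustrative category labels, keyed by (first result, second result).
CATEGORIES = {
    (True, True): "out-of-vocabulary with insertion/deletion",
    (True, False): "out-of-vocabulary substitution",
    (False, True): "insertion/deletion",
    (False, False): "in-vocabulary substitution",
}


def diagnose(reference: str, hypothesis: str,
             first: Analysis, second: Analysis) -> Optional[str]:
    """Return an error category from the combined analysis results, or None if correct."""
    if not is_erroneous(reference, hypothesis):
        return None
    key = (first(reference, hypothesis), second(reference, hypothesis))
    return CATEGORIES[key]
```

    For example, with a vocabulary of `{"call", "my", "mom", "home"}`, the reference "call my mom" recognized as "call my home" falls under "in-vocabulary substitution", because neither hypothetical analysis flags it and the category is determined by that combination.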
