Applying neural network language models to weighted finite state transducers for automatic speech recognition

    Publication No.: US10049668B2

    Publication date: 2018-08-14

    Application No.: US15156161

    Filing date: 2016-05-16

    Applicant: Apple Inc.

    Abstract: Systems and processes for converting speech-to-text are provided. In one example process, speech input can be received. A sequence of states and arcs of a weighted finite state transducer (WFST) can be traversed. A negating finite state transducer (FST) can be traversed. A virtual FST can be composed using a neural network language model and based on the sequence of states and arcs of the WFST. The one or more virtual states of the virtual FST can be traversed to determine a probability of a candidate word given one or more history candidate words. Text corresponding to the speech input can be determined based on the probability of the candidate word given the one or more history candidate words. An output can be provided based on the text corresponding to the speech input.
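The rescoring idea in this abstract can be illustrated with a minimal sketch. It assumes negative-log costs, a negating FST whose arcs carry the negated first-pass language model cost, and a virtual FST whose states are created lazily, one per word history. The abstract does not specify any of these details, and the names `VirtualFst`, `rescore`, `toy_ngram`, and `toy_nnlm` are hypothetical.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

History = Tuple[str, ...]
LmProb = Callable[[str, History], float]


class VirtualFst:
    """Hypothetical lazily built FST: one state per word history."""

    def __init__(self, nnlm_prob: LmProb) -> None:
        self._nnlm_prob = nnlm_prob
        self._arc_cost: Dict[Tuple[History, str], float] = {}

    def step(self, state: History, word: str) -> Tuple[History, float]:
        # Create the arc on first use, then reuse the cached cost.
        key = (state, word)
        if key not in self._arc_cost:
            self._arc_cost[key] = -math.log(self._nnlm_prob(word, state))
        return state + (word,), self._arc_cost[key]


def rescore(words: Sequence[str], acoustic_costs: Sequence[float],
            ngram_prob: LmProb, nnlm_prob: LmProb) -> float:
    """Total path cost after swapping the n-gram score for the NNLM score."""
    virtual = VirtualFst(nnlm_prob)
    state: History = ()
    total = 0.0
    for word, acoustic in zip(words, acoustic_costs):
        lm_cost = -math.log(ngram_prob(word, state))
        first_pass = acoustic + lm_cost   # weight on the WFST arc
        negating = -lm_cost               # negating FST cancels the n-gram cost
        state, nn_cost = virtual.step(state, word)
        total += first_pass + negating + nn_cost
    return total


# Toy models (assumptions for illustration only).
def toy_ngram(word: str, history: History) -> float:
    return 0.1


def toy_nnlm(word: str, history: History) -> float:
    return 0.5 if history[-1:] == ("hello",) and word == "world" else 0.25


cost = rescore(["hello", "world"], [1.0, 2.0], toy_ngram, toy_nnlm)
```

In this sketch the n-gram cost cancels exactly, so the final cost is the acoustic cost plus the neural network language model cost of each word given its history.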

Applying neural network language models to weighted finite state transducers for automatic speech recognition

    Publication No.: US10354652B2

    Publication date: 2019-07-16

    Application No.: US16035513

    Filing date: 2018-07-13

    Applicant: Apple Inc.

    Abstract: Systems and processes for converting speech-to-text are provided. In one example process, speech input can be received. A sequence of states and arcs of a weighted finite state transducer (WFST) can be traversed. A negating finite state transducer (FST) can be traversed. A virtual FST can be composed using a neural network language model and based on the sequence of states and arcs of the WFST. The one or more virtual states of the virtual FST can be traversed to determine a probability of a candidate word given one or more history candidate words. Text corresponding to the speech input can be determined based on the probability of the candidate word given the one or more history candidate words. An output can be provided based on the text corresponding to the speech input.

    Method for supporting dynamic grammars in WFST-based ASR
    Legal status: Granted invention patent, in force

    Publication No.: US09502031B2

    Publication date: 2016-11-22

    Application No.: US14494305

    Filing date: 2014-09-23

    Applicant: Apple Inc.

    Abstract: Systems and processes are disclosed for recognizing speech using a weighted finite state transducer (WFST) approach. Dynamic grammars can be supported by constructing the final recognition cascade during runtime using difference grammars. In a first grammar, non-terminals can be replaced with a weighted phone loop that produces sequences of mono-phone words. In a second grammar, at runtime, non-terminals can be replaced with sub-grammars derived from user-specific usage data including contact, media, and application lists. Interaction frequencies associated with these entities can be used to weight certain words over others. With all non-terminals replaced, a static recognition cascade with the first grammar can be composed with the personalized second grammar to produce a user-specific WFST. User speech can then be processed to generate candidate words having associated probabilities, and the likeliest result can be output.
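The runtime replacement of non-terminals in the second grammar can be sketched as follows. This is an illustration only: it assumes non-terminals are written as tokens starting with `$`, that interaction counts are positive, and that weights are negative-log relative frequencies. It does not model the phone loop of the first grammar or the composition with the static cascade. The names `build_subgrammar` and `expand` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def build_subgrammar(usage_counts: Dict[str, int]) -> Dict[str, float]:
    """Map each entity to a cost; frequent entities get lower cost.

    Assumes every count is positive (no smoothing is applied).
    """
    total = sum(usage_counts.values())
    return {entity: -math.log(count / total)
            for entity, count in usage_counts.items()}


def expand(template: List[str],
           subgrammars: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Replace every non-terminal with its sub-grammar entries."""
    results: List[Tuple[str, float]] = [("", 0.0)]
    for token in template:
        # Terminals pass through at zero cost; non-terminals fan out.
        options = subgrammars[token] if token.startswith("$") else {token: 0.0}
        results = [((prefix + " " + text).strip(), cost + extra)
                   for prefix, cost in results
                   for text, extra in options.items()]
    return results


# Hypothetical user-specific interaction counts for a contact list.
contacts = {"Alice": 3, "Bob": 1}
phrases = expand(["call", "$CONTACT"], {"$CONTACT": build_subgrammar(contacts)})
best = min(phrases, key=lambda pair: pair[1])
```

Here the more frequently contacted entity receives the lower cost, so it is preferred when two candidates are otherwise equally likely.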
