USER-SPECIFIC ACOUSTIC MODELS
    11.
    Invention application

    Publication No.: US20210312931A1

    Publication date: 2021-10-07

    Application No.: US17349758

    Application date: 2021-06-16

    Applicant: Apple Inc.

    Abstract: Systems and processes for providing user-specific acoustic models are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a plurality of speech inputs, each of the speech inputs associated with a same user of the electronic device; providing each of the plurality of speech inputs to a user-independent acoustic model, the user-independent acoustic model providing a plurality of speech results based on the plurality of speech inputs; initiating a user-specific acoustic model on the electronic device; and adjusting the user-specific acoustic model based on the plurality of speech inputs and the plurality of speech results.
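The flow in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented method: it assumes speech inputs are plain feature vectors, stands in a nearest-prototype classifier for the user-independent acoustic model, and treats the user-specific model as a set of per-label running means. The names `nearest_label`, `UserSpecificModel`, and `build_user_specific_model` are hypothetical.

```python
def nearest_label(features, prototypes):
    """Toy user-independent model: label of the closest fixed prototype."""
    return min(
        prototypes,
        key=lambda label: sum((f - p) ** 2 for f, p in zip(features, prototypes[label])),
    )


class UserSpecificModel:
    """Hypothetical on-device model: one running-mean prototype per label."""

    def __init__(self):
        self.prototypes = {}
        self.counts = {}

    def adjust(self, features, label):
        # Fold one (speech input, speech result) pair into the running mean.
        n = self.counts.get(label, 0)
        old = self.prototypes.get(label, [0.0] * len(features))
        self.prototypes[label] = [(o * n + f) / (n + 1) for o, f in zip(old, features)]
        self.counts[label] = n + 1


def build_user_specific_model(speech_inputs, user_independent_prototypes):
    # 1. The user-independent model provides a speech result for each input.
    results = [nearest_label(x, user_independent_prototypes) for x in speech_inputs]
    # 2. Initiate the user-specific model on the device.
    model = UserSpecificModel()
    # 3. Adjust it based on the inputs and the corresponding results.
    for features, label in zip(speech_inputs, results):
        model.adjust(features, label)
    return model, results
```

The user-independent results act as training labels here, so no manual transcription of the user's speech is needed in this sketch.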

    PRIVACY PRESERVING DISTRIBUTED EVALUATION FRAMEWORK FOR EMBEDDED PERSONALIZED SYSTEMS

    Publication No.: US20170352346A1

    Publication date: 2017-12-07

    Application No.: US15266949

    Application date: 2016-09-15

    Applicant: Apple Inc.

    Abstract: Systems and processes for evaluating embedded personalized systems are provided. In one example process, instructions that define an experiment associated with a personalized speech recognition system can be received. The instructions can define one or more experimental parameters. In accordance with the received instructions, a second personalized speech recognition system can be generated based on the personalized speech recognition system and the one or more experimental parameters. Additionally, the plurality of user speech samples can be processed using the second personalized speech recognition system to generate a plurality of speech recognition results and a plurality of accuracy scores corresponding to the plurality of speech recognition results. Second instructions can be received based on the plurality of accuracy scores. In accordance with the second instructions, the second speech recognition system can be activated.
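The experiment loop described above might look roughly like the following. It is a sketch under stated assumptions: a "system" is just a dictionary of parameters, the recognizer is a toy threshold rule over a single energy value, and each on-device sample is paired with a reference transcript so that an accuracy score can be computed. None of these details come from the abstract, and every function name is hypothetical.

```python
def recognize(system, energy):
    """Toy recognizer (assumption): says "yes" when energy meets the threshold."""
    return "yes" if energy >= system["threshold"] else "no"


def run_experiment(baseline, experimental_parameters, labeled_samples):
    """Build a second system from the baseline and score it on local samples.

    labeled_samples is a list of (energy, reference_transcript) pairs that
    never has to leave the device; only the scores would be reported.
    """
    # Second system = baseline system overridden by the experimental parameters.
    candidate = {**baseline, **experimental_parameters}
    results = [recognize(candidate, energy) for energy, _ in labeled_samples]
    scores = [
        1.0 if result == reference else 0.0
        for result, (_, reference) in zip(results, labeled_samples)
    ]
    return candidate, results, scores


def apply_second_instructions(active_system, candidate, activate):
    """Activate the second system only if the second instructions say so."""
    return candidate if activate else active_system
```

In this sketch the baseline dictionary is never mutated, so the device can keep using it if the second instructions do not activate the candidate.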

    AUTOMATIC ACCENT DETECTION
    14.
    Invention application
    Status: Pending (published)

    Publication No.: US20160358600A1

    Publication date: 2016-12-08

    Application No.: US14846650

    Application date: 2015-09-04

    Applicant: Apple Inc.

    Abstract: Systems and processes for automatic accent detection are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a user input, determining a first similarity between a representation of the user input and a first acoustic model of a plurality of acoustic models, and determining a second similarity between the representation of the user input and a second acoustic model of the plurality of acoustic models. The method further includes determining whether the first similarity is greater than the second similarity. In accordance with a determination that the first similarity is greater than the second similarity, the first acoustic model may be selected; and in accordance with a determination that the first similarity is not greater than the second similarity, the second acoustic model may be selected.
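The selection rule in this abstract reduces to a single comparison. The sketch below assumes the user input and each acoustic model are summarized as feature vectors and uses cosine similarity as the similarity measure; the abstract does not name a measure, so that choice and the `centroid` field are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_acoustic_model(representation, first_model, second_model):
    """Pick the first model only if its similarity is strictly greater."""
    first_similarity = cosine_similarity(representation, first_model["centroid"])
    second_similarity = cosine_similarity(representation, second_model["centroid"])
    if first_similarity > second_similarity:
        return first_model
    # "Not greater" covers both a lower similarity and a tie.
    return second_model
```

Note that a tie selects the second model, which matches the "not greater than" wording of the abstract.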

    METHOD AND APPARATUS FOR DISCOVERING TRENDING TERMS IN SPEECH REQUESTS
    15.
    Invention application
    Status: Granted

    Publication No.: US20160078860A1

    Publication date: 2016-03-17

    Application No.: US14839835

    Application date: 2015-08-28

    Applicant: Apple Inc.

    Abstract: Systems and processes are disclosed for discovering trending terms in automatic speech recognition. Candidate terms (e.g., words, phrases, etc.) not yet found in a speech recognizer vocabulary or having low language model probability can be identified based on trending usage in a variety of electronic data sources (e.g., social network feeds, news sources, search queries, etc.). When candidate terms are identified, archives of live or recent speech traffic can be searched to determine whether users are uttering the candidate terms in dictation or speech requests. Such searching can be done using open vocabulary spoken term detection to find phonetic matches in the audio archives. As the candidate terms are found in the speech traffic, notifications can be generated that identify the candidate terms, provide relevant usage statistics, identify the context in which the terms are used, and the like.
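The pipeline above can be sketched end to end in a simplified form. The assumptions are significant: trending candidates are found by simple word counts over recent text, the audio archive is represented as already-decoded phone sequences, and spoken term detection is reduced to exact contiguous phone matching, whereas a real open-vocabulary detector would score approximate matches. All names and the pronunciation dictionary are hypothetical.

```python
from collections import Counter


def find_candidate_terms(recent_texts, vocabulary, min_count=2):
    """Terms that recur in recent text sources but are missing from the vocabulary."""
    counts = Counter(word for text in recent_texts for word in text.lower().split())
    return {
        term: count
        for term, count in counts.items()
        if count >= min_count and term not in vocabulary
    }


def count_phonetic_hits(term_phones, archive):
    """Number of archived utterances containing the term's phone sequence."""
    n = len(term_phones)
    return sum(
        1
        for phones in archive
        if any(phones[i:i + n] == term_phones for i in range(len(phones) - n + 1))
    )


def build_notifications(candidates, pronunciations, archive):
    """One notification per candidate term that users are actually uttering."""
    notifications = []
    for term in sorted(candidates):
        hits = count_phonetic_hits(pronunciations[term], archive)
        if hits > 0:
            notifications.append(
                {"term": term, "text_count": candidates[term], "utterance_hits": hits}
            )
    return notifications
```

Each notification carries basic usage statistics; context extraction, which the abstract also mentions, is left out of this sketch.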

    EFFICIENT GENERATION OF COMPLEMENTARY ACOUSTIC MODELS FOR PERFORMING AUTOMATIC SPEECH RECOGNITION SYSTEM COMBINATION
    16.
    Invention application
    Status: Pending (published)

    Publication No.: US20160034811A1

    Publication date: 2016-02-04

    Application No.: US14503028

    Application date: 2014-09-30

    Applicant: Apple Inc.

    CPC classification number: G06N3/0454 G06N3/0472 G10L15/16

    Abstract: Systems and processes for generating complementary acoustic models for performing automatic speech recognition system combination are provided. In one example process, a deep neural network can be trained using a set of training data. The trained deep neural network can be a deep neural network acoustic model. A Gaussian-mixture model can be linked to a hidden layer of the trained deep neural network such that any feature vector outputted from the hidden layer is received by the Gaussian-mixture model. The Gaussian-mixture model can be trained via a first portion of the trained deep neural network and using the set of training data. The first portion of the trained deep neural network can include an input layer of the deep neural network and the hidden layer. The first portion of the trained deep neural network and the trained Gaussian-mixture model can be a Deep Neural Network-Gaussian-Mixture Model (DNN-GMM) acoustic model.
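To make the DNN-GMM construction concrete, here is a small pure-Python sketch. It assumes the deep neural network is already trained and represents its first portion (input layer through one hidden layer) as fixed weights with tanh units; a diagonal-covariance Gaussian mixture is then fitted with EM on the hidden-layer outputs. The layer shape, the activation, and the names `hidden_features` and `fit_gmm` are illustrative assumptions, not details from the abstract.

```python
import math


def hidden_features(x, weights, biases):
    """First portion of the trained network: input layer through the hidden layer."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def log_gauss(v, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * s) + (a - m) ** 2 / s)
        for a, m, s in zip(v, mean, var)
    )


def fit_gmm(data, initial_means, iterations=30, var_floor=1e-3):
    """EM for a diagonal GMM over hidden-layer feature vectors."""
    k, dim = len(initial_means), len(data[0])
    weights = [1.0 / k] * k
    means = [list(m) for m in initial_means]
    variances = [[1.0] * dim for _ in range(k)]
    for _ in range(iterations):
        # E-step: responsibilities, computed in the log domain for stability.
        resp = []
        for v in data:
            logs = [math.log(weights[j]) + log_gauss(v, means[j], variances[j]) for j in range(k)]
            top = max(logs)
            probs = [math.exp(value - top) for value in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means, and floored variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / len(data)
            means[j] = [sum(r[j] * v[d] for r, v in zip(resp, data)) / nj for d in range(dim)]
            variances[j] = [
                max(var_floor, sum(r[j] * (v[d] - means[j][d]) ** 2 for r, v in zip(resp, data)) / nj)
                for d in range(dim)
            ]
    return weights, means, variances
```

A caller would map every training frame through `hidden_features` and pass the resulting vectors to `fit_gmm`, so the mixture only ever sees features produced by the hidden layer.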

    METHOD FOR SUPPORTING DYNAMIC GRAMMARS IN WFST-BASED ASR
    17.
    Invention application
    Status: Granted

    Publication No.: US20150348547A1

    Publication date: 2015-12-03

    Application No.: US14494305

    Application date: 2014-09-23

    Applicant: Apple Inc.

    Abstract: Systems and processes are disclosed for recognizing speech using a weighted finite state transducer (WFST) approach. Dynamic grammars can be supported by constructing the final recognition cascade during runtime using difference grammars. In a first grammar, non-terminals can be replaced with a weighted phone loop that produces sequences of mono-phone words. In a second grammar, at runtime, non-terminals can be replaced with sub-grammars derived from user-specific usage data including contact, media, and application lists. Interaction frequencies associated with these entities can be used to weight certain words over others. With all non-terminals replaced, a static recognition cascade with the first grammar can be composed with the personalized second grammar to produce a user-specific WFST. User speech can then be processed to generate candidate words having associated probabilities, and the likeliest result can be output.
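The runtime non-terminal replacement in the second grammar can be illustrated without a WFST toolkit. The sketch below covers only that one step: grammar templates contain non-terminals such as `$CONTACT`, each non-terminal is replaced by a sub-grammar built from user data, and entries are weighted by the negative log of their relative interaction frequency. The phone loop of the first grammar and the actual WFST composition are not implemented, and the template syntax and function names are assumptions.

```python
import math


def build_subgrammar(interaction_counts):
    """Map each entity to a cost; frequently used entities get a lower cost."""
    total = sum(interaction_counts.values())
    return {
        entity: -math.log(count / total)
        for entity, count in interaction_counts.items()
    }


def expand_grammar(templates, subgrammars):
    """Replace every non-terminal (a token starting with '$') with its sub-grammar.

    Returns a dict mapping each fully expanded phrase to its accumulated cost.
    """
    expanded = {}
    for template in templates:
        paths = [([], 0.0)]
        for token in template.split():
            if token.startswith("$"):
                options = list(subgrammars[token].items())
            else:
                options = [(token, 0.0)]  # ordinary words carry no extra cost
            paths = [
                (words + [word], cost + extra)
                for words, cost in paths
                for word, extra in options
            ]
        for words, cost in paths:
            expanded[" ".join(words)] = cost
    return expanded


def likeliest_phrase(expanded):
    """Lowest cost corresponds to highest probability."""
    return min(expanded, key=expanded.get)
```

For example, with a contact list in which one name has three past interactions and another has one, the phrase containing the more frequently used name receives the lower cost and is preferred.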
