Asynchronous optimization for sequence training of neural networks

    Publication Number: US10019985B2

    Publication Date: 2018-07-10

    Application Number: US14258139

    Filing Date: 2014-04-22

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G06N3/0454 G10L15/16 G10L15/183

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
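
    The abstract describes two model replicas that independently fetch shared parameters, optimize them on their own batch of training frames, and write the results back. Below is a minimal, hypothetical sketch of that asynchronous pattern; the toy linear model, loss, data, and learning rate are placeholders and not taken from the patent.

```python
# Hypothetical sketch of asynchronous optimization with two model replicas.
# The "acoustic model" here is a toy linear map; only the fetch/optimize/write
# pattern mirrors the abstract.
import threading
import numpy as np

class ParameterStore:
    """Holds the shared neural network parameters behind a lock."""
    def __init__(self, dim):
        self.params = np.zeros(dim)
        self.lock = threading.Lock()

    def read(self):
        with self.lock:
            return self.params.copy()

    def write(self, new_params):
        with self.lock:
            self.params = new_params

def sequence_training_worker(store, batches, lr=0.01):
    """One replica: obtain parameters, optimize them on a batch, push them back."""
    for frames, targets in batches:
        params = store.read()                      # obtain current parameters
        preds = frames @ params                    # toy stand-in for the speech model
        grad = frames.T @ (preds - targets) / len(frames)
        store.write(params - lr * grad)            # store optimized parameters

rng = np.random.default_rng(0)
store = ParameterStore(dim=8)
# Two batches of (frames, targets) per worker, standing in for speech features.
make_batches = lambda: [(rng.normal(size=(32, 8)), rng.normal(size=32)) for _ in range(2)]
workers = [threading.Thread(target=sequence_training_worker,
                            args=(store, make_batches())) for _ in range(2)]
for w in workers: w.start()
for w in workers: w.join()
print("final parameters:", store.read())
```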

    Transfer learning for deep neural network based hotword detection

    Publication Number: US09715660B2

    Publication Date: 2017-07-25

    Application Number: US14230225

    Filing Date: 2014-03-31

    Applicant: Google Inc.

    CPC classification number: G06N7/005 G06N3/0454 G10L15/16 G10L2015/088

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes training a deep neural network with a first training set by adjusting values for each of a plurality of weights included in the neural network, and training the deep neural network to determine a probability that data received by the deep neural network has features similar to key features of one or more keywords or key phrases, the training comprising providing the deep neural network with a second training set and adjusting the values for a first subset of the plurality of weights, wherein the second training set includes data representing the key features of the one or more keywords or key phrases.
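
    The two-stage recipe in the abstract (adjust all weights on a first training set, then adjust only a subset of the weights on a second set representing key features) can be sketched roughly as below. The network size, data, loss, and the choice of the output layer as the adjusted subset are assumptions for illustration, not details from the patent.

```python
# Hypothetical sketch of transfer learning: stage 1 adjusts every weight,
# stage 2 adjusts only the output-layer subset on keyword-like data.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(scale=0.1, size=(20, 16))   # hidden-layer weights
W2 = rng.normal(scale=0.1, size=(16, 1))    # output-layer weights

def forward(x):
    h = np.tanh(x @ W1)
    return h, 1.0 / (1.0 + np.exp(-(h @ W2)))   # probability-like output

def train(X, y, update_hidden, lr=0.1, steps=200):
    global W1, W2
    for _ in range(steps):
        h, p = forward(X)
        err = p - y[:, None]
        gW2 = h.T @ err / len(X)
        W2 -= lr * gW2                           # output-layer subset always adjusted
        if update_hidden:                        # stage 1: adjust all weights
            gh = (err @ W2.T) * (1 - h ** 2)
            W1 -= lr * X.T @ gh / len(X)

# Stage 1: general training set, every weight is adjusted.
X1, y1 = rng.normal(size=(256, 20)), rng.integers(0, 2, size=256).astype(float)
train(X1, y1, update_hidden=True)

# Stage 2: keyword-feature training set, only a subset of weights is adjusted.
X2, y2 = rng.normal(size=(64, 20)), rng.integers(0, 2, size=64).astype(float)
train(X2, y2, update_hidden=False)
```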

    Neural Networks For Speaker Verification

    Publication Number: US20170069327A1

    Publication Date: 2017-03-09

    Application Number: US14846187

    Filing Date: 2015-09-04

    Applicant: Google Inc.

    CPC classification number: G10L17/18 G10L17/02 G10L17/04

    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
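
    A rough, hypothetical sketch of the enrollment and verification flow the abstract describes: a network maps an utterance's features to a speaker representation, enrollment averages representations of the user's utterances, and verification scores a new utterance against the enrolled profile. The feature sizes, similarity threshold, and the randomly initialized stand-in network are assumptions; training on matching/non-matching speaker pairs is omitted.

```python
# Hypothetical sketch of on-device enrollment and verification using a
# stand-in (untrained) embedding network.
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(scale=0.1, size=(40, 12))      # stand-in embedding network

def speaker_representation(utterance_features):
    """Map utterance features (frames x 40) to a fixed-size speaker vector."""
    return np.tanh(utterance_features @ W).mean(axis=0)

def enroll(utterances):
    """Average the representations of the user's enrollment utterances."""
    reps = [speaker_representation(u) for u in utterances]
    profile = np.mean(reps, axis=0)
    return profile / np.linalg.norm(profile)

def verify(profile, utterance_features, threshold=0.5):
    """Cosine similarity between the enrolled profile and a new utterance."""
    rep = speaker_representation(utterance_features)
    score = float(profile @ (rep / np.linalg.norm(rep)))
    return score >= threshold, score

enrollment = [rng.normal(size=(100, 40)) for _ in range(3)]
profile = enroll(enrollment)
accepted, score = verify(profile, rng.normal(size=(80, 40)))
print(accepted, round(score, 3))
```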

    Speech recognition process

    Publication Number: US08775177B1

    Publication Date: 2014-07-08

    Application Number: US13665245

    Filing Date: 2012-10-31

    Applicant: Google Inc.

    CPC classification number: G10L15/10 G10L2015/085

    Abstract: A speech recognition process may perform the following operations: performing a preliminary recognition process on first audio to identify candidates for the first audio; generating first templates corresponding to the first audio, where each first template includes a number of elements; selecting second templates corresponding to the candidates, where the second templates represent second audio, and where each second template includes elements that correspond to the elements in the first templates; comparing the first templates to the second templates, where comparing comprises generating similarity metrics between the first templates and corresponding second templates; applying weights to the similarity metrics to produce weighted similarity metrics, where the weights are associated with corresponding second templates; and using the weighted similarity metrics to determine whether the first audio corresponds to the second audio.
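
    The comparison and scoring steps might look roughly like the sketch below: templates from the input audio are compared against stored templates for each candidate, the similarity metrics are weighted per template, and the weighted scores pick the best candidate. The cosine similarity, example words, templates, and weights are hypothetical placeholders.

```python
# Hypothetical sketch of weighted template comparison for candidate scoring.
import numpy as np

rng = np.random.default_rng(3)

def similarity(a, b):
    """Cosine similarity between two template vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# First templates derived from the input audio (e.g. 4 templates of 13 elements).
first_templates = rng.normal(size=(4, 13))

# Second templates and per-template weights for two made-up candidates.
candidates = {
    "hello": {"templates": first_templates + rng.normal(scale=0.1, size=(4, 13)),
              "weights": np.array([0.4, 0.3, 0.2, 0.1])},
    "yellow": {"templates": rng.normal(size=(4, 13)),
               "weights": np.array([0.25, 0.25, 0.25, 0.25])},
}

scores = {}
for word, entry in candidates.items():
    metrics = np.array([similarity(f, s)
                        for f, s in zip(first_templates, entry["templates"])])
    scores[word] = float(entry["weights"] @ metrics)   # weighted similarity

best = max(scores, key=scores.get)
print(scores, "->", best)
```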

    Multilingual, acoustic deep neural networks

    Publication Number: US09460711B1

    Publication Date: 2016-10-04

    Application Number: US13862541

    Filing Date: 2013-04-15

    Applicant: Google Inc.

    CPC classification number: G10L15/16 G10L15/063 G10L15/144

    Abstract: Methods and systems for processing multilingual DNN acoustic models are described. An example method may include receiving training data that includes a respective training data set for each of two or more languages. A multilingual deep neural network (DNN) acoustic model may be processed based on the training data. The multilingual DNN acoustic model may include a feedforward neural network having multiple layers of one or more nodes. Each node of a given layer may connect with a respective weight to each node of a subsequent layer, and the multiple layers of one or more nodes may include one or more shared hidden layers of nodes and a language-specific output layer of nodes corresponding to each of the two or more languages. Additionally, weights associated with the multiple layers of one or more nodes of the processed multilingual DNN acoustic model may be stored in a database.
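
    The architecture in the abstract (shared hidden layers feeding a separate output layer per language) can be sketched as below. Layer sizes, the two example languages, and the untrained random weights are assumptions; training on the per-language data sets is omitted.

```python
# Hypothetical sketch of a multilingual feedforward acoustic model with
# shared hidden layers and a language-specific output layer.
import numpy as np

rng = np.random.default_rng(4)

shared_layers = [rng.normal(scale=0.1, size=(40, 64)),
                 rng.normal(scale=0.1, size=(64, 64))]        # shared hidden layers
output_layers = {"en": rng.normal(scale=0.1, size=(64, 42)),  # language-specific
                 "fr": rng.normal(scale=0.1, size=(64, 38))}  # output layers

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(features, language):
    """Run the shared hidden layers, then the output layer for one language."""
    h = features
    for W in shared_layers:
        h = np.tanh(h @ W)
    return softmax(h @ output_layers[language])

frames = rng.normal(size=(10, 40))            # acoustic feature frames
print(forward(frames, "en").shape)            # (10, 42) English output posteriors
print(forward(frames, "fr").shape)            # (10, 38) French output posteriors
```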

    TRANSFER LEARNING FOR DEEP NEURAL NETWORK BASED HOTWORD DETECTION

    Publication Number: US20150127594A1

    Publication Date: 2015-05-07

    Application Number: US14230225

    Filing Date: 2014-03-31

    Applicant: Google Inc.

    CPC classification number: G06N7/005 G06N3/0454 G10L15/16 G10L2015/088

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes training a deep neural network with a first training set by adjusting values for each of a plurality of weights included in the neural network, and training the deep neural network to determine a probability that data received by the deep neural network has features similar to key features of one or more keywords or key phrases, the training comprising providing the deep neural network with a second training set and adjusting the values for a first subset of the plurality of weights, wherein the second training set includes data representing the key features of the one or more keywords or key phrases.

    ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

    Publication Number: US20150127337A1

    Publication Date: 2015-05-07

    Application Number: US14258139

    Filing Date: 2014-04-22

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G06N3/0454 G10L15/16 G10L15/183

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
