-
Publication No.: US09940935B2
Publication date: 2018-04-10
Application No.: US15240696
Filing date: 2016-08-18
Inventor: Eryu Wang , Li Lu , Xiang Zhang , Haibo Liu , Lou Li , Feng Rao , Duling Lu , Shuai Yue , Bo Chen
Abstract: A method is performed at a device having one or more processors and memory. The device establishes a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data. The device establishes a second-level DNN model by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, wherein the second-level DNN model specifies a plurality of high-level voiceprint features. Using the second-level DNN model, the device registers a first high-level voiceprint feature sequence for a user based on a registration speech sample received from the user. The device performs speaker verification for the user based on the first high-level voiceprint feature sequence registered for the user.
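The abstract does not say how a test sample is compared with the registered feature sequence. The following is a minimal sketch of one common choice, assuming each utterance is reduced to a single vector by averaging its frame-level features and that cosine similarity against a fixed threshold decides acceptance. The function names, the threshold of 0.8, and the toy feature values are illustrative assumptions; in practice the frames would be outputs of the second-level DNN model.

```python
import math

def mean_vector(frames):
    """Average a sequence of frame-level feature vectors into one vector."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify_speaker(registered_frames, test_frames, threshold=0.8):
    """Accept the test sample if its averaged features are close to the registered ones.

    The threshold is an assumed value, not one taken from the patent.
    """
    score = cosine_similarity(mean_vector(registered_frames),
                              mean_vector(test_frames))
    return score >= threshold, score

# Toy feature sequences standing in for high-level voiceprint features.
registered = [[0.9, 0.1, 0.0], [1.1, -0.1, 0.2]]
same_user = [[1.0, 0.05, 0.1], [0.95, 0.0, 0.1]]
other_user = [[0.0, 1.0, 0.9], [0.1, 0.9, 1.1]]

accepted_same, _ = verify_speaker(registered, same_user)
accepted_other, _ = verify_speaker(registered, other_user)
```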
-
Publication No.: US09779728B2
Publication date: 2017-10-03
Application No.: US14160808
Filing date: 2014-01-22
IPC: G10L15/00 , G10L15/04 , G10L15/18 , G10L15/26 , G10L15/187 , G06F17/27 , G10L15/183
CPC classification number: G10L15/1815 , G06F17/27 , G06F17/2725 , G10L15/04 , G10L15/183 , G10L15/187 , G10L15/26 , G10L15/265
Abstract: Systems and methods are provided for adding punctuations. For example, one or more first feature units are identified in a voice file taken as a whole; the voice file is divided into multiple segments by detecting silences in the voice file; one or more second feature units are identified in the voice file; a first aggregate weight of first punctuation states of the voice file and a second aggregate weight of second punctuation states of the voice file are determined, using a language model established based on word separation and third semantic features; a weighted calculation is performed to generate a third aggregate weight based on a linear combination associated with the first aggregate weight and the second aggregate weight; and one or more final punctuations are added to the voice file based on at least information associated with the third aggregate weight.
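Two steps of the abstract above lend themselves to a short illustration: dividing the voice file at silences, and linearly combining the two aggregate weights. The sketch below assumes per-frame energies as input, an energy threshold for silence, and a single mixing coefficient `alpha`; it also simplifies the aggregate weights to one score per punctuation state at a single position. None of these details come from the patent.

```python
def split_on_silence(energies, threshold=0.1, min_silence=2):
    """Return (start, end) frame index pairs of non-silent runs; end is exclusive.

    A run is closed once at least `min_silence` consecutive frames fall
    below `threshold` (both values are assumptions).
    """
    segments, start, quiet = [], None, 0
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        segments.append((start, len(energies) - quiet))
    return segments

def combine_weights(first, second, alpha=0.5):
    """Third weight = alpha * first + (1 - alpha) * second, per punctuation state."""
    states = set(first) | set(second)
    return {s: alpha * first.get(s, 0.0) + (1.0 - alpha) * second.get(s, 0.0)
            for s in states}

def pick_punctuation(combined):
    """Choose the punctuation state with the largest combined weight."""
    return max(combined, key=combined.get)

# Weights from whole-file feature units and from segment-based feature units.
first = {",": 0.2, ".": 0.7, "": 0.1}
second = {",": 0.6, ".": 0.3, "": 0.1}
```

With equal mixing the period wins (0.5 against 0.4 for the comma); shifting `alpha` toward the segment-based weights, for example `alpha=0.2`, favors the comma instead.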
-
Publication No.: US20140350939A1
Publication date: 2014-11-27
Application No.: US14160808
Filing date: 2014-01-22
CPC classification number: G10L15/1815 , G06F17/27 , G06F17/2725 , G10L15/04 , G10L15/183 , G10L15/187 , G10L15/26 , G10L15/265
Abstract: Systems and methods are provided for adding punctuations. For example, one or more first feature units are identified in a voice file taken as a whole; the voice file is divided into multiple segments; one or more second feature units are identified in the voice file; a first aggregate weight of first punctuation states of the voice file and a second aggregate weight of second punctuation states of the voice file are determined, using a language model established based on word separation and third semantic features; a weighted calculation is performed to generate a third aggregate weight based on at least information associated with the first aggregate weight and the second aggregate weight; and one or more final punctuations are added to the voice file based on at least information associated with the third aggregate weight.
-
Publication No.: US20140350934A1
Publication date: 2014-11-27
Application No.: US14291138
Filing date: 2014-05-30
Inventor: Lou Li , Li Lu , Xiang Zhang , Feng Rao , Shuai Yue , Bo Chen , Jianxiong Ma , Haibo Liu
IPC: G10L17/22
CPC classification number: G10L15/083 , G10L15/1815 , G10L15/183
Abstract: Systems and methods are provided for voice identification. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as an identification result.
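The pipeline in this abstract (syllable confusion network, then word lattice, then best sequence) can be sketched as follows. This is an illustrative reading, not the patented method: it assumes the confusion network is a list of slots mapping candidate syllables to posterior probabilities, that the phonetic dictionary maps syllable tuples to words, and that the best path simply maximizes the product of syllable posteriors. A real system would normally add language-model scores. The toy syllables and words are made up.

```python
import math
from itertools import product

def build_word_lattice(confusion_network, phonetic_dict, max_len=2):
    """Return edges (start, end, word, log_score) for every dictionary word
    whose syllables match a path through consecutive slots."""
    edges = []
    n = len(confusion_network)
    for start in range(n):
        for length in range(1, max_len + 1):
            end = start + length
            if end > n:
                break
            slots = confusion_network[start:end]
            # Enumerate one candidate syllable per slot.
            for syllables in product(*slots):
                for word in phonetic_dict.get(syllables, []):
                    log_score = sum(math.log(slot[s])
                                    for slot, s in zip(slots, syllables))
                    edges.append((start, end, word, log_score))
    return edges

def best_sequence(edges, n_slots):
    """Dynamic programming over slot positions; returns the best word list or None."""
    best = {0: (0.0, [])}
    for pos in range(1, n_slots + 1):
        for start, end, word, score in edges:
            if end == pos and start in best:
                candidate = best[start][0] + score
                if pos not in best or candidate > best[pos][0]:
                    best[pos] = (candidate, best[start][1] + [word])
    return best[n_slots][1] if n_slots in best else None

# Toy confusion network with two slots and a small phonetic dictionary.
network = [{"hel": 0.7, "hal": 0.3}, {"lo": 0.6, "low": 0.4}]
dictionary = {("hel", "lo"): ["hello"], ("hal",): ["hall"],
              ("lo",): ["lo"], ("low",): ["low"]}

lattice = build_word_lattice(network, dictionary)
result = best_sequence(lattice, len(network))
```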
-
Publication No.: US10013985B2
Publication date: 2018-07-03
Application No.: US14958606
Filing date: 2015-12-03
Inventor: Shuai Yue , Xiang Zhang , Li Lu , Feng Rao , Eryu Wang , Haibo Liu , Bo Chen , Jian Liu , Lu Li
CPC classification number: G10L17/24 , G06F3/167 , G10L15/22 , G10L17/02 , G10L17/16 , G10L17/26 , G10L2015/223
Abstract: The present application discloses a method, an electronic system and a non-transitory computer readable storage medium for recognizing audio commands in an electronic device. The electronic device obtains audio data based on an audio signal provided by a user and extracts characteristic audio fingerprint features from the audio data. The electronic device further determines whether the corresponding audio signal is generated by an authorized user by comparing the characteristic audio fingerprint features with an audio fingerprint model for the authorized user and with a universal background model that represents user-independent audio fingerprint features, respectively. When the corresponding audio signal is generated by the authorized user of the electronic device, an audio command is extracted from the audio data, and an operation is performed according to the audio command.
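Comparing features against both a user model and a universal background model is commonly done with a log-likelihood ratio. The sketch below illustrates that idea under strong simplifications: each model is a single diagonal-covariance Gaussian rather than a mixture, the decision threshold is 0, and `recognize_command` is a hypothetical callback standing in for the command-extraction step. None of these specifics are stated in the abstract.

```python
import math

def diag_gaussian_loglik(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def avg_loglik(frames, model):
    """Average per-frame log-likelihood under a (mean, variance) model."""
    mean, var = model
    return sum(diag_gaussian_loglik(f, mean, var) for f in frames) / len(frames)

def is_authorized(frames, user_model, ubm, threshold=0.0):
    """Accept if the user model explains the frames better than the background model."""
    llr = avg_loglik(frames, user_model) - avg_loglik(frames, ubm)
    return llr > threshold, llr

def handle_audio(frames, user_model, ubm, recognize_command):
    """Extract a command only when the speaker is accepted (callback is hypothetical)."""
    accepted, _ = is_authorized(frames, user_model, ubm)
    return recognize_command(frames) if accepted else None

# Toy models: (mean vector, variance vector).
user_model = ([1.0, 2.0], [0.5, 0.5])
ubm = ([0.0, 0.0], [2.0, 2.0])
owner_frames = [[1.1, 1.9], [0.9, 2.2]]
stranger_frames = [[-0.5, 0.2], [0.3, -0.4]]
```

A production system would typically use Gaussian mixture models and tune the threshold on held-out data.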
-
Publication No.: US09558741B2
Publication date: 2017-01-31
Application No.: US14291138
Filing date: 2014-05-30
Inventor: Lou Li , Li Lu , Xiang Zhang , Feng Rao , Shuai Yue , Bo Chen , Jianxiong Ma , Haibo Liu
IPC: G10L15/28 , G10L15/08 , G10L15/18 , G10L15/183
CPC classification number: G10L15/083 , G10L15/1815 , G10L15/183
Abstract: Systems and methods are provided for speech recognition. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as a speech recognition result.
-
Publication No.: US09508347B2
Publication date: 2016-11-29
Application No.: US14108237
Filing date: 2013-12-16
CPC classification number: G10L15/34 , G06N3/02 , G10L15/063 , G10L15/16
Abstract: A method and a device for training a DNN model include: at a device including one or more processors and memory: establishing an initial DNN model; dividing a training data corpus into a plurality of disjoint data subsets; for each of the plurality of disjoint data subsets, providing the data subset to a respective training processing unit of a plurality of training processing units operating in parallel, wherein the respective training processing unit applies a Stochastic Gradient Descent (SGD) process to update the initial DNN model to generate a respective DNN sub-model based on the data subset; and merging the respective DNN sub-models generated by the plurality of training processing units to obtain an intermediate DNN model, wherein the intermediate DNN model is established as either the initial DNN model for a next training iteration or a final DNN model in accordance with a preset convergence condition.
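The training loop described in this abstract (split the corpus, run SGD per subset from a shared starting model, merge, repeat until convergence) can be sketched compactly. The example below makes several assumptions for brevity: a two-parameter linear model stands in for the DNN, the "parallel" units are run one after another, merging is done by plain parameter averaging, and convergence means the merged parameters change by less than `tol`. The abstract itself does not specify the merge rule or the convergence test.

```python
def sgd_epoch(weights, subset, lr=0.1):
    """One SGD pass of the linear model y = w*x + b over one data subset."""
    w, b = weights
    for x, y in subset:
        err = (w * x + b) - y      # gradient of 0.5 * err**2 w.r.t. the prediction
        w -= lr * err * x
        b -= lr * err
    return (w, b)

def split_disjoint(data, n_parts):
    """Partition the data into n_parts disjoint, interleaved subsets."""
    return [data[i::n_parts] for i in range(n_parts)]

def merge(sub_models):
    """Merge sub-models by averaging their parameters (an assumed merge rule)."""
    n = len(sub_models)
    return tuple(sum(m[i] for m in sub_models) / n
                 for i in range(len(sub_models[0])))

def train(data, n_units=4, max_iters=1000, tol=1e-9):
    """Iterate: train one sub-model per subset from the shared model, then merge."""
    model = (0.0, 0.0)                                 # initial model
    subsets = split_disjoint(data, n_units)
    for _ in range(max_iters):
        # Each call would run on its own training processing unit in parallel.
        sub_models = [sgd_epoch(model, s) for s in subsets]
        new_model = merge(sub_models)                  # intermediate model
        change = max(abs(a - b) for a, b in zip(new_model, model))
        model = new_model
        if change < tol:                               # preset convergence condition
            break
    return model

# Noise-free samples of y = 2x + 1.
data = [(i / 10, 2 * (i / 10) + 1) for i in range(-10, 11)]
w, b = train(data)
```

On this noise-free data the merged model approaches w = 2 and b = 1, because every sub-model shares the same fixed point.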
-
Publication No.: US10699059B2
Publication date: 2020-06-30
Application No.: US15184552
Filing date: 2016-06-16
Inventor: Haibo Liu
IPC: G06F40/109 , G06F16/11 , G06F40/126
Abstract: The present disclosure is applicable to the communications field and provides a character updating method and apparatus. The method includes: receiving a character update request sent by a client, the character update request carrying the Unicode value of a character; searching for a file whose file name is the same as the Unicode value of the character, the file being configured to store single-character data, the single-character data being obtained by resolving the character data stored in a font into data for individual characters; and sending the found file to the client, so that the client updates the corresponding character according to the character data in the received file.
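The server-side lookup described above can be illustrated with a small sketch. It assumes that file names are the code point written as upper-case hexadecimal with at least four digits, and it uses placeholder byte strings in place of real glyph data; parsing an actual font file is outside the scope of the example. The function names are hypothetical.

```python
import os
import tempfile

def split_font(glyphs, directory):
    """Write each character's data to its own file named after its code point."""
    for char, data in glyphs.items():
        name = format(ord(char), "04X")            # e.g. "中" -> "4E2D"
        with open(os.path.join(directory, name), "wb") as fh:
            fh.write(data)

def handle_update_request(unicode_hex, directory):
    """Return the single-character data for the requested code point, or None."""
    try:
        # Normalizing through int() also rejects path-like input such as "../x".
        name = format(int(unicode_hex, 16), "04X")
    except ValueError:
        return None
    path = os.path.join(directory, name)
    if not os.path.isfile(path):
        return None
    with open(path, "rb") as fh:
        return fh.read()

# Demo with placeholder glyph data in a temporary directory.
with tempfile.TemporaryDirectory() as font_dir:
    split_font({"中": b"glyph-data-4E2D", "A": b"glyph-data-0041"}, font_dir)
    found = handle_update_request("4e2d", font_dir)
    missing = handle_update_request("4E2E", font_dir)
    rejected = handle_update_request("../etc", font_dir)
```

The client would then replace its cached data for that character with the bytes it receives.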
-
Publication No.: US09811517B2
Publication date: 2017-11-07
Application No.: US14148579
Filing date: 2014-01-06
Inventor: Haibo Liu , Eryu Wang , Xiang Zhang , Li Lu , Shuai Yue , Qiuge Liu , Bo Chen , Jian Liu , Lu Li
CPC classification number: G06F17/273 , G06F17/2775 , G06F17/2785 , G06F17/289 , G10L15/265
Abstract: A method of processing information content based on a Chinese language model is performed at a computer, the method including: identifying a plurality of expressions in the information content extracted from a speech input through speech recognition that is queued to be processed; dividing the expressions into a plurality of characteristic units according to semantic features and predetermined characteristics associated with each characteristic unit, each including a subset of the expressions and the predetermined characteristics at least including a respective integer number of expressions that are included in the characteristic unit; extracting, from the Chinese language model, a plurality of probabilities for punctuation marks associated with each characteristic unit; and in accordance with the probabilities, associating a respective punctuation mark with each characteristic unit included in the information content. The method further comprises adding punctuation marks based on a weight determined for each punctuation mark.
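One way to read the steps above is as a lookup of punctuation probabilities for units made of an integer number of consecutive expressions. The sketch below assumes the language model is a dictionary from expression tuples to per-mark probabilities, that units of one and two expressions ending at the current position are used, and that the weight of a mark is the sum of its probabilities over those units. English tokens are used for readability; the model contents and the summing rule are assumptions, not details from the patent.

```python
def characteristic_units(expressions, end, max_size=2):
    """Units of 1..max_size consecutive expressions that end at index `end`."""
    return [tuple(expressions[end - size + 1:end + 1])
            for size in range(1, max_size + 1)
            if end - size + 1 >= 0]

def punctuate(expressions, model, max_size=2):
    """Attach to each expression the mark with the largest summed weight.

    An empty-string mark means "no punctuation after this expression".
    """
    out = []
    for i, expr in enumerate(expressions):
        weights = {}
        for unit in characteristic_units(expressions, i, max_size):
            for mark, prob in model.get(unit, {}).items():
                weights[mark] = weights.get(mark, 0.0) + prob
        mark = max(weights, key=weights.get) if weights else ""
        out.append(expr + mark)
    return " ".join(out)

# Toy model: probability of each mark following a unit.
model = {
    ("today",): {"": 0.9, ",": 0.1},
    ("is",): {"": 1.0},
    ("sunny",): {".": 0.5, "": 0.5},
    ("is", "sunny"): {".": 0.8, "": 0.2},
}
sentence = punctuate(["today", "is", "sunny"], model)
```

Here the two-expression unit ("is", "sunny") tips the final position toward a period, giving "today is sunny."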
-
Publication No.: US20160299877A1
Publication date: 2016-10-13
Application No.: US15184552
Filing date: 2016-06-16
Inventor: Haibo Liu
Abstract: The present disclosure is applicable to the communications field and provides a character updating method and apparatus. The method includes: receiving a character update request sent by a client, the character update request carrying the Unicode value of a character; searching for a file whose file name is the same as the Unicode value of the character, the file being configured to store single-character data, the single-character data being obtained by resolving the character data stored in a font into data for individual characters; and sending the found file to the client, so that the client updates the corresponding character according to the character data in the received file.