Systems and methods for speech recognition
    11.
    Granted invention patent (legal status: in force)

    Publication number: US09558741B2

    Publication date: 2017-01-31

    Application number: US14291138

    Filing date: 2014-05-30

    CPC classification number: G10L15/083 G10L15/1815 G10L15/183

    Abstract: Systems and methods are provided for speech recognition. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as a speech recognition result.
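
The pipeline in this abstract (confusion network, then word lattice, then best path) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the network is a list of slots mapping candidate syllables to probabilities, the dictionary maps syllable tuples to words, the score is a plain sum of log-probabilities with no language model, and the names `build_lattice` and `best_sequence` are hypothetical. English words stand in for the characters the patent refers to.

```python
import math

def build_lattice(network, dictionary):
    """Return edges (start, end, word, log_score) for every dictionary word
    whose syllables can be read off consecutive slots of the network."""
    edges = []
    for start in range(len(network)):
        for syllables, word in dictionary.items():
            end = start + len(syllables)
            if end > len(network):
                continue
            score = 0.0
            for slot, syl in zip(network[start:end], syllables):
                if syl not in slot:
                    break
                score += math.log(slot[syl])
            else:
                # Every syllable of the word was found in its slot.
                edges.append((start, end, word, score))
    return edges

def best_sequence(network, dictionary):
    """Dynamic programming over lattice positions; keeps the best-scoring
    word sequence that reaches each position."""
    edges = build_lattice(network, dictionary)
    best = {0: (0.0, [])}
    for pos in range(len(network)):
        if pos not in best:
            continue
        base_score, base_words = best[pos]
        for start, end, word, score in edges:
            if start != pos:
                continue
            cand = (base_score + score, base_words + [word])
            if end not in best or cand[0] > best[end][0]:
                best[end] = cand
    return best.get(len(network), (float("-inf"), []))[1]

# Toy inputs (hypothetical): four slots of competing syllable hypotheses.
network = [
    {"speech": 0.9, "beach": 0.1},
    {"rec": 0.8, "wreck": 0.2},
    {"og": 1.0},
    {"nize": 0.7, "nice": 0.3},
]
dictionary = {
    ("speech",): "speech",
    ("beach",): "beach",
    ("rec", "og", "nize"): "recognize",
    ("wreck",): "wreck",
    ("nice",): "nice",
}
result = best_sequence(network, dictionary)
```

A real decoder would also weight each lattice edge with a language-model score; the sketch leaves that out to keep the lattice construction visible.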

    Method and device for parallel processing in model training
    12.
    Granted invention patent (legal status: in force)

    Publication number: US09508347B2

    Publication date: 2016-11-29

    Application number: US14108237

    Filing date: 2013-12-16

    CPC classification number: G10L15/34 G06N3/02 G10L15/063 G10L15/16

    Abstract: A method and a device for training a DNN model include: at a device including one or more processors and memory: establishing an initial DNN model; dividing a training data corpus into a plurality of disjoint data subsets; for each of the plurality of disjoint data subsets, providing the data subset to a respective training processing unit of a plurality of training processing units operating in parallel, wherein the respective training processing unit applies a Stochastic Gradient Descent (SGD) process to update the initial DNN model to generate a respective DNN sub-model based on the data subset; and merging the respective DNN sub-models generated by the plurality of training processing units to obtain an intermediate DNN model, wherein the intermediate DNN model is established as either the initial DNN model for a next training iteration or a final DNN model in accordance with a preset convergence condition.
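
The split, train, and merge loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: a two-weight linear model stands in for the DNN, the merge rule is plain parameter averaging (the abstract does not say how sub-models are merged), the convergence condition is a small change in the weights, and the "parallel" units run sequentially. All function names are hypothetical.

```python
import random

def sgd_pass(weights, subset, lr=0.05):
    """One SGD pass over a data subset for a linear model (stand-in for a DNN)."""
    w = list(weights)
    for x, y in subset:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def merge(sub_models):
    """Merge sub-models by averaging their parameters (assumed merge rule)."""
    n = len(sub_models)
    return [sum(ws) / n for ws in zip(*sub_models)]

def train(data, num_units=4, max_iters=200, tol=1e-6):
    model = [0.0] * len(data[0][0])                            # initial model
    subsets = [data[i::num_units] for i in range(num_units)]   # disjoint subsets
    for _ in range(max_iters):
        # Run sequentially here; each call could run on its own processing unit.
        sub_models = [sgd_pass(model, s) for s in subsets]
        new_model = merge(sub_models)                          # intermediate model
        change = max(abs(a - b) for a, b in zip(new_model, model))
        model = new_model
        if change < tol:                                       # assumed convergence test
            break
    return model

# Noise-free synthetic data generated from known weights.
random.seed(0)
true_w = [2.0, -1.0]
data = []
for _ in range(200):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, true_w[0] * x[0] + true_w[1] * x[1]))
trained = train(data)
```

Each iteration starts every unit from the same merged model, which is what lets the intermediate model serve as the initial model for the next round.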

    Method and system for automatic speech recognition
    13.
    Granted invention patent (legal status: in force)

    Publication number: US09472190B2

    Publication date: 2016-10-18

    Application number: US14263958

    Filing date: 2014-04-28

    CPC classification number: G10L15/193 G10L15/083

    Abstract: A method of recognizing speech is provided that includes generating a decoding network that includes a primary sub-network and a classification sub-network. The primary sub-network includes a classification node corresponding to the classification sub-network. The classification sub-network corresponds to a group of uncommon words. A speech input is received and decoded by instantiating a token in the primary sub-network and passing the token through the primary network. When the token reaches the classification node, the method includes transferring the token to the classification sub-network and passing the token through the classification sub-network. When the token reaches an accept node of the classification sub-network, the method includes returning a result of the token passing through the classification sub-network to the primary sub-network. The result includes one or more words in the group of uncommon words. A string corresponding to the speech input is output that includes the one or more words.

    Method and apparatus for building a language model
    14.
    Granted invention patent (legal status: in force)

    Publication number: US09396724B2

    Publication date: 2016-07-19

    Application number: US14181263

    Filing date: 2014-02-14

    CPC classification number: G10L15/063 G10L15/183 G10L15/197

    Abstract: A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; and building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.

    Method and device for audio recognition
    15.
    Granted invention patent (legal status: in force)

    Publication number: US09373336B2

    Publication date: 2016-06-21

    Application number: US14103753

    Filing date: 2013-12-11

    Abstract: A method and device for performing audio recognition, including: collecting a first audio document to be recognized; initiating calculation of first characteristic information of the first audio document, including: conducting time-frequency analysis for the first audio document to generate a first preset number of phase channels; and extracting at least one peak value characteristic point from each phase channel of the first preset number of phase channels, where the at least one peak value characteristic point of each phase channel constitutes the peak value characteristic point sequence of said each phase channel; and obtaining a recognition result for the first audio document, wherein the recognition result is identified based on the first characteristic information, and wherein the first characteristic information is calculated based on the respective peak value characteristic point sequences of the preset number of phase channels.

    Language recognition based on vocabulary lists
    16.
    Granted invention patent (legal status: in force)

    Publication number: US09336197B2

    Publication date: 2016-05-10

    Application number: US14108224

    Filing date: 2013-12-16

    CPC classification number: G06F17/2735

    Abstract: A method is implemented at a computer to determine that certain information content is composed or compiled in a specific language selected among two or more similar languages. The computer integrates a first vocabulary list of a first language and a second vocabulary list of a second language into a comprehensive vocabulary list. The integrating includes analyzing the first vocabulary list in view of the second vocabulary list to identify a first vocabulary sub-list that is used in the first language, but not in the second language. The computer then identifies, in the information content, a plurality of expressions that are included in the comprehensive vocabulary list, and a subset of expressions that are included in the first vocabulary sub-list. Upon a determination that a total frequency of occurrence of the subset of expressions meets predetermined occurrence criteria, the computer determines that the information content is composed in the first language.
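
The vocabulary-list test in this abstract can be sketched in a few lines. The sketch assumes whitespace tokenization and a simple ratio threshold as the "predetermined occurrence criteria"; the function name, the `min_ratio` parameter, and the toy word lists (British versus American spellings, used only as a stand-in for two similar languages) are all hypothetical.

```python
def is_first_language(text, vocab_first, vocab_second, min_ratio=0.2):
    """Return True if enough recognized expressions are exclusive to the first language."""
    comprehensive = set(vocab_first) | set(vocab_second)   # integrated vocabulary list
    only_first = set(vocab_first) - set(vocab_second)      # sub-list unique to first language
    tokens = text.lower().split()
    known = [t for t in tokens if t in comprehensive]
    distinctive = [t for t in known if t in only_first]
    if not known:
        return False
    # Assumed criterion: share of distinctive expressions among recognized ones.
    return len(distinctive) / len(known) >= min_ratio

# Toy vocabulary lists (illustrative only).
vocab_british = ["my", "is", "the", "colour", "favourite"]
vocab_american = ["my", "is", "the", "color", "favorite"]

british_hit = is_first_language("My favourite colour is blue", vocab_british, vocab_american)
american_hit = is_first_language("My favorite color is blue", vocab_british, vocab_american)
```

Words shared by both lists carry no signal on their own, which is why only the sub-list exclusive to the first language is counted against the threshold.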

    INVITATION BEHAVIOR PREDICTION METHOD AND APPARATUS, AND STORAGE MEDIUM

    Publication number: US20190244115A1

    Publication date: 2019-08-08

    Application number: US16387705

    Filing date: 2019-04-18

    CPC classification number: G06N5/02 G06F16/285 G06Q10/04 G06Q50/00

    Abstract: In a method for invitation behavior prediction, group behavior feature information of a first user of a group is obtained. In addition, group relationship feature information of a second user is obtained. Further, group architecture information of the group, the group behavior feature information of the first user, and the group relationship feature information of the second user are input to an invitation prediction model, to obtain a target member user and a candidate invitation user of the target member user. The invitation prediction model is obtained by training the invitation prediction model based on a plurality of sample groups in a training set, and group relationship feature information of associated users of member users in the plurality of sample groups. Invitation prediction information is sent to the target member user to prompt the target member user to add the candidate invitation user to the group.

    MODEL-BASED AUTOMATIC CORRECTION OF TYPOGRAPHICAL ERRORS

    Publication number: US20190102373A1

    Publication date: 2019-04-04

    Application number: US16133440

    Filing date: 2018-09-17

    Abstract: A method is performed at a computer for automatically correcting typographical errors. The computer selects a target sentence, identifies a target word therein as having a typographical error, and identifies first and second sequences of words separated by the target word as context. After identifying, among a database of grammatically correct sentences, a set of sentences having the first and second sequences of words, each sentence including a replacement word, the computer selects a set of candidate grammatically correct sentences whose corresponding replacement words have similarities to the target word above a pre-set threshold. Finally, the computer chooses, among the set of candidate grammatically correct sentences, a fittest grammatically correct sentence according to a linguistic model and replaces the target word in the target sentence with the replacement word within the fittest grammatically correct sentence.
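
The context-matching correction described in this abstract can be sketched as below. The sketch makes several assumptions that are not in the abstract: the context is one word on each side of the target, similarity is the `difflib.SequenceMatcher` ratio, and the "linguistic model" is replaced by a simple count of how often each replacement word occurs among the candidates. The function name `correct_typo` and the toy corpus are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def correct_typo(sentence, idx, corpus, threshold=0.6):
    """Replace the word at position idx using corpus sentences with the same context."""
    words = sentence.split()
    target = words[idx]
    left = words[max(0, idx - 1):idx]      # first sequence (assumed: one word)
    right = words[idx + 1:idx + 2]         # second sequence (assumed: one word)
    candidates = []
    for sent in corpus:
        cw = sent.split()
        for i, w in enumerate(cw):
            same_context = cw[max(0, i - 1):i] == left and cw[i + 1:i + 2] == right
            # Keep replacement words whose similarity to the target exceeds the threshold.
            if same_context and SequenceMatcher(None, target, w).ratio() > threshold:
                candidates.append(w)
    if not candidates:
        return sentence
    # Stand-in for the linguistic model: pick the most frequent candidate.
    words[idx] = Counter(candidates).most_common(1)[0][0]
    return " ".join(words)

# Toy database of grammatically correct sentences (illustrative only).
corpus = [
    "I want to go home",
    "we want to go home",
    "I want to do more",
    "they want to get home",
]
fixed = correct_typo("I want to goe home", 3, corpus)
```

Swapping the frequency count for a real language-model score would only change the final selection step; the context match and similarity filter stay the same.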
