Method and Device for Extracting Speech Features Based on Artificial Intelligence

Granted invention patent

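Patent title: Method and Device for Extracting Speech Features Based on Artificial Intelligence
Application number: CN201611239071.2

Filing date: 2016-12-28
Publication (announcement) number: CN106710589B

Publication (announcement) date: 2019-07-30
Inventors: Li Chao, Li Xiangang
Applicant: Baidu Online Network Technology (Beijing) Co., Ltd.
Applicant address: 3rd Floor, Baidu Building, No. 10 Shangdi 10th Street, Haidian District, Beijing
Patentee: Baidu Online Network Technology (Beijing) Co., Ltd.
Current patentee: Baidu Online Network Technology (Beijing) Co., Ltd.
Current patentee address: 3rd Floor, Baidu Building, No. 10 Shangdi 10th Street, Haidian District, Beijing
Agency: Tsingyihua Intellectual Property LLC (北京清亦华知识产权代理事务所)
Agent: Song Hecheng
Main classification: G10L15/02
IPC classification: G10L15/02 ; G10L25/18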

Abstract:

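The present invention provides a method and device for extracting speech features based on artificial intelligence. The method includes: performing spectrum analysis on the speech to be recognized to obtain a spectrogram of that speech; and performing feature extraction on the spectrogram using the Inception convolution structure from image recognition algorithms to obtain the speech features of the speech to be recognized. In the present invention, spectrum analysis converts the continuous speech to be recognized into a spectrogram representation. Because the Inception convolution structure is an effective image recognition approach that can accurately identify image features, applying it to the spectrogram extracts relatively accurate speech features from the speech to be recognized, which in turn can improve the accuracy of speech recognition.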
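The spectrum-analysis step described in the abstract can be illustrated with a short sketch. The abstract does not specify frame length, hop size, window type, or scaling, so `frame_len`, `hop`, the Hann window, and the log-power scaling below are all illustrative assumptions, and `spectrogram` is a hypothetical helper name rather than anything defined in the patent. The sketch uses a naive DFT from the Python standard library for clarity; a practical system would use an FFT.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a log-power spectrogram as a list of frames.

    Each frame is a list of frame_len // 2 + 1 values, one per frequency bin.
    Frame length, hop, and window are assumed values, not taken from the patent.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for bin k; an FFT would be used in practice.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            # Small constant avoids log(0) for silent bins.
            row.append(math.log(abs(acc) ** 2 + 1e-10))
        frames.append(row)
    return frames


# Usage: a 1 kHz tone sampled at 8 kHz falls in bin 1000 / (8000 / 64) = 8.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

The result is a two-dimensional array (time frames by frequency bins), which is what allows it to be treated as an image by a convolutional feature extractor.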

Abstract (English):

Embodiments of the present disclosure provide a method and a device for extracting a speech feature based on artificial intelligence. The method includes performing a spectrum analysis on a speech to be recognized, to obtain a spectrogram of the speech; and extracting features of the spectrogram by using an Inception convolution structure of an image recognition algorithm, to obtain the speech feature of the speech. In embodiments, by performing the spectrum analysis on the speech to be recognized, the continuous speech to be recognized is converted into the spectrogram. As the Inception convolution structure is an effective image recognition approach that can accurately recognize features of an image, the spectrogram is recognized with the Inception convolution structure to extract a relatively accurate speech feature from the speech to be recognized. Thus, the accuracy of speech recognition is improved.
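As a rough illustration of what an Inception-style convolution structure does to a spectrogram, the sketch below runs several convolution branches with different kernel sizes in parallel, plus a pooling branch, and stacks their outputs as feature channels. This is a simplified, single-channel version of the generic Inception idea from image recognition, not the network claimed in the patent: the abstract gives no kernel sizes, branch layout, or weights, so the 1x1/3x3/5x5 branches, the averaging kernels, and all function names here are assumptions. It also omits the 1x1 reduction layers that full Inception modules place before the larger convolutions.

```python
def conv2d_same(image, kernel):
    """2-D cross-correlation with zero padding; output size equals input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def max_pool_same(image, size=3):
    """Stride-1 max pooling over a size x size neighbourhood, keeping input size."""
    h, w = len(image), len(image[0])
    r = size // 2
    return [[max(image[y][x]
                 for y in range(max(0, i - r), min(h, i + r + 1))
                 for x in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)] for i in range(h)]


def relu(image):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in image]


def mean_kernel(size):
    # Placeholder averaging weights; a trained network would learn these.
    return [[1.0 / (size * size)] * size for _ in range(size)]


def inception_block(image):
    """Run four parallel branches and stack their outputs as channels."""
    branches = [
        conv2d_same(image, mean_kernel(1)),                 # 1x1 convolution
        conv2d_same(image, mean_kernel(3)),                 # 3x3 convolution
        conv2d_same(image, mean_kernel(5)),                 # 5x5 convolution
        conv2d_same(max_pool_same(image), mean_kernel(1)),  # pooling, then 1x1
    ]
    return [relu(b) for b in branches]


# Usage on a tiny 4 x 6 "spectrogram" (time frames x frequency bins).
toy = [[float((i * 6 + j) % 5) for j in range(6)] for i in range(4)]
features = inception_block(toy)
print(len(features), len(features[0]), len(features[0][0]))
```

Because every branch preserves the input height and width, the outputs can be concatenated along the channel axis, so patterns at several time-frequency scales are captured in one layer.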

Published/granted documents

CN106710589A Method and Device for Extracting Speech Features Based on Artificial Intelligence. Publication/grant date: 2017-05-24

IPC classification:

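G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/02	.Feature extraction for speech recognition; selection of recognition unit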