Augmented multi-tier classifier for multi-modal voice activity detection

Granted invention patent

US09892745B2 Augmented multi-tier classifier for multi-modal voice activity detection (in force)

Patent title: Augmented multi-tier classifier for multi-modal voice activity detection
Application number: US13974453
Filing date: 2013-08-23
Publication number: US09892745B2
Publication date: 2018-02-13
Inventors: Dimitrios Dimitriadis, Eric Zavesky, Matthew Burlick
Applicant: AT&T Intellectual Property I, L.P.
Applicant address: Atlanta, GA, US
Patentee: AT&T Intellectual Property I, L.P.
Current patentee: AT&T Intellectual Property I, L.P.
Current patentee address: Atlanta, GA, US
Main classification: G10L15/24
IPC classification: G10L15/24; G10L25/78; G10L25/84; G06K9/00

Abstract:

Disclosed herein are systems, methods, and computer-readable storage media for detecting voice activity in a media signal in an augmented, multi-tier classifier architecture. A system configured to practice the method can receive, from a first classifier, a first voice activity indicator detected in a first modality for a human subject. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at the same time, and wherein the first modality and the second modality are different. The system can concatenate, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determine voice activity based on the classifier output.
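
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the choice of audio and visual as the two modalities, the two toy first-tier scoring functions, the logistic third tier, and every name, weight, and threshold below are assumptions made purely for illustration. The only part taken from the abstract is the structure: two per-modality indicators are concatenated with the original features, and a third classifier decides on that augmented vector.

```python
import math
from typing import List, Sequence


def audio_indicator(audio_features: Sequence[float]) -> float:
    """Hypothetical first-tier audio classifier: mean energy squashed into [0, 1)."""
    energy = sum(x * x for x in audio_features) / len(audio_features)
    return 1.0 - math.exp(-energy)


def visual_indicator(visual_features: Sequence[float]) -> float:
    """Hypothetical first-tier visual classifier: mean motion magnitude capped at 1."""
    motion = sum(abs(x) for x in visual_features) / len(visual_features)
    return min(1.0, motion)


def augmented_input(audio_features: Sequence[float],
                    visual_features: Sequence[float]) -> List[float]:
    """Concatenate both first-tier indicators with the original features."""
    return [
        audio_indicator(audio_features),
        visual_indicator(visual_features),
        *audio_features,
        *visual_features,
    ]


def third_tier_score(vector: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Assumed third-tier classifier: a logistic model over the augmented vector."""
    if len(vector) != len(weights):
        raise ValueError("weights must match the augmented vector length")
    z = bias + sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))


def detect_voice_activity(audio_features: Sequence[float],
                          visual_features: Sequence[float],
                          weights: Sequence[float],
                          bias: float,
                          threshold: float = 0.5) -> bool:
    """Decide voice activity from the third-tier classifier output."""
    vector = augmented_input(audio_features, visual_features)
    return third_tier_score(vector, weights, bias) >= threshold


# Illustrative weights only: the two indicators dominate, raw features are ignored.
WEIGHTS = [3.0, 3.0, 0.0, 0.0, 0.0, 0.0]
BIAS = -3.0

speaking = detect_voice_activity([0.9, -0.8], [0.7, 0.6], WEIGHTS, BIAS)
silent = detect_voice_activity([0.01, 0.0], [0.02, 0.01], WEIGHTS, BIAS)
```

In a real system the weights of the third tier would be learned from labeled data, and both feature vectors would be extracted from the same time window so that the indicators describe the subject at the same instant, as the abstract requires.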

Publication/grant documents

US20150058004A1 AUGMENTED MULTI-TIER CLASSIFIER FOR MULTI-MODAL VOICE ACTIVITY DETECTION Publication/grant date: 2015-02-26

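IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/24	.Speech recognition using non-acoustical features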