Deep learning driven multi-channel filtering for speech enhancement

Granted invention patent

US10546593B2 Deep learning driven multi-channel filtering for speech enhancement (Active)

Patent title: Deep learning driven multi-channel filtering for speech enhancement
Application number: US15830955

Filing date: 2017-12-04
Publication number: US10546593B2

Publication date: 2020-01-28
Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
Applicant: Apple Inc.
Applicant address: US CA Cupertino
Assignee: APPLE INC.
Current assignee: APPLE INC.
Current assignee address: US CA Cupertino
Agency: Womble Bond Dickinson (US) LLP
Main classification: G10L21/00
IPC classifications: G10L21/00; G10L19/00; G10L21/02; G10L15/02; G10L21/0232; G10L25/30; H04R1/40; G10L25/03; G10L21/0208

Abstract:

A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A deep neural network (DNN) based speech presence probability (SPP) value is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.
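
The per-frame flow in the abstract (features, then SPP, then filter configuration, then a single-channel output) can be illustrated with a minimal NumPy sketch. The abstract does not say which features are used, how the DNN is built, or what kind of multi-channel filter is configured, so everything below is an assumption made for illustration: the features are log powers, the DNN is replaced by a fixed logistic function, the SPP gates a recursive noise covariance update, and the filter is an MVDR beamformer. All function names and parameters (`extract_features`, `dnn_spp`, `update_noise_cov`, `mvdr_weights`, `alpha`) are hypothetical.

```python
import numpy as np

def extract_features(frame, side_info):
    # frame: (channels, bins) complex spectrum of the current frame.
    # side_info: (bins,) complex spectrum, e.g. a noise estimate.
    # Assumed features: per-bin log power of the pickup and of the side information.
    pickup_power = np.mean(np.abs(frame) ** 2, axis=0)
    side_power = np.abs(side_info) ** 2
    return np.stack([np.log(pickup_power + 1e-12),
                     np.log(side_power + 1e-12)], axis=-1)

def dnn_spp(features):
    # Stand-in for a trained DNN (assumption): a logistic function of the
    # log power ratio, giving one SPP value in [0, 1] per frequency bin.
    return 1.0 / (1.0 + np.exp(-(features[..., 0] - features[..., 1])))

def update_noise_cov(noise_cov, frame, spp, alpha=0.9):
    # noise_cov: (bins, channels, channels). A high SPP freezes the update,
    # a low SPP lets the current frame refresh the noise statistics.
    outer = np.einsum('cb,db->bcd', frame, frame.conj())
    a = alpha + (1.0 - alpha) * spp
    return a[:, None, None] * noise_cov + (1.0 - a)[:, None, None] * outer

def mvdr_weights(noise_cov, steering):
    # steering: (bins, channels). w = Phi^-1 d / (d^H Phi^-1 d) per bin.
    num = np.linalg.solve(noise_cov, steering[..., None])[..., 0]
    den = np.einsum('bc,bc->b', steering.conj(), num)
    return num / den[:, None]

def filter_frame(weights, frame):
    # Single-channel output per bin: y = w^H x.
    return np.einsum('bc,cb->b', weights.conj(), frame)

# Synthetic data standing in for one frame of a 4-microphone pickup.
rng = np.random.default_rng(0)
channels, bins = 4, 8
frame = rng.standard_normal((channels, bins)) + 1j * rng.standard_normal((channels, bins))
side_info = rng.standard_normal(bins) + 1j * rng.standard_normal(bins)
steering = np.ones((bins, channels), dtype=complex)  # assumed broadside source
noise_cov = np.tile(np.eye(channels, dtype=complex), (bins, 1, 1))

spp = dnn_spp(extract_features(frame, side_info))
noise_cov = update_noise_cov(noise_cov, frame, spp)
weights = mvdr_weights(noise_cov, steering)
output = filter_frame(weights, frame)
```

In this sketch the SPP only controls how quickly the noise covariance adapts. Other ways of applying the SPP to the filter, such as a post-filter gain or a speech covariance estimate, would fit the abstract equally well.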

Published/granted documents

US20190172476A1 DEEP LEARNING DRIVEN MULTI-CHANNEL FILTERING FOR SPEECH ENHANCEMENT Publication/grant date: 2019-06-06

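IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)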