- Patent title: METHOD AND APPARATUS FOR SPEECH SOURCE SEPARATION BASED ON A CONVOLUTIONAL NEURAL NETWORK
- Application number: US17611121
- Filing date: 2020-05-13
- Publication number: US20220223144A1
- Publication date: 2022-07-14
- Inventors: Jundai SUN, Zhiwei SHUANG, Lie LU, Shaofan YANG, Jia DAI
- Applicant: Dolby Laboratories Licensing Corporation
- Applicant address: US CA San Francisco
- Patentee: Dolby Laboratories Licensing Corporation
- Current patentee: Dolby Laboratories Licensing Corporation
- Current patentee address: US CA San Francisco
- Priority: CN PCT/CN2019/086769 2019-05-14, EP19188010.3 2019-07-24
- International application: PCT/US2020/032762 WO 2020-05-13
- Main classification: G10L15/20
- IPC classification: G10L15/20; G10L15/16; G10L15/22; G10L21/0308; G10L25/18; G06N3/08
Abstract:
Described herein is a method for Convolutional Neural Network (CNN) based speech source separation, wherein the method includes the steps of: (a) providing multiple frames of a time-frequency transform of an original noisy speech signal; (b) inputting the time-frequency transform of said multiple frames into an aggregated multi-scale CNN having a plurality of parallel convolution paths; (c) extracting and outputting, by each parallel convolution path, features from the input time-frequency transform of said multiple frames; (d) obtaining an aggregated output of the outputs of the parallel convolution paths; and (e) generating an output mask for extracting speech from the original noisy speech signal based on the aggregated output. Described herein are further an apparatus for CNN based speech source separation as well as a respective computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
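Steps (a) to (e) of the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the patented implementation. Everything beyond the abstract is an assumption made for illustration: the toy magnitude values, the fixed averaging kernels standing in for learned filters, the use of different kernel sizes as the "multi-scale" element, element-wise summation as the aggregation, and a sigmoid to turn the aggregated output into a mask. All function names are hypothetical.

```python
import math

def conv2d_same(x, kernel):
    """Zero-padded 2-D cross-correlation; output has the same shape as x."""
    rows, cols = len(x), len(x[0])
    kr, kc = len(kernel), len(kernel[0])
    pr, pc = kr // 2, kc // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(kr):
                for b in range(kc):
                    ii, jj = i + a - pr, j + b - pc
                    if 0 <= ii < rows and 0 <= jj < cols:
                        acc += x[ii][jj] * kernel[a][b]
            out[i][j] = acc
    return out

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def box_kernel(size):
    # Assumption: a fixed averaging filter stands in for learned weights.
    w = 1.0 / (size * size)
    return [[w] * size for _ in range(size)]

def estimate_mask(frames, kernel_sizes=(3, 5)):
    # (b)/(c) feed the same input to each parallel path; each path
    # extracts features at its own scale (assumed: its own kernel size).
    path_outputs = [relu(conv2d_same(frames, box_kernel(k)))
                    for k in kernel_sizes]
    # (d) aggregate the path outputs (assumed: element-wise sum).
    rows, cols = len(frames), len(frames[0])
    aggregated = [[sum(p[i][j] for p in path_outputs) for j in range(cols)]
                  for i in range(rows)]
    # (e) derive a mask in (0, 1) from the aggregated output (assumed: sigmoid).
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in aggregated]

def apply_mask(frames, mask):
    # Element-wise masking of the noisy time-frequency magnitudes.
    return [[f * m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(frames, mask)]

# (a) toy input: 3 frames x 4 frequency bins of magnitude values (made up).
noisy = [[0.1, 0.9, 0.2, 0.0],
         [0.2, 1.0, 0.1, 0.1],
         [0.0, 0.8, 0.3, 0.0]]
mask = estimate_mask(noisy)
speech_estimate = apply_mask(noisy, mask)
```

In a real system the kernels would be learned from data and the masked transform would be inverted back to a waveform; the sketch only shows how parallel paths, aggregation, and masking fit together.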