Sound source direction estimation device, sound source direction estimation method, and program

    Publication No.: US11158334B2

    Publication date: 2021-10-26

    Application No.: US16982954

    Filing date: 2019-01-28

    Applicant: SONY CORPORATION

    Inventor: Atsuo Hiroe

    IPC classification: G10L25/51 H04R1/40

    Abstract: Sound source direction estimation for a plurality of sound sources can be performed with high accuracy when two microphones are used. For this purpose, an inter-microphone phase difference is calculated for every frequency band in a microphone pair consisting of two microphones installed a predetermined distance apart. Furthermore, for every frequency band in the microphone pair, a single sound source mask is calculated that indicates whether or not the component of that frequency band comes from a single sound source. The calculated inter-microphone phase difference and the calculated single sound source mask are then input as feature quantities to a multi-label classifier, which outputs a direction label associated with a sound source direction for those feature quantities.
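The per-band inter-microphone phase difference mentioned in the abstract can be illustrated with a minimal sketch. This is not the patented method: the naive DFT, the function names `dft` and `phase_difference`, and the toy signal (a single tone that reaches the second microphone two samples late) are all assumptions chosen for illustration, and the single sound source mask and the classifier are not modelled.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame (non-negative bins only)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]

def phase_difference(frame_a, frame_b):
    """Per-bin phase of microphone A minus phase of microphone B, in radians."""
    spec_a, spec_b = dft(frame_a), dft(frame_b)
    # The angle of X_a * conj(X_b) equals the phase difference between the two channels.
    return [cmath.phase(a * b.conjugate()) for a, b in zip(spec_a, spec_b)]

# Hypothetical toy input: a tone in bin 4 of a 32-sample frame,
# arriving at microphone B two samples later than at microphone A.
N, BIN, DELAY = 32, 4, 2
mic_a = [math.cos(2 * math.pi * BIN * t / N) for t in range(N)]
mic_b = [math.cos(2 * math.pi * BIN * (t - DELAY) / N) for t in range(N)]

diffs = phase_difference(mic_a, mic_b)
expected = 2 * math.pi * BIN * DELAY / N  # phase shift caused by a pure delay
```

For a pure delay, the phase difference in the occupied bin grows linearly with both the bin index and the delay, which is why it carries direction information. Bins that hold no signal energy give arbitrary values here, which hints at why a per-band reliability indicator is useful.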

    Voice segment detection for extraction of sound source

    Publication No.: US10475440B2

    Publication date: 2019-11-12

    Application No.: US14766246

    Filing date: 2013-12-20

    Applicant: SONY CORPORATION

    Inventor: Atsuo Hiroe

    Abstract: An apparatus and a method are provided for rapidly extracting a target sound from a sound signal in which a variety of sounds generated by a plurality of sound sources are mixed. The apparatus includes a tracking unit that detects a sound source direction and a voice segment and executes a sound source extraction process, and a voice recognition unit that takes the sound source extraction result as input and executes a voice recognition process. In the tracking unit, a segment-being-created management unit, which creates and manages a voice segment for each sound source, sequentially detects a sound source direction, sequentially updates the voice segment estimated by connecting the detection results in the time direction, creates an extraction filter for sound source extraction after a predetermined time has elapsed, and sequentially produces a sound source extraction result by applying the extraction filter to the input voice signal. The voice recognition unit sequentially executes the voice recognition process on each partial sound source extraction result and outputs a voice recognition result.
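The idea of connecting per-frame direction detections in the time direction to form one voice segment per sound source can be sketched as follows. This is an illustrative assumption, not the tracking unit described in the patent: the function name `track_segments`, the angular tolerance, the allowed gap, and the input data are hypothetical, and angle wrap-around is ignored.

```python
def track_segments(frames, tolerance=10.0, max_gap=2):
    """Group per-frame direction detections (degrees) into segments.

    frames: one list of detected directions per frame.
    A detection extends a segment if it lies within `tolerance` degrees of the
    segment's latest direction; a segment is closed once it has gone more than
    `max_gap` frames without a detection.
    """
    active, finished = [], []
    for t, directions in enumerate(frames):
        unmatched = list(directions)
        for seg in active:
            # Pick the closest remaining detection, if any.
            match = min(unmatched, key=lambda d: abs(d - seg["direction"]), default=None)
            if match is not None and abs(match - seg["direction"]) <= tolerance:
                seg["direction"] = match
                seg["end"] = t
                unmatched.remove(match)
        still_active = []
        for seg in active:
            (still_active if t - seg["end"] <= max_gap else finished).append(seg)
        active = still_active
        # Every detection that extended no segment starts a new one.
        for d in unmatched:
            active.append({"start": t, "end": t, "direction": d})
    return finished + active

# Hypothetical detections: one source near 30 degrees, a second near -60 degrees.
frames = [[30.0], [32.0], [31.0, -60.0], [-58.0], [], [], [], []]
segments = track_segments(frames)
spans = [(seg["start"], seg["end"]) for seg in segments]
```

With this input, `spans` holds one (start, end) frame pair per source. Because segments are updated frame by frame, downstream processing could begin before a segment is closed, which is the property the abstract emphasizes.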

    Sound signal processing device and sound signal processing method

    Publication No.: US10013998B2

    Publication date: 2018-07-03

    Application No.: US15118239

    Filing date: 2015-01-27

    Applicant: SONY CORPORATION

    Inventor: Atsuo Hiroe

    Abstract: A device and a method are provided for determining, with a high degree of accuracy, a speech segment from a sound signal in which different sounds coexist. Directional points indicating the direction of arrival of the sound signal are connected in the temporal direction to detect a speech segment. In this configuration, pattern classification is performed in accordance with directional characteristics with respect to the direction of arrival, and a directionality pattern and a null beam pattern are generated from the classification results. An average null beam pattern is also generated by averaging the null beam patterns obtained while a non-speech-like signal is input. Further, a threshold set slightly lower than the average null beam pattern is used to detect the local minimum point corresponding to the direction of arrival in each null beam pattern: a local minimum point equal to or lower than the threshold is determined to be the point corresponding to the direction of arrival.
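The thresholding step can be sketched in a few lines. The abstract only says the threshold is "slightly lower" than the average null beam pattern, so the fixed `margin`, the function name `detect_directions`, the per-direction list representation, and the sample values below are all assumptions; the end points of the pattern are skipped for simplicity.

```python
def detect_directions(null_pattern, average_pattern, margin):
    """Return indices of local minima that lie at or below the threshold.

    null_pattern:    null beam pattern of the current frame, one value per direction.
    average_pattern: average null beam pattern from non-speech-like input.
    margin:          how far below the average the threshold is placed (assumed constant).
    """
    threshold = [avg - margin for avg in average_pattern]
    hits = []
    for i in range(1, len(null_pattern) - 1):
        is_local_min = (
            null_pattern[i] < null_pattern[i - 1]
            and null_pattern[i] <= null_pattern[i + 1]
        )
        if is_local_min and null_pattern[i] <= threshold[i]:
            hits.append(i)
    return hits

# Hypothetical values: a deep null at index 2 and a shallow dip at index 5.
average = [0.0] * 8
pattern = [-0.1, -0.2, -3.0, -0.3, -0.1, -0.6, -0.2, -0.1]
found = detect_directions(pattern, average, margin=1.0)
```

Only the deep null passes, because the shallow dip at index 5 stays above the threshold. Deriving the threshold from the non-speech average lets it follow the background conditions instead of relying on a fixed constant.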

    Sound signal processing apparatus, sound signal processing method, and program

    Publication No.: US09357298B2

    Publication date: 2016-05-31

    Application No.: US14221598

    Filing date: 2014-03-21

    Applicant: Sony Corporation

    Inventor: Atsuo Hiroe

    Abstract: A sound signal processing apparatus includes an observed signal analysis unit and a sound source extraction unit. The observed signal analysis unit receives, as an observed signal, a multi-channel sound signal obtained by a sound signal input unit formed of microphones, and estimates the sound direction and sound segment of a target sound, that is, the sound to be extracted. The sound source extraction unit receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal of the target sound. The observed signal analysis unit includes a short-time Fourier transform unit, which generates an observed signal in the time-frequency domain by applying a short-time Fourier transform to the received multi-channel sound signal, and a direction/segment estimation unit, which receives the observed signal generated by the short-time Fourier transform unit and detects the sound direction and sound segment of the target sound.
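The conversion of one channel into a time-frequency observed signal by a short-time Fourier transform can be sketched as below. The frame length, hop size, Hann window, and the function name `stft` are assumptions made for illustration; the abstract does not specify them. A multi-channel input would be handled by applying the same function to each channel.

```python
import cmath
import math

def stft(signal, frame_len=16, hop=8):
    """Short-time Fourier transform of a real signal.

    Returns a list of frames; each frame is a list of complex values for
    bins 0 .. frame_len // 2.
    """
    # Periodic Hann window to reduce spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            sum(x * cmath.exp(-2j * math.pi * k * n / frame_len) for n, x in enumerate(seg))
            for k in range(frame_len // 2 + 1)
        ])
    return frames

# Hypothetical test tone: two cycles per 16-sample frame, 64 samples long.
signal = [math.sin(2 * math.pi * 2 * n / 16) for n in range(64)]
spec = stft(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: abs(spec[0][k]))
```

Each row of `spec` is one time frame and each column one frequency bin, which is the representation a direction/segment estimator would consume.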
