Patent search ap:("Samsung Electronics Co., Ltd.") AND inv:"Chou-Chang Yang"

1.

Invention publication
SPEECH DENOISING NETWORKS USING SPEECH AND NOISE MODELING (under examination, published)

Publication (announcement) number: US20240046946A1

Publication (announcement) date: 2024-02-08

Application number: US18058104

Application date: 2022-11-22

Applicant: Samsung Electronics Co., Ltd.

Inventor: Chou-Chang Yang, Ching-Hua Lee, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Yilin Shen, Hongxia Jin

IPC: G10L21/0232, G10L15/06, G10L15/02, G10L25/18

CPC classification number: G10L21/0232, G10L15/063, G10L15/02, G10L25/18, G10L2021/02166

Abstract: A method includes obtaining, using at least one processing device, noisy speech signals and extracting, using the at least one processing device, acoustic features from the noisy speech signals. The method also includes receiving, using the at least one processing device, a predicted speech mask from a speech mask prediction model based on a first acoustic feature subset and receiving, using the at least one processing device, a predicted noise mask from a noise mask prediction model based on a second acoustic feature subset. The method further includes providing, using the at least one processing device, predicted speech features determined using the predicted speech mask and predicted noise features determined using the predicted noise mask to a filtering mask prediction model. In addition, the method includes generating, using the at least one processing device, a clean speech signal using a predicted filtering mask output by the filtering mask prediction model.
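The data flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the three "models" are hypothetical stand-in functions (`toy_speech_mask`, `toy_noise_mask`, `toy_filtering_mask`) rather than trained networks, the features are assumed to be non-negative magnitudes for a single frame, and both feature subsets are assumed to be the full feature vector.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def apply_mask(mask: Sequence[float], features: Sequence[float]) -> Vector:
    """Element-wise product of a mask and a feature vector."""
    return [m * f for m, f in zip(mask, features)]


def denoise_frame(
    noisy: Sequence[float],
    speech_mask_model: Callable[[Sequence[float]], Vector],
    noise_mask_model: Callable[[Sequence[float]], Vector],
    filtering_mask_model: Callable[[Sequence[float], Sequence[float]], Vector],
) -> Vector:
    # Assumption: both acoustic feature subsets are the whole vector.
    speech_subset = list(noisy)
    noise_subset = list(noisy)
    # Two separate models predict a speech mask and a noise mask.
    speech_mask = speech_mask_model(speech_subset)
    noise_mask = noise_mask_model(noise_subset)
    # Masked features become the inputs of the filtering mask model.
    predicted_speech = apply_mask(speech_mask, speech_subset)
    predicted_noise = apply_mask(noise_mask, noise_subset)
    filtering_mask = filtering_mask_model(predicted_speech, predicted_noise)
    # The filtering mask is applied to the noisy input to get clean speech.
    return apply_mask(filtering_mask, noisy)


# Hypothetical stand-ins for the trained models (illustration only).
def toy_speech_mask(features: Sequence[float]) -> Vector:
    peak = max(features) or 1.0
    return [f / peak for f in features]


def toy_noise_mask(features: Sequence[float]) -> Vector:
    return [1.0 - m for m in toy_speech_mask(features)]


def toy_filtering_mask(speech: Sequence[float], noise: Sequence[float]) -> Vector:
    # Wiener-like ratio of estimated speech power to total power.
    eps = 1e-8
    return [s * s / (s * s + n * n + eps) for s, n in zip(speech, noise)]
```

Calling `denoise_frame(frame, toy_speech_mask, toy_noise_mask, toy_filtering_mask)` returns a vector of the same length as `frame`; in a real system each callable would be replaced by a trained network.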

2.

Invention publication
SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT (under examination, published)

Publication (announcement) number: US20240331715A1

Publication (announcement) date: 2024-10-03

Application number: US18457921

Application date: 2023-08-29

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ching-Hua Lee, Chou-Chang Yang, Yilin Shen, Hongxia Jin

IPC: G10L21/0224

CPC classification number: G10L21/0224, G10L2021/02166

Abstract: A method includes receiving, during a first time window, a set of noisy audio signals from a plurality of audio input devices. The method also includes generating a noisy time-frequency representation based on the set of noisy audio signals. The method further includes providing the noisy time-frequency representation as an input to a mask estimation model trained to output a mask used to predict a clean time-frequency representation of clean speech audio from the noisy time-frequency representation. The method also includes determining beamforming filter weights based on the mask. The method further includes applying the beamforming filter weights to the noisy time-frequency representation to isolate the clean speech audio from the set of noisy audio signals. In addition, the method includes outputting the clean speech audio.
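One simple way to turn a mask into beamforming weights is sketched below for a single frequency bin. The abstract does not say which beamformer is used, so this is an assumption: the sketch estimates a relative transfer function from mask-weighted cross-correlations with a reference channel and applies a matched-filter beamformer. The function name `mask_based_beamform` and its parameters are hypothetical, and the mask is taken as given rather than produced by a trained estimation model.

```python
from typing import List, Sequence, Tuple


def mask_based_beamform(
    frames: Sequence[Sequence[complex]],
    mask: Sequence[float],
    ref: int = 0,
) -> Tuple[List[complex], List[complex]]:
    """Beamform one frequency bin.

    frames[t][c] is the complex STFT value at time t, channel c.
    mask[t] is the speech mask value at time t (0 = noise, 1 = speech).
    """
    n_ch = len(frames[0])
    cross = [0j] * n_ch
    ref_power = 0.0
    # Mask-weighted cross-correlation of each channel with the reference.
    for x, m in zip(frames, mask):
        for c in range(n_ch):
            cross[c] += m * x[c] * x[ref].conjugate()
        ref_power += m * abs(x[ref]) ** 2
    ref_power = max(ref_power, 1e-12)
    # Relative transfer function estimate (close to 1 at the reference).
    rtf = [v / ref_power for v in cross]
    norm = max(sum(abs(d) ** 2 for d in rtf), 1e-12)
    # Matched-filter weights: unit gain toward the estimated source.
    weights = [d / norm for d in rtf]
    # Apply w^H x to every time frame.
    output = [
        sum(w.conjugate() * xc for w, xc in zip(weights, x)) for x in frames
    ]
    return weights, output
```

A full system would repeat this for every frequency bin and invert the STFT afterwards; covariance-based designs such as MVDR are a common alternative to the matched filter used here.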

3.

Invention grant
Speech denoising networks using speech and noise modeling (in force)

Publication (announcement) number: US12260874B2

Publication (announcement) date: 2025-03-25

Application number: US18058104

Application date: 2022-11-22

Applicant: Samsung Electronics Co., Ltd.

Inventor: Chou-Chang Yang, Ching-Hua Lee, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Yilin Shen, Hongxia Jin

IPC: G10L21/0232, G10L15/02, G10L15/06, G10L21/0216, G10L25/18

Abstract: A method includes obtaining, using at least one processing device, noisy speech signals and extracting, using the at least one processing device, acoustic features from the noisy speech signals. The method also includes receiving, using the at least one processing device, a predicted speech mask from a speech mask prediction model based on a first acoustic feature subset and receiving, using the at least one processing device, a predicted noise mask from a noise mask prediction model based on a second acoustic feature subset. The method further includes providing, using the at least one processing device, predicted speech features determined using the predicted speech mask and predicted noise features determined using the predicted noise mask to a filtering mask prediction model. In addition, the method includes generating, using the at least one processing device, a clean speech signal using a predicted filtering mask output by the filtering mask prediction model.

4.

Invention publication
SYSTEM AND METHOD FOR KEYWORD SPOTTING IN NOISY ENVIRONMENTS (under examination, published)

Publication (announcement) number: US20240339123A1

Publication (announcement) date: 2024-10-10

Application number: US18470788

Application date: 2023-09-20

Applicant: Samsung Electronics Co., Ltd.

Inventor: Chou-Chang Yang, Yashas Malur Saidutta, Rakshith Sharma Srinivasa, Ching-Hua Lee, Yilin Shen, Hongxia Jin

IPC: G10L21/0232, G10L15/06, G10L15/08, G10L25/18

CPC classification number: G10L21/0232, G10L15/063, G10L15/08, G10L25/18, G10L2015/088

Abstract: A method includes receiving an audio input and generating a noisy time-frequency representation based on the audio input. The method also includes providing the noisy time-frequency representation to a noise management model trained to predict a denoising mask and a signal presence probability (SPP) map indicating a likelihood of a presence of speech. The method further includes determining an enhanced spectrogram using the denoising mask and the noisy time-frequency representation. The method also includes providing the enhanced spectrogram and the SPP map as inputs to a keyword classification model trained to determine a likelihood of a keyword being present in the audio input. In addition, the method includes, responsive to determining that a keyword is in the audio input, transmitting the audio input to a downstream application associated with the keyword.
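The control flow of this abstract might look like the sketch below. Everything here is a hypothetical stand-in: `toy_noise_model` and `toy_classifier` are simple formulas, not trained models, the spectrogram is a nested list of non-negative magnitudes, and the 0.5 decision threshold is an assumption that the abstract does not state.

```python
from typing import Callable, List, Tuple

Spectrogram = List[List[float]]


def spot_keyword(
    audio: object,
    to_spectrogram: Callable[[object], Spectrogram],
    noise_model: Callable[[Spectrogram], Tuple[Spectrogram, Spectrogram]],
    classifier: Callable[[Spectrogram, Spectrogram], float],
    forward: Callable[[object], None],
    threshold: float = 0.5,  # assumed decision threshold
) -> bool:
    noisy = to_spectrogram(audio)
    # One model predicts both the denoising mask and the SPP map.
    mask, spp = noise_model(noisy)
    # Enhanced spectrogram = mask applied element-wise to the noisy input.
    enhanced = [
        [m * x for m, x in zip(mask_row, noisy_row)]
        for mask_row, noisy_row in zip(mask, noisy)
    ]
    # The classifier receives the enhanced spectrogram and the SPP map.
    score = classifier(enhanced, spp)
    if score >= threshold:
        forward(audio)  # hand the original audio to the downstream app
        return True
    return False


# Hypothetical stand-ins for the trained models (illustration only).
def toy_noise_model(spec: Spectrogram) -> Tuple[Spectrogram, Spectrogram]:
    spp = [[x / (x + 1.0) for x in row] for row in spec]
    return spp, spp  # reuse the SPP map as the mask for simplicity


def toy_classifier(enhanced: Spectrogram, spp: Spectrogram) -> float:
    weighted = sum(e * p for er, pr in zip(enhanced, spp) for e, p in zip(er, pr))
    total = sum(p for row in spp for p in row)
    avg = weighted / total if total > 0 else 0.0
    return avg / (1.0 + avg)  # squash to the range [0, 1)
```

Passing the SPP map alongside the enhanced spectrogram lets the classifier weight time-frequency cells by how likely they are to contain speech, which is the role the toy classifier imitates with a weighted average.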

5.

Invention publication
SYSTEM AND METHOD FOR KEYWORD FALSE ALARM REDUCTION (under examination, published)

Publication (announcement) number: US20240185850A1

Publication (announcement) date: 2024-06-06

Application number: US18352601

Application date: 2023-07-14

Applicant: Samsung Electronics Co., Ltd.

Inventor: Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Ching-Hua Lee, Chou-Chang Yang, Yilin Shen, Hongxia Jin

IPC: G10L15/22, G10L15/02, G10L15/06, G10L15/18, G10L25/78

CPC classification number: G10L15/22, G10L15/02, G10L15/063, G10L15/18, G10L25/78, G10L2015/088, G10L2015/223

Abstract: A method includes extracting, using a keyword detection model, audio features from audio data. The method also includes processing the audio features by a first layer of the keyword detection model configured to predict a first likelihood that the audio data includes speech. The method also includes processing the audio features by a second layer of the keyword detection model configured to predict a second likelihood that the audio data includes keyword-like speech. The method also includes processing the audio features by a third layer of the keyword detection model configured to predict a third likelihood, for each of a plurality of possible keywords, that the audio data includes the keyword. The method also includes identifying a keyword included in the audio data. The method also includes generating instructions to perform an action based at least in part on the identified keyword.
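The three likelihoods described in this abstract could be combined as in the sketch below. The abstract does not say how they are combined, so the gating rule is an assumption: a keyword is reported only when the speech, keyword-like, and per-keyword likelihoods all clear a threshold. The function `identify_keyword`, its logit inputs, and the 0.5 threshold are hypothetical.

```python
import math
from typing import Dict, Optional


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - peak) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}


def identify_keyword(
    speech_logit: float,
    keyword_like_logit: float,
    keyword_logits: Dict[str, float],
    threshold: float = 0.5,  # assumed decision threshold
) -> Optional[str]:
    # First likelihood: does the audio contain speech at all?
    if sigmoid(speech_logit) < threshold:
        return None
    # Second likelihood: does the speech sound keyword-like?
    if sigmoid(keyword_like_logit) < threshold:
        return None
    # Third likelihood: one probability per possible keyword.
    probs = softmax(keyword_logits)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None
```

Gating on the coarser likelihoods first is one way such a hierarchy can cut false alarms: non-speech and non-keyword speech are rejected before the per-keyword scores are consulted.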
