-
1.
Publication number: US11217229B2
Publication date: 2022-01-04
Application number: US16921537
Application date: 2020-07-06
Inventor: Yi Gao , Ji Meng Zheng , Meng Yu , Min Luo
Abstract: A speech recognition method, an apparatus, a computer device, and an electronic device for recognizing speech are provided. The method includes receiving an audio signal obtained by a microphone array; performing beamforming processing on the audio signal in a plurality of target directions to obtain a plurality of beam signals; performing speech recognition on each of the plurality of beam signals to obtain a plurality of speech recognition results corresponding to the plurality of beam signals; and determining a speech recognition result of the audio signal based on the plurality of speech recognition results of the plurality of beam signals.
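A minimal sketch of the multi-beam flow described in the abstract above: beamform the array signal toward several target directions, recognize each beam, and keep the result with the best score. The delay-and-sum beamformer, the confidence-based selection, and the `recognize_with_confidence` stub are illustrative assumptions, not the patent's specific algorithms.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s


def delay_and_sum(signals, mic_positions, direction_deg, fs):
    """Steer the array toward `direction_deg` (azimuth, degrees) by
    delay-compensating each channel in the frequency domain and summing.
    signals: (n_mics, n_samples); mic_positions: (n_mics, 2) xy in meters."""
    theta = np.deg2rad(direction_deg)
    steering = np.array([np.cos(theta), np.sin(theta)])
    delays = mic_positions @ steering / SPEED_OF_SOUND      # per-mic delay in seconds
    delays -= delays.min()                                  # keep delays non-negative
    n_mics, n_samples = signals.shape
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    phase = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft((spectra * phase).mean(axis=0), n=n_samples)


def recognize_with_confidence(beam_signal):
    """Hypothetical recognizer stub: returns (transcript, confidence)."""
    return "", float(np.mean(beam_signal ** 2))             # placeholder scoring only


def recognize_from_array(signals, mic_positions, target_directions_deg, fs=16000):
    results = []
    for direction in target_directions_deg:                 # one beam per target direction
        beam = delay_and_sum(signals, mic_positions, direction, fs)
        results.append(recognize_with_confidence(beam))
    # Final result of the audio signal: the per-beam result with the highest confidence.
    return max(results, key=lambda r: r[1])[0]
```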
-
2.
Publication number: US20200051549A1
Publication date: 2020-02-13
Application number: US16655548
Application date: 2019-10-17
Inventor: Lianwu Chen , Meng Yu , Min Luo , Dan Su
Abstract: Embodiments of the present invention provide a speech signal processing model training method, an electronic device, and a storage medium. The embodiments of the present invention determine a target training loss function based on a training loss function of each of one or more speech signal processing tasks; input a task input feature of each speech signal processing task into a starting multi-task neural network; and update model parameters of a shared layer and of each of one or more task layers of the starting multi-task neural network corresponding to the one or more speech signal processing tasks, with minimization of the target training loss function as the training objective, until the starting multi-task neural network converges, to obtain a speech signal processing model.
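A minimal sketch of the training setup described above: a shared layer feeds one output head ("task layer") per speech signal processing task, and the target training loss is formed from the per-task losses. The unweighted sum, the MSE task losses, the layer sizes, and the optimizer usage are illustrative assumptions; the patent does not fix these choices. Requires PyTorch.

```python
import torch
import torch.nn as nn


class MultiTaskSpeechModel(nn.Module):
    def __init__(self, feat_dim, hidden_dim, task_out_dims):
        super().__init__()
        # Shared layer followed by one task layer per speech signal processing task.
        self.shared = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
        self.task_layers = nn.ModuleList(
            [nn.Linear(hidden_dim, out_dim) for out_dim in task_out_dims]
        )

    def forward(self, task_inputs):
        # One input batch per task; each passes through the shared layer, then its own head.
        return [layer(self.shared(x)) for layer, x in zip(self.task_layers, task_inputs)]


def train_step(model, optimizer, task_inputs, task_targets):
    optimizer.zero_grad()
    outputs = model(task_inputs)
    # Target training loss: combination of the per-task training losses (sum here).
    task_losses = [nn.functional.mse_loss(out, tgt)
                   for out, tgt in zip(outputs, task_targets)]
    target_loss = torch.stack(task_losses).sum()
    target_loss.backward()      # gradients flow into shared and task-layer parameters jointly
    optimizer.step()
    return target_loss.item()
```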
-
3.
Publication number: US12057135B2
Publication date: 2024-08-06
Application number: US17227123
Application date: 2021-04-09
IPC: G10L21/0216 , G10L21/02 , G10L21/0208 , G10L21/0232 , G10L25/78 , G10L25/84
CPC classification number: G10L21/0216 , G10L21/0208 , G10L21/0232 , G10L25/78 , G10L25/84 , G10L21/02
Abstract: This application discloses a speech noise reduction method performed by a computing device. The method includes: obtaining a noisy speech signal, the noisy speech signal including a pure speech signal and a noise signal; estimating a posteriori signal-to-noise ratio and a priori signal-to-noise ratio of the noisy speech signal; determining a speech/noise likelihood ratio in the Bark domain based on the estimated posteriori signal-to-noise ratio and the estimated priori signal-to-noise ratio; estimating a priori speech existence probability based on the determined speech/noise likelihood ratio; determining a gain based on the estimated posteriori signal-to-noise ratio, the estimated priori signal-to-noise ratio, and the estimated priori speech existence probability, the gain being a frequency-domain transfer function used for converting the noisy speech signal into an estimate of the pure speech signal; and extracting the estimate of the pure speech signal from the noisy speech signal based on the gain.
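A single-frame sketch of the gain chain in the abstract above: estimate the posteriori and priori SNR, form a speech/noise likelihood ratio aggregated into Bark-like bands, convert it into a speech presence probability, and combine everything into a spectral gain applied to the noisy spectrum. The decision-directed priori-SNR estimate, the Gaussian likelihood ratio, the band-edge input, and the Wiener-style gain are standard textbook choices used for illustration; the patent's exact estimators may differ.

```python
import numpy as np


def denoise_frame(noisy_spec, noise_psd, prev_clean_psd, band_edges, alpha=0.98):
    """noisy_spec: complex STFT frame (n_bins,); noise_psd, prev_clean_psd: (n_bins,).
    band_edges: bin indices splitting the spectrum into Bark-like bands."""
    noisy_psd = np.abs(noisy_spec) ** 2

    # A posteriori SNR and decision-directed a priori SNR.
    post_snr = noisy_psd / np.maximum(noise_psd, 1e-12)
    prio_snr = (alpha * prev_clean_psd / np.maximum(noise_psd, 1e-12)
                + (1.0 - alpha) * np.maximum(post_snr - 1.0, 0.0))

    # Per-bin Gaussian speech/noise likelihood ratio (clipped to avoid overflow),
    # then averaged within each Bark-like band.
    v = np.minimum(post_snr * prio_snr / (1.0 + prio_snr), 50.0)
    lr_bin = np.exp(v) / (1.0 + prio_snr)
    lr_band = np.array([lr_bin[lo:hi].mean()
                        for lo, hi in zip(band_edges[:-1], band_edges[1:])])

    # A priori speech presence probability per band, broadcast back to the bins.
    presence = np.zeros_like(post_snr)
    for (lo, hi), lr in zip(zip(band_edges[:-1], band_edges[1:]), lr_band):
        presence[lo:hi] = lr / (1.0 + lr)

    # Wiener-style gain weighted by the speech presence probability; applying it
    # to the noisy spectrum yields the estimate of the pure speech spectrum.
    gain = presence * prio_snr / (1.0 + prio_snr)
    clean_spec = gain * noisy_spec
    return clean_spec, np.abs(clean_spec) ** 2   # estimate and its PSD (for the next frame)
```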
-
4.
Publication number: US11856376B2
Publication date: 2023-12-26
Application number: US17319024
Application date: 2021-05-12
Inventor: Jimeng Zheng , Yi Gao , Xuan Ji , Weiwei Li , Meng Yu , Kai Xia , Jun Feng , Zhu Chen , Hongyang Chen , Wenbin Yang , Yu Wang , Yong Liu
IPC: H04R3/00
CPC classification number: H04R3/005
Abstract: This application discloses a sound acquisition component array, including two first sound acquisition components, two second sound acquisition components, and two third sound acquisition components. The two second sound acquisition components are located at a first side of a line connecting the two first sound acquisition components, and the two third sound acquisition components are located at a second side of the connecting line that is opposite to the first side; the two second sound acquisition components are symmetrical about a perpendicular bisector of the connecting line, and the two third sound acquisition components are symmetrical about the same perpendicular bisector; and the distance between the two first sound acquisition components, the distance between the two second sound acquisition components, and the distance between the two third sound acquisition components differ from one another along the direction defined by the connecting line.
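A small geometric sketch of the six-component layout described above: the first pair lies on a reference line, the second and third pairs sit on opposite sides of that line, each pair is symmetric about the perpendicular bisector, and the three pair spacings differ. The concrete spacings (d1, d2, d3) and side offsets are illustrative values only.

```python
import numpy as np


def sound_component_array(d1=0.10, d2=0.06, d3=0.03, offset2=0.02, offset3=0.02):
    """Return (6, 2) xy coordinates in meters. The connecting line of the first
    pair is the x-axis; its perpendicular bisector is the y-axis."""
    first  = np.array([[-d1 / 2, 0.0],      [d1 / 2, 0.0]])        # on the connecting line
    second = np.array([[-d2 / 2, +offset2], [d2 / 2, +offset2]])   # first side of the line
    third  = np.array([[-d3 / 2, -offset3], [d3 / 2, -offset3]])   # opposite side
    return np.vstack([first, second, third])


coords = sound_component_array()
# The three pair spacings along the connecting-line direction are pairwise distinct.
spacings = [abs(coords[i + 1, 0] - coords[i, 0]) for i in (0, 2, 4)]
assert len(set(np.round(spacings, 6))) == 3
```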
-
5.
Publication number: US11450337B2
Publication date: 2022-09-20
Application number: US17023829
Application date: 2020-09-17
Inventor: Lianwu Chen , Meng Yu , Yanmin Qian , Dan Su , Dong Yu
IPC: G10L21/0272 , G06N3/04 , G06N3/08 , G10L25/30 , G10L25/51
Abstract: A multi-person speech separation method is provided for a terminal. The method includes extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2; extracting a masking coefficient of the hybrid speech feature by using a generative adversarial network (GAN) model, to obtain a masking matrix corresponding to the N human voices, wherein the GAN model comprises a generative network model and an adversarial network model; and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal by using the GAN model, and outputting N separated speech signals corresponding to the N human voices.
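A minimal sketch of the mask-and-separate step from the abstract above: a generator network produces one masking matrix per speaker, and each mask is applied to the mixture spectrogram to recover that speaker's waveform. The STFT parameters and the `generator_masks` stub (a trivial uniform split) stand in for the trained GAN generative network, which is an assumption here rather than the patent's model.

```python
import numpy as np
from scipy.signal import stft, istft


def generator_masks(mixture_mag, n_speakers):
    """Placeholder for the trained GAN generator: returns N masks that sum to
    one per time-frequency bin (here, a trivial uniform split)."""
    return np.full((n_speakers,) + mixture_mag.shape, 1.0 / n_speakers)


def separate(mixture, n_speakers=2, fs=16000, nperseg=512):
    _, _, spec = stft(mixture, fs=fs, nperseg=nperseg)   # complex (F, T) spectrogram
    masks = generator_masks(np.abs(spec), n_speakers)    # (N, F, T) masking matrix
    separated = []
    for mask in masks:
        _, sig = istft(mask * spec, fs=fs, nperseg=nperseg)
        separated.append(sig)                            # one waveform per separated voice
    return separated
```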
-
6.
Publication number: US20210375294A1
Publication date: 2021-12-02
Application number: US17401125
Application date: 2021-08-12
Inventor: Rongzhi Gu , Shixiong Zhang , Lianwu Chen , Yong Xu , Meng Yu , Dan Su , Dong Yu
IPC: G10L19/008 , G10L25/30 , G10L25/03
Abstract: This application relates to a method of extracting an inter-channel feature from a multi-channel multi-sound source mixed audio signal, performed at a computing device. The method includes: transforming one channel component of a multi-channel multi-sound source mixed audio signal into a single-channel multi-sound source mixed audio representation in a feature space; performing a two-dimensional dilated convolution on the multi-channel multi-sound source mixed audio signal to extract inter-channel features; performing a feature fusion on the single-channel multi-sound source mixed audio representation and the inter-channel features; estimating respective weights of sound sources in the single-channel multi-sound source mixed audio representation based on a fused multi-channel multi-sound source mixed audio feature; obtaining respective representations of the plurality of sound sources according to the single-channel multi-sound source mixed audio representation and the respective weights; and transforming the respective representations of the sound sources into respective audio signals of the plurality of sound sources.
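A minimal sketch of one inter-channel feature extraction step from the abstract above: a two-dimensional dilated convolution applied across the (channel, time) axes of the multi-channel waveform, implemented by zero-inflating the kernel and running an ordinary 2-D convolution. The random kernel stands in for a learned filter, and the kernel size and dilation rates are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d


def dilate_kernel(kernel, dilation):
    """Insert (dilation - 1) zeros between kernel taps along each axis."""
    kh, kw = kernel.shape
    dh, dw = dilation
    dilated = np.zeros(((kh - 1) * dh + 1, (kw - 1) * dw + 1))
    dilated[::dh, ::dw] = kernel
    return dilated


def inter_channel_features(multichannel_audio, kernel, dilation=(1, 4)):
    """multichannel_audio: (n_channels, n_samples). Returns a 2-D feature map
    that mixes information across channels and a dilated time context."""
    return convolve2d(multichannel_audio, dilate_kernel(kernel, dilation), mode="same")


# Example: 4-channel signal, 3x3 kernel (random here, learned in practice), time dilation 4.
rng = np.random.default_rng(0)
features = inter_channel_features(rng.standard_normal((4, 16000)),
                                  rng.standard_normal((3, 3)))
```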
-
7.
Publication number: US11908483B2
Publication date: 2024-02-20
Application number: US17401125
Application date: 2021-08-12
Inventor: Rongzhi Gu , Shixiong Zhang , Lianwu Chen , Yong Xu , Meng Yu , Dan Su , Dong Yu
IPC: G10L19/008 , G10L25/03 , G10L25/30 , H04S3/02 , H04S5/00
CPC classification number: G10L19/008 , G10L25/03 , G10L25/30 , H04S3/02 , H04S5/00
Abstract: This application relates to a method of extracting an inter-channel feature from a multi-channel multi-sound source mixed audio signal, performed at a computing device. The method includes: transforming one channel component of a multi-channel multi-sound source mixed audio signal into a single-channel multi-sound source mixed audio representation in a feature space; performing a two-dimensional dilated convolution on the multi-channel multi-sound source mixed audio signal to extract inter-channel features; performing a feature fusion on the single-channel multi-sound source mixed audio representation and the inter-channel features; estimating respective weights of sound sources in the single-channel multi-sound source mixed audio representation based on a fused multi-channel multi-sound source mixed audio feature; obtaining respective representations of the plurality of sound sources according to the single-channel multi-sound source mixed audio representation and the respective weights; and transforming the respective representations of the sound sources into respective audio signals of the plurality of sound sources.
-
8.
Publication number: US11158304B2
Publication date: 2021-10-26
Application number: US16655548
Application date: 2019-10-17
Inventor: Lianwu Chen , Meng Yu , Min Luo , Dan Su
Abstract: Embodiments of the present invention provide a speech signal processing model training method, an electronic device, and a storage medium. The embodiments of the present invention determine a target training loss function based on a training loss function of each of one or more speech signal processing tasks; input a task input feature of each speech signal processing task into a starting multi-task neural network; and update model parameters of a shared layer and of each of one or more task layers of the starting multi-task neural network corresponding to the one or more speech signal processing tasks, with minimization of the target training loss function as the training objective, until the starting multi-task neural network converges, to obtain a speech signal processing model.
-
9.
Publication number: US20210266664A1
Publication date: 2021-08-26
Application number: US17319024
Application date: 2021-05-12
Inventor: Jimeng Zheng , Yi Gao , Xuan Ji , Weiwei Li , Meng Yu , Kai Xia , Jun Feng , Zhu Chen , Hongyang Chen , Wenbin Yang , Yu Wang , Yong Liu
IPC: H04R3/00
Abstract: This application discloses a sound acquisition component array, including two first sound acquisition components, two second sound acquisition components, and two third sound acquisition components. The two second sound acquisition components are located at a first side of a line connecting the two first sound acquisition components, and the two third sound acquisition components are located at a second side of the connecting line that is opposite to the first side; the two second sound acquisition components are symmetrical about a perpendicular bisector of the connecting line, and the two third sound acquisition components are symmetrical about the same perpendicular bisector; and the distance between the two first sound acquisition components, the distance between the two second sound acquisition components, and the distance between the two third sound acquisition components differ from one another along the direction defined by the connecting line.
-
10.
Publication number: US12051441B2
Publication date: 2024-07-30
Application number: US17944067
Application date: 2022-09-13
Inventor: Jimeng Zheng , Lianwu Chen , Weiwei Li , Zhiyi Duan , Meng Yu , Dan Su , Kaiyu Jiang
CPC classification number: G10L25/84 , G06T7/20 , G10L17/02 , G10L17/22 , G10L21/028 , G10L25/21 , G06T2207/30201
Abstract: This application discloses a multi-sound-area-based speech detection method, a related apparatus, and a storage medium, applied in the field of artificial intelligence. The method includes: obtaining sound area information corresponding to N sound areas in which multiple users are speaking simultaneously; generating a control signal corresponding to each target detection sound area according to user information corresponding to that target detection sound area; processing multi-user speech input signals by using the control signals, to obtain a speech output signal corresponding to each target detection sound area; generating a speech detection result for each target detection sound area according to the speech output signal corresponding to that target detection sound area; and selecting, from among the multiple users, a main speaker based on the user information, the speech output signals, and the speech detection results of the multiple users in the N sound areas.
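A small sketch of the final selection step described above: given one speech output signal and one detection result per sound area, keep the areas where speech was detected and pick the main-speaker area. Using mean output energy as the selection criterion is an illustrative assumption; the patent bases the choice on the user information, the output signals, and the detection results without a single fixed metric being stated in the abstract.

```python
import numpy as np


def select_main_speaker(area_outputs, area_detections, area_user_info):
    """area_outputs: list of 1-D arrays (one speech output signal per sound area);
    area_detections: list of bools (speech detected in that area);
    area_user_info: list of per-area user metadata."""
    candidates = [
        (idx, float(np.mean(np.square(sig))))          # illustrative energy score
        for idx, (sig, detected) in enumerate(zip(area_outputs, area_detections))
        if detected
    ]
    if not candidates:
        return None, None                              # no area contains speech
    main_idx = max(candidates, key=lambda c: c[1])[0]
    return main_idx, area_user_info[main_idx]          # main-speaker area and its user info
```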