Patent search ap:("Beijing Youzhuju Network Technology Co., Ltd.") AND inv:"Linhao Dong"

1.

Granted invention patent
Method and device of generating acoustic features, speech model training, and speech recognition (in force)

Publication (announcement) number: US12067987B2

Publication (announcement) date: 2024-08-20

Application number: US18427538

Filing date: 2024-01-30

Applicant: Beijing Youzhuju Network Technology Co., Ltd.

Inventor: Linhao Dong, Zejun Ma

IPC: G10L15/02, G10L15/06, G10L15/22

CPC classification number: G10L15/22, G10L15/063

Abstract: The present disclosure discloses a method and device of generating acoustic features, speech model training, and speech recognition. By acquiring the acoustic information vector of the current speech frame and the information weight of the current speech frame, and according to the accumulated information weight corresponding to the previous speech frame, the retention rate corresponding to the current speech frame, and the information weight of the current speech frame, the accumulated information weight corresponding to the current speech frame can be obtained. The retention rate is the difference between 1 and a leakage rate.
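The abstract above names three inputs to the accumulated information weight (the previous frame's accumulated weight, the retention rate, and the current frame's information weight) but does not give the formula. The sketch below shows one plausible reading, a leaky running sum, and is not taken from the patent claims. The function name `accumulate_weights` and the update rule itself are assumptions made for illustration; only the relation retention rate = 1 - leakage rate comes from the abstract.

```python
def accumulate_weights(weights, leakage_rate):
    """Leaky running sum over per-frame information weights.

    Assumed update (not stated in the abstract):
        acc_t = retention_rate * acc_{t-1} + w_t
    """
    # Stated in the abstract: retention rate is 1 minus the leakage rate.
    retention_rate = 1.0 - leakage_rate

    accumulated = []
    previous = 0.0  # assumption: nothing has been accumulated before frame 0
    for w in weights:
        previous = retention_rate * previous + w
        accumulated.append(previous)
    return accumulated


if __name__ == "__main__":
    # With no leakage this reduces to a plain cumulative sum.
    print(accumulate_weights([0.5, 0.5, 0.5], leakage_rate=0.0))
    # With leakage, older frames contribute less to the running total.
    print(accumulate_weights([0.5, 0.5, 0.5], leakage_rate=0.5))
```

Under this assumed rule, a leakage rate of 0 recovers an ordinary cumulative sum, while a larger leakage rate discounts weight carried over from earlier frames.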

2.

Granted invention patent
Method, apparatus, device, and storage medium for speaker change point detection (in force)

Publication (announcement) number: US12039981B2

Publication (announcement) date: 2024-07-16

Application number: US18394143

Filing date: 2023-12-22

Applicant: Beijing Youzhuju Network Technology Co., Ltd.

Inventor: Linhao Dong, Zhiyun Fan, Zejun Ma

IPC: G10L21/0272, G10L17/04

CPC classification number: G10L17/04

Abstract: A method, apparatus, device, and storage medium for speaker change point detection, the method including: acquiring target voice data to be detected; and extracting an acoustic feature characterizing acoustic information of the target voice data from the target voice data; encoding the acoustic feature to obtain speaker characterization vectors at a voice frame level of the target voice data; integrating and firing the speaker characterization vectors at the voice frame level of the target voice data based on a continuous integrate-and-fire CIF mechanism, to obtain a sequence of speaker characterizations bounded by speaker change points in the target voice data; and determining a timestamp corresponding to the speaker change points, according to the sequence of the speaker characterizations bounded by the speaker change points in the target voice data.
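The integrate-and-fire step named in the abstract above can be illustrated with a simplified sketch of a continuous integrate-and-fire (CIF) loop. This is a generic rendering of the CIF idea under stated assumptions, not the patented implementation: the function name `cif_fire`, the firing threshold of 1.0, the `frame_shift_s` parameter, and the restriction of per-frame weights to the range 0 to 1 are all assumptions introduced here. How the encoder produces the frame-level vectors and weights is outside the sketch.

```python
def cif_fire(vectors, weights, threshold=1.0, frame_shift_s=0.01):
    """Integrate frame-level vectors by weight and fire at a threshold.

    Assumes each weight lies in [0, 1], so at most one firing per frame.
    Returns a list of (timestamp_in_seconds, integrated_vector) pairs.
    """
    if not vectors:
        return []

    dim = len(vectors[0])
    acc_weight = 0.0
    acc_vector = [0.0] * dim
    fired = []

    for t, (h, w) in enumerate(zip(vectors, weights)):
        if acc_weight + w < threshold:
            # Keep integrating: no boundary located at this frame.
            acc_weight += w
            acc_vector = [a + w * x for a, x in zip(acc_vector, h)]
        else:
            # Use only the part of the weight needed to reach the threshold.
            used = threshold - acc_weight
            integrated = [a + used * x for a, x in zip(acc_vector, h)]
            fired.append((t * frame_shift_s, integrated))
            # The remainder of this frame's weight starts the next segment.
            rest = w - used
            acc_weight = rest
            acc_vector = [rest * x for x in h]

    return fired


if __name__ == "__main__":
    frames = [[1.0], [1.0], [2.0], [2.0]]
    frame_weights = [0.5, 0.5, 0.5, 0.5]
    for timestamp, vector in cif_fire(frames, frame_weights, frame_shift_s=1.0):
        print(timestamp, vector)
```

In this sketch each firing yields one integrated vector together with the time of the frame where the threshold was reached, which mirrors the abstract's step of mapping fired speaker characterizations to timestamps. Any weight accumulated after the last firing is discarded here, which is a simplification.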
