Patent search ap:("Electronics AND Telecommunications Research Institute") AND inv:"Kiyoung PARK" Page 1

1.

Invention publication
MULTI-MODAL VOICE RECOGNITION SYSTEM AND METHOD FOR CONVERSATION SUMMARIZATION Under examination (published)

Publication (announcement) number: US20240203398A1

Publication (announcement) date: 2024-06-20

Application number: US18540594

Application date: 2023-12-14

Applicant: Electronics and Telecommunications Research Institute

Inventor: Jeom Ja KANG, Kiyoung PARK, Hwajeon SONG

IPC: G10L15/02, G06N3/02, G06V40/20, G10L15/26

CPC classification number: G10L15/02, G06N3/02, G06V40/28, G10L15/26

Abstract: Disclosed herein is a voice recognition system with an enhanced summarization function according to the present invention. The voice recognition system includes: an audio feature extractor configured to extract a voice feature from an audio signal to generate a feature vector; a salience extractor configured to extract an importance of speech from at least one of the audio signal or a video signal to generate an importance vector; and a neural network configured to output a recognition result based on the feature vector and the importance vector, in which the recognition result is output by masking some.
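
Illustrative sketch (not part of the patent record): the three components named in the abstract can be pictured as a small pipeline. Everything below is an assumption made for illustration only. The function names (`extract_features`, `extract_importance`, `recognize`), the energy-based features, the energy-based salience score, and the threshold are hypothetical, and a trivial stand-in replaces the neural network. The abstract is cut off before it states what is masked, so the sketch assumes, purely as an example, that frames with low importance are masked in the output.

```python
from typing import List


def extract_features(audio: List[float], frame_len: int = 4) -> List[List[float]]:
    """Toy audio feature extractor: one [mean energy, mean |amplitude|] vector per frame."""
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len + 1, frame_len)]
    return [[sum(x * x for x in f) / frame_len,
             sum(abs(x) for x in f) / frame_len] for f in frames]


def extract_importance(audio: List[float], frame_len: int = 4) -> List[float]:
    """Toy salience extractor: per-frame energy scaled to [0, 1] by the peak energy.

    Assumption: loudness stands in for importance. The abstract also allows a
    video signal as input, which this sketch does not model.
    """
    energies = [f[0] for f in extract_features(audio, frame_len)]
    peak = max(energies, default=0.0) or 1.0  # avoid dividing by zero on silence
    return [e / peak for e in energies]


def recognize(features: List[List[float]], importance: List[float],
              threshold: float = 0.5) -> List[str]:
    """Stand-in for the neural network: one placeholder token per frame,
    with frames below the (hypothetical) importance threshold masked."""
    return [f"tok{i}" if w >= threshold else "<mask>"
            for i, (_, w) in enumerate(zip(features, importance))]


# Three frames: quiet, loud, quiet.
audio = [0.0, 0.1, 0.0, -0.1,
         1.0, -1.0, 1.0, -1.0,
         0.1, 0.1, 0.1, 0.1]
features = extract_features(audio)
importance = extract_importance(audio)
result = recognize(features, importance)
print(result)
```

In this sketch only the loud middle frame survives, which mirrors the summarization idea of keeping the salient portion of a conversation; a real system would learn both the features and the importance scores rather than derive them from frame energy.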
