Patent search ap:("Google LLC") AND inv:"Basillo Garcia Castillo" Page 1

1.

Patent Application
Rescoring Automatic Speech Recognition Hypotheses Using Audio-Visual Matching (Status: Granted)

Publication No.: US20220392439A1

Publication Date: 2022-12-08

Application No.: US17755972

Filing Date: 2019-11-18

Applicant: Google LLC

Inventor: Olivier Siohan, Takaki Makino, Richard Rose, Otavio Braga, Hank Liao, Basillo Garcia Castillo

IPC: G10L15/08, G10L13/02, G10L15/25, G06V20/40, G06V40/16, G10L15/06, G06V10/774, G10L15/22, G10L15/30, G10L25/57

Abstract: A method (400) includes receiving audio data (112) corresponding to an utterance (101) spoken by a user (10), receiving video data (114) representing motion of lips of the user while the user was speaking the utterance, and obtaining multiple candidate transcriptions (135) for the utterance based on the audio data. For each candidate transcription of the multiple candidate transcriptions, the method also includes generating a synthesized speech representation (145) of the corresponding candidate transcription and determining an agreement score (155) indicating a likelihood that the synthesized speech representation matches the motion of the lips of the user while the user speaks the utterance. The method also includes selecting one of the multiple candidate transcriptions for the utterance as a speech recognition output (175) based on the agreement scores determined for the multiple candidate transcriptions for the utterance.
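Illustrative sketch: The selection step described in the abstract can be outlined roughly as follows. This is a minimal sketch, not the patented implementation. The names `select_transcription`, `toy_synthesize`, and `toy_agreement` are hypothetical, the speech synthesizer and the agreement scorer are replaced by toy stand-ins that only compare word counts against a count of observed lip movements, and picking the highest-scoring candidate is an assumption (the abstract says only that selection is "based on" the agreement scores).

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Synth = TypeVar("Synth")  # synthesized speech representation (assumed opaque)
Video = TypeVar("Video")  # lip-motion video representation (assumed opaque)


def select_transcription(
    candidates: Sequence[str],
    video: Video,
    synthesize: Callable[[str], Synth],
    agreement: Callable[[Synth, Video], float],
) -> Tuple[str, List[float]]:
    """Score each candidate against the lip video and return the best one.

    Assumption: "best" means the highest agreement score.
    """
    if not candidates:
        raise ValueError("need at least one candidate transcription")
    # One synthesized representation and one agreement score per candidate.
    scores = [agreement(synthesize(text), video) for text in candidates]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores


# --- Toy stand-ins (hypothetical; real systems would use TTS and a
# --- trained audio-visual model here) ---
def toy_synthesize(text: str) -> int:
    # Pretend the "synthesized speech" is just the number of words.
    return len(text.split())


def toy_agreement(synth_units: int, lip_movements: int) -> float:
    # Higher when the word count matches the observed lip-movement count.
    return 1.0 / (1.0 + abs(synth_units - lip_movements))


best, scores = select_transcription(
    ["I scream", "ice cream cone"],
    video=3,  # toy "video": three distinct lip movements observed
    synthesize=toy_synthesize,
    agreement=toy_agreement,
)
print(best, scores)
```

Passing the synthesizer and scorer in as callables keeps the selection logic independent of any particular model, which seems to fit the abstract's separation of candidate generation, synthesis, scoring, and selection.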
