Patent search: ap:("Google LLC") AND inv:"Basilio Garcia Castillo"

1.

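Granted invention patent
Emitting word timings with end-to-end models (in force)

Publication (announcement) number: US12027154B2

Publication (announcement) date: 2024-07-02

Application number: US18167050

Filing date: 2023-02-09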

Applicant: Google LLC

Inventor: Tara N. Sainath, Basilio Garcia Castillo, David Rybach, Trevor Strohman, Ruoming Pang

IPC: G10L25/30, G10L15/06, G10L25/78

CPC classification number: G10L15/063, G10L25/30, G10L25/78

Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word, identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
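
The per-word steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the placeholder string "<wb>", the fixed-length split_word_pieces tokenizer, the WordTiming record, and the frame-index representation of alignments are all assumptions made for illustration. The sketch stops at producing the constrained alignments and does not model the second pass decoder or its attention head.

```python
from dataclasses import dataclass
from typing import List, Tuple

PLACEHOLDER = "<wb>"  # hypothetical placeholder symbol, not taken from the patent


@dataclass
class WordTiming:
    """Ground truth alignment for one word, in audio frame indices (assumed unit)."""
    word: str
    start_frame: int
    end_frame: int


def split_word_pieces(word: str, max_len: int = 3) -> List[str]:
    """Toy stand-in for a real word piece model: fixed-length chunks."""
    return [word[i:i + max_len] for i in range(0, len(word), max_len)]


def build_constrained_alignments(
    timings: List[WordTiming],
) -> Tuple[List[str], List[Tuple[int, int, str]]]:
    """Return the token sequence and a list of (token_index, frame, kind) constraints.

    For each word, a placeholder is inserted before it, the beginning and
    ending word pieces are located, and each is tied to the matching ground
    truth frame. A single-piece word gets both constraints on the same token.
    """
    tokens: List[str] = []
    constraints: List[Tuple[int, int, str]] = []
    for timing in timings:
        tokens.append(PLACEHOLDER)
        pieces = split_word_pieces(timing.word)
        first_idx = len(tokens)          # index of the beginning word piece
        tokens.extend(pieces)
        last_idx = len(tokens) - 1       # index of the ending word piece
        constraints.append((first_idx, timing.start_frame, "begin"))
        constraints.append((last_idx, timing.end_frame, "end"))
    return tokens, constraints


if __name__ == "__main__":
    example = [WordTiming("hello", 3, 10), WordTiming("hi", 12, 15)]
    toks, cons = build_constrained_alignments(example)
    print(toks)
    print(cons)
```

In a training setup, constraints of this kind would presumably be turned into a target for one attention head, but how that is done is not stated in the abstract and is left out here.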

2.

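Granted invention patent
Privacy-aware meeting room transcription from audio-visual stream (in force)

Publication (announcement) number: US12118123B2

Publication (announcement) date: 2024-10-15

Application number: US17755892

Filing date: 2019-11-18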

Applicant: Google LLC

Inventor: Oliver Siohan, Takaki Makino, Richard Rose, Otavio Braga, Hank Liao, Basilio Garcia Castillo

IPC: G06F21/62, G10L17/02, H04L12/18

CPC classification number: G06F21/6254, G10L17/02, H04L12/1831

Abstract: A method for a privacy-aware transcription includes receiving an audio-visual signal including audio data and image data for a speech environment and a privacy request from a participant in the speech environment, where the privacy request indicates a privacy condition of the participant. The method further includes segmenting the audio data into a plurality of segments. For each segment, the method includes determining an identity of a speaker of a corresponding segment of the audio data based on the image data and determining whether the identity of the speaker of the corresponding segment includes the participant associated with the privacy condition. When the identity of the speaker of the corresponding segment includes the participant, the method includes applying the privacy condition to the corresponding segment. The method also includes processing the plurality of segments of the audio data to determine a transcript for the audio data.
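
The segment-level flow described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: the Segment record, the condition names "omit" and "anonymize", and the "[anonymous]" label are hypothetical, and the speaker identity of each segment is assumed to have been determined already by an upstream step that uses the image data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One audio segment after segmentation.

    speaker_id is assumed to come from an upstream image-based speaker
    identification step; text stands in for the recognized speech.
    """
    speaker_id: str
    text: str


def build_transcript(
    segments: List[Segment],
    privacy_requests: Dict[str, str],
) -> List[str]:
    """Apply each participant's privacy condition, then assemble the transcript.

    privacy_requests maps a participant id to a hypothetical condition:
    "omit" drops that participant's segments, "anonymize" keeps the text
    but hides the speaker label.
    """
    lines: List[str] = []
    for segment in segments:
        condition = privacy_requests.get(segment.speaker_id)
        if condition == "omit":
            continue  # segment is excluded from the transcript
        label = "[anonymous]" if condition == "anonymize" else segment.speaker_id
        lines.append(f"{label}: {segment.text}")
    return lines


if __name__ == "__main__":
    meeting = [
        Segment("alice", "hi"),
        Segment("bob", "secret"),
        Segment("carol", "ok"),
    ]
    print(build_transcript(meeting, {"bob": "omit", "carol": "anonymize"}))
```

Segments from speakers with no privacy request pass through unchanged, which mirrors the abstract's condition that the privacy condition is applied only when the identified speaker includes the requesting participant.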

3.

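Published invention application
PRIVACY-AWARE MEETING ROOM TRANSCRIPTION FROM AUDIO-VISUAL STREAM (under examination, published)

Publication (announcement) number: US20240104247A1

Publication (announcement) date: 2024-03-28

Application number: US18535214

Filing date: 2023-12-11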

Applicant: Google LLC

Inventor: Oliver Siohan, Takaki Makino, Richard Rose, Otavio Braga, Hank Liao, Basilio Garcia Castillo

IPC: G06F21/62, G10L17/02, H04L12/18

CPC classification number: G06F21/6254, G10L17/02, H04L12/1831

Abstract: A method for a privacy-aware transcription includes receiving an audio-visual signal including audio data and image data for a speech environment and a privacy request from a participant in the speech environment, where the privacy request indicates a privacy condition of the participant. The method further includes segmenting the audio data into a plurality of segments. For each segment, the method includes determining an identity of a speaker of a corresponding segment of the audio data based on the image data and determining whether the identity of the speaker of the corresponding segment includes the participant associated with the privacy condition. When the identity of the speaker of the corresponding segment includes the participant, the method includes applying the privacy condition to the corresponding segment. The method also includes processing the plurality of segments of the audio data to determine a transcript for the audio data.

4.

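Published invention application
Emitting Word Timings with End-to-End Models (under examination, published)

Publication (announcement) number: US20240321263A1

Publication (announcement) date: 2024-09-26

Application number: US18680797

Filing date: 2024-05-31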

Applicant: Google LLC

Inventor: Tara N. Sainath, Basilio Garcia Castillo, David Rybach, Trevor Strohman, Ruoming Pang

IPC: G10L15/06, G10L25/30, G10L25/78

CPC classification number: G10L15/063, G10L25/30, G10L25/78

Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word, identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.

5.

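Published invention application
Emitting Word Timings with End-to-End Models (under examination, published)

Publication (announcement) number: US20230206907A1

Publication (announcement) date: 2023-06-29

Application number: US18167050

Filing date: 2023-02-09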

Applicant: Google LLC

Inventor: Tara N Sainath, Basilio Garcia Castillo, David Rybach, Trevor Strohman, Ruoming Pang

IPC: G10L15/06, G10L25/30, G10L25/78

CPC classification number: G10L15/063, G10L25/30, G10L25/78

Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word, identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
