Invention Publication
- Patent Title: Speaker Verification with Multitask Speech Models
- Application No.: US18167815
- Application Date: 2023-02-10
- Publication No.: US20230260521A1
- Publication Date: 2023-08-17
- Inventor: Alanna Foster Slocum, Yiling Huang, Shelly Bensal, Quan Wang
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Main IPC: G10L17/18
- IPC: G10L17/18 ; G10L17/04 ; G10L17/06

Abstract:
A method includes obtaining a speaker identification (SID) model trained to predict speaker embeddings from utterances spoken by different speakers, the SID model including a trained audio encoder and a trained SID head. The method also includes receiving a plurality of synthetic speech detection (SSD) training utterances that include a set of human-originated speech samples and a set of synthetic speech samples. The method also includes training, using the trained audio encoder, an SSD head on the SSD training utterances to learn to detect the presence of synthetic speech in audio encodings encoded by the trained audio encoder. The method also includes providing, for execution on a computing device, a multitask neural network model for performing both SID tasks and SSD tasks on input audio data in parallel.
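The flow in the abstract can be illustrated with a toy sketch. It is not the patented implementation. All names, dimensions, and data here are hypothetical. The "trained" encoder and SID head are random fixed weights standing in for a real pretrained model. The SSD head is assumed to be a simple logistic-regression layer, and the encoder is assumed to stay frozen while the SSD head is trained. In this sketch, "in parallel" is modeled only as both heads reading the same encoding from a single encoder pass.

```python
import math
import random

random.seed(0)
DIM_IN, DIM_ENC, DIM_EMB = 8, 4, 3  # hypothetical sizes


def make_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


# Stand-ins for the already-trained audio encoder and SID head (never updated below).
ENCODER_W = make_matrix(DIM_ENC, DIM_IN)
SID_W = make_matrix(DIM_EMB, DIM_ENC)


def encode(features):
    """Frozen audio encoder: input features -> audio encoding."""
    return [math.tanh(h) for h in matvec(ENCODER_W, features)]


def sid_head(encoding):
    """Frozen SID head: audio encoding -> speaker embedding."""
    return matvec(SID_W, encoding)


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_ssd_head(samples, labels, epochs=50, lr=0.5):
    """Fit the SSD head on frozen encodings; only the head's weights change."""
    encodings = [encode(x) for x in samples]
    w, b = [0.0] * DIM_ENC, 0.0
    for _ in range(epochs):
        for enc, y in zip(encodings, labels):
            p = sigmoid(sum(wi * ei for wi, ei in zip(w, enc)) + b)
            grad = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * grad * ei for wi, ei in zip(w, enc)]
            b -= lr * grad
    return w, b


def multitask_forward(features, ssd_w, ssd_b):
    """One encoder pass feeds both heads: (speaker embedding, P(synthetic))."""
    enc = encode(features)
    embedding = sid_head(enc)
    p_synthetic = sigmoid(sum(wi * ei for wi, ei in zip(ssd_w, enc)) + ssd_b)
    return embedding, p_synthetic


# Toy SSD training set: label 0 = human-originated, label 1 = synthetic.
human = [[random.gauss(0.5, 0.3) for _ in range(DIM_IN)] for _ in range(4)]
synthetic = [[random.gauss(-0.5, 0.3) for _ in range(DIM_IN)] for _ in range(4)]
samples, labels = human + synthetic, [0] * 4 + [1] * 4

ssd_w, ssd_b = train_ssd_head(samples, labels)
embedding, p_synthetic = multitask_forward(samples[0], ssd_w, ssd_b)
```

Because the encoder is shared, adding the SSD task costs only one extra small head at inference time. A real system would replace the toy feature vectors with audio features and the random weights with a trained model.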