Publication No.: US12272371B1
Publication Date: 2025-04-08
Application No.: US17364805
Filing Date: 2021-06-30
Applicant: Amazon Technologies, Inc.
Inventor: Ritwik Giri, Shrikant Venkataramani, Jean-Marc Valin, Mehmet Umut Isik, Arvindh Krishnaswamy
IPC: G06F17/00, G06N20/00, G10L21/013, G10L21/0364, G10L21/038
Abstract: Real-time audio enhancement for a target speaker may be performed. An embedding of a sample of speaker audio is created using a trained neural network that performs voice identification. The embedding is then concatenated with the input features of a trained machine learning model for audio enhancement. Because the embedding is in the same feature space as the audio enhancement model, the model can recognize and enhance the target speaker's speech in a real-time implementation.
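The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function name `condition_frames`, the feature and embedding dimensions, and all numeric values are hypothetical, and the embedding is taken as already computed rather than produced by a voice identification network.

```python
from typing import List


def condition_frames(
    frames: List[List[float]], speaker_embedding: List[float]
) -> List[List[float]]:
    """Append the same speaker embedding to every frame's feature vector.

    Hypothetical helper: each output row is the frame's own features
    followed by the fixed embedding of the target speaker.
    """
    return [frame + speaker_embedding for frame in frames]


# Assumed toy input: 3 audio frames, 4 spectral features each.
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

# Assumed 2-dimensional embedding of the target speaker's enrollment sample.
speaker_embedding = [0.25, -0.75]

# Each conditioned frame now has 4 + 2 = 6 values and could be fed,
# frame by frame, to an enhancement model.
conditioned = condition_frames(frames, speaker_embedding)
```

Since the embedding is computed once from the enrollment sample and reused for every frame, the per-frame cost in this sketch is a single list concatenation, which is compatible with frame-by-frame processing.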