Patent search: ap:("Sanas.ai Inc.") AND inv:"Maxim Serebryakov" (page 1)

1.

Granted invention patent
Methods for neural network-based voice enhancement and systems thereof (status: in force)

Publication number: US12125496B1

Publication date: 2024-10-22

Application number: US18644959

Filing date: 2024-04-24

Applicant: Sanas.ai Inc.

Inventors: Shawn Zhang, Lukas Pfeifenberger, Jason Wu, Piotr Dura, David Braude, Bajibabu Bollepalli, Alvaro Escudero, Gokce Keskin, Ankita Jha, Maxim Serebryakov

IPC classification: G10L15/00, G10L15/02, G10L15/06, G10L21/0232, G10L25/30, G10L15/16, G10L15/22

CPC classification: G10L21/0232, G10L15/02, G10L15/063, G10L25/30, G10L15/16, G10L15/22

Abstract: The disclosed technology relates to methods, voice enhancement systems, and non-transitory computer readable media for real-time voice enhancement. In some examples, input audio data including foreground speech content, non-content elements, and speech characteristics is fragmented into input speech frames. The input speech frames are converted to low-dimensional representations of the input speech frames. One or more of the fragmentation or the conversion is based on an application of a first trained neural network to the input audio data. The low-dimensional representations of the input speech frames omit one or more of the non-content elements. A second trained neural network is applied to the low-dimensional representations of the input speech frames to generate target speech frames. The target speech frames are combined to generate output audio data. The output audio data further includes one or more portions of the foreground speech content and one or more of the speech characteristics.
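
The data flow in this abstract (fragment into frames, reduce each frame to a low-dimensional representation, generate target frames, combine them) can be illustrated with a minimal sketch. Everything named here is hypothetical: `toy_encoder` and `toy_decoder` are simple pooling and expansion stand-ins for the two trained neural networks, which the abstract does not specify, and the frame length and pooling factor are arbitrary.

```python
from typing import Callable, List, Sequence

Frame = List[float]


def split_into_frames(samples: Sequence[float], frame_len: int) -> List[Frame]:
    """Fragment audio into fixed-length frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame += [0.0] * (frame_len - len(frame))
        frames.append(frame)
    return frames


def toy_encoder(frame: Frame, factor: int = 4) -> List[float]:
    """Stand-in for the first trained network (assumption): average-pool
    the frame into a vector that is `factor` times shorter."""
    return [sum(frame[i:i + factor]) / factor for i in range(0, len(frame), factor)]


def toy_decoder(code: List[float], factor: int = 4) -> Frame:
    """Stand-in for the second trained network (assumption): expand each
    value of the low-dimensional vector back into `factor` samples."""
    return [value for value in code for _ in range(factor)]


def enhance(
    samples: Sequence[float],
    frame_len: int = 8,
    encoder: Callable[[Frame], List[float]] = toy_encoder,
    decoder: Callable[[List[float]], Frame] = toy_decoder,
) -> List[float]:
    """Frame the input, encode, generate target frames, and combine them."""
    frames = split_into_frames(samples, frame_len)
    codes = [encoder(frame) for frame in frames]      # low-dimensional representations
    targets = [decoder(code) for code in codes]       # target speech frames
    output = [sample for frame in targets for sample in frame]
    return output[:len(samples)]                      # drop the zero padding
```

Calling `enhance(samples)` on a list of floats returns a list of the same length. In a real system the two stand-ins would be replaced by trained models, and the combination step might overlap frames rather than concatenate them; the abstract leaves both choices open.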

2.

Granted invention patent
Real-time accent conversion model (status: in force)

Publication number: US11948550B2

Publication date: 2024-04-02

Application number: US17460145

Filing date: 2021-08-27

Applicant: Sanas.ai Inc.

Inventors: Maxim Serebryakov, Shawn Zhang

IPC classification: G10L13/02, G06N20/20, G10L15/02, G10L25/27

CPC classification: G10L13/02, G06N20/20, G10L15/02, G10L25/27, G10L2015/022

Abstract: Techniques for real-time accent conversion are described herein. An example computing device receives an indication of a first accent and a second accent. The computing device further receives, via at least one microphone, speech content having the first accent. The computing device is configured to derive, using a first machine-learning algorithm trained with audio data including the first accent, a linguistic representation of the received speech content having the first accent. The computing device is configured to, based on the derived linguistic representation of the received speech content having the first accent, synthesize, using a second machine-learning algorithm trained with (i) audio data comprising the first accent and (ii) audio data including the second accent, audio data representative of the received speech content having the second accent. The computing device is configured to convert the synthesized audio data into a synthesized version of the received speech content having the second accent.
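
The three stages named in this abstract (derive a linguistic representation, synthesize audio data in the second accent, convert that data into output speech) can be wired together as in the sketch below. All names are hypothetical, and the three callables are placeholders for the trained models and the final conversion step, none of which the abstract specifies.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class AccentConverter:
    """Hypothetical wiring of the three stages described in the abstract."""
    source_accent: str
    target_accent: str
    # Stage 1 (assumption): speech samples -> linguistic units.
    derive_linguistic: Callable[[Sequence[float]], List[str]]
    # Stage 2 (assumption): linguistic units + target accent -> audio features.
    synthesize: Callable[[List[str], str], List[List[float]]]
    # Stage 3 (assumption): audio features -> output waveform.
    render: Callable[[List[List[float]]], List[float]]

    def convert(self, speech: Sequence[float]) -> List[float]:
        units = self.derive_linguistic(speech)
        features = self.synthesize(units, self.target_accent)
        return self.render(features)


# Toy stand-ins so the sketch runs end to end; they are not real models.
def toy_derive(speech: Sequence[float]) -> List[str]:
    return ["hi" if sample > 0 else "lo" for sample in speech]


def toy_synthesize(units: List[str], accent: str) -> List[List[float]]:
    gain = 2.0 if accent == "accent_b" else 1.0
    return [[gain] if unit == "hi" else [-gain] for unit in units]


def toy_render(features: List[List[float]]) -> List[float]:
    return [value for frame in features for value in frame]


converter = AccentConverter("accent_a", "accent_b", toy_derive, toy_synthesize, toy_render)
converted = converter.convert([0.3, -0.1, 0.7])
```

Keeping the stages as separate callables mirrors the abstract's separation between the first and second machine-learning algorithms, so either one could be swapped without touching the other.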

3.

Published invention application
Real-Time Accent Conversion Model (status: in force)

Publication number: US20220358903A1

Publication date: 2022-11-10

Application number: US17460145

Filing date: 2021-08-27

Applicant: Sanas.ai Inc.

Inventors: Maxim Serebryakov, Shawn Zhang

IPC classification: G10L13/02, G10L15/02, G10L25/27, G06N20/20

Abstract: Techniques for real-time accent conversion are described herein. An example computing device receives an indication of a first accent and a second accent. The computing device further receives, via at least one microphone, speech content having the first accent. The computing device is configured to derive, using a first machine-learning algorithm trained with audio data including the first accent, a linguistic representation of the received speech content having the first accent. The computing device is configured to, based on the derived linguistic representation of the received speech content having the first accent, synthesize, using a second machine-learning algorithm trained with (i) audio data comprising the first accent and (ii) audio data including the second accent, audio data representative of the received speech content having the second accent. The computing device is configured to convert the synthesized audio data into a synthesized version of the received speech content having the second accent.
