- Patent title: SYSTEM AND METHOD FOR AUTOMATIC ALIGNMENT OF PHONETIC CONTENT FOR REAL-TIME ACCENT CONVERSION
- Application number: US18754280
- Filing date: 2024-06-26
- Publication number: US20240347070A1
- Publication date: 2024-10-17
- Inventors: Lukas PFEIFENBERGER, Shawn Zhang
- Applicant: Sanas.ai Inc.
- Applicant address: US CA Palo Alto
- Patentee: Sanas.ai Inc.
- Current patentee: Sanas.ai Inc.
- Current patentee address: US CA Palo Alto
- Main classification: G10L21/007
- IPC classifications: G10L21/007; G06F3/16; G10L15/02; G10L15/06; G10L15/16

Abstract:
The disclosed technology relates to methods, accent conversion systems, and non-transitory computer readable media for real-time accent conversion. In some examples, a set of phonetic embedding vectors is obtained for phonetic content representing a source accent and obtained from input audio data. A trained machine learning model is applied to the set of phonetic embedding vectors to generate a set of transformed phonetic embedding vectors corresponding to phonetic characteristics of speech data in a target accent. An alignment is determined by maximizing a cosine distance between the set of phonetic embedding vectors and the set of transformed phonetic embedding vectors. The speech data is then aligned to the phonetic content based on the determined alignment to generate output audio data representing the target accent. The disclosed technology transforms phonetic characteristics of a source accent to match the target accent more closely for efficient and seamless accent conversion in real-time applications.
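
The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how the alignment is searched, so the per-frame greedy argmax below is an assumption, as are all function names and the toy data. The abstract speaks of maximizing a "cosine distance"; this sketch assumes the intended quantity is cosine similarity, so that the best-matching frames are paired.

```python
import math

def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity of two equal-length vectors (eps guards against zero norms)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)

def greedy_alignment(source, transformed):
    """For each source embedding, return the index of the most similar
    transformed embedding. Hypothetical helper: a greedy, per-frame search
    chosen for illustration only."""
    alignment = []
    for u in source:
        scores = [cosine_similarity(u, v) for v in transformed]
        alignment.append(max(range(len(scores)), key=scores.__getitem__))
    return alignment

def reorder_speech_frames(speech_frames, alignment):
    """Pick the target-accent speech frame assigned to each source frame."""
    return [speech_frames[j] for j in alignment]

# Toy data (assumed): three 2-dimensional phonetic embedding vectors per side.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
transformed = [[0.0, 2.0], [3.0, 3.0], [2.0, 0.0]]
speech_frames = ["frame_a", "frame_b", "frame_c"]

alignment = greedy_alignment(source, transformed)
aligned = reorder_speech_frames(speech_frames, alignment)
print(alignment)  # [2, 0, 1]
print(aligned)    # ['frame_c', 'frame_a', 'frame_b']
```

A greedy argmax does not force the alignment to be monotonic in time. A real-time system would likely need such a constraint, for example a dynamic-programming search over the same similarity scores, but the abstract does not specify one.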