-
Publication No.: US20250131940A1
Publication date: 2025-04-24
Application No.: US18539764
Application date: 2023-12-14
Applicant: Cisco Technology, Inc.
Inventor: Rafal Pilarczyk, Amir Salah Abdelsamie Abdelwahed, Hui-Ling Lu, Ivana Balic, Yusuf Ziya Isik, David Guoqing Zhang, Xuehong Mao, Samer Lutfi Hijazi
IPC: G10L21/043, G10L19/00
Abstract: A data-driven audio codec system that involves producing multiple compressed streams comprising encoded information (e.g., codeword indices) at different time scales (time intervals or frequency). This may allow for separation of different properties of speech, such as content and aspects of style (prosody), into the different compressed streams without explicitly enforcing it, i.e., in an unsupervised manner. Speech audio is encoded to produce a plurality of encoded streams comprising encoded information for the speech audio at different time scales. The plurality of encoded streams are decoded to generate output audio.
-
Publication No.: US12149263B2
Publication date: 2024-11-19
Application No.: US18079441
Application date: 2022-12-12
Applicant: Cisco Technology, Inc.
Inventor: Yusuf Ziya Isik, Amir Salah Abdelsamie Abdelwahed, Xuehong Mao, Ivana M. Balic, Samer Lutfi Hijazi
Abstract: In some aspects, the techniques described herein relate to a method including: obtaining data to be compressed; determining a distance between the data to be compressed and each codeword of a plurality of codewords; selecting a predetermined number of codewords of the plurality of codewords based on the distance between the data to be compressed and each of the predetermined number of codewords; and generating compressed data, where the compressed data includes an indication of the predetermined number of codewords of the plurality of codewords.
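The selection step in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and a plain list of indices as the "indication" of the selected codewords; the abstract fixes neither choice, and the function name `select_nearest_codewords` is hypothetical.

```python
import math

def select_nearest_codewords(x, codebook, k):
    """Return the indices of the k codewords closest to x.

    Assumption: Euclidean distance; the abstract only says "a distance".
    """
    distances = [math.dist(x, codeword) for codeword in codebook]
    # Sort codeword indices by ascending distance and keep the first k.
    order = sorted(range(len(codebook)), key=lambda i: distances[i])
    return order[:k]

# Toy 2-D codebook with four codewords (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
data = (0.9, 0.2)

# The "compressed data" here is just the list of selected indices.
compressed = select_nearest_codewords(data, codebook, k=2)
```

With these toy values the two nearest codewords are index 1 and then index 3, so the compressed representation is two small integers instead of the original vector.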
-
Publication No.: US20240322942A1
Publication date: 2024-09-26
Application No.: US18680660
Application date: 2024-05-31
Applicant: Cisco Technology, Inc.
Inventor: Amir Salah Abdelsamie Abdelwahed, Ivana Balic, Yusuf Ziya Isik, Xuehong Mao, Samer Lutfi Hijazi
CPC classification number: H04L1/0041, H04L1/0002, H04L1/0045, H04L69/22
Abstract: In some aspects, the techniques described herein relate to a method including: encoding a current data portion to generate an encoded current data portion for inclusion in a data packet; encoding, based upon content of the current data portion, a forward error correction data portion for a previous data portion to generate an encoded forward error correction data portion; generating the data packet including the encoded current data portion and the encoded forward error correction data portion; and providing the data packet to a receiver.
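The packet layout described in this abstract can be sketched as follows. The abstract does not say how the forward error correction portion depends on the current data portion, so this sketch substitutes a simple XOR of equal-length frames as a stand-in; the dictionary packet format and all function names are hypothetical.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("frames must have equal length in this sketch")
    return bytes(p ^ q for p, q in zip(a, b))

def build_packet(seq, current, previous):
    """Bundle the current frame with redundancy for the previous frame.

    Assumption: the FEC portion is previous XOR current, so it is derived
    from the content of the current frame. The first packet has no
    previous frame and carries an empty FEC portion.
    """
    fec = xor_bytes(previous, current) if previous is not None else b""
    return {"seq": seq, "payload": current, "fec_prev": fec}

def recover_previous(packet):
    """Rebuild the previous frame from a packet that arrived after a loss."""
    return xor_bytes(packet["fec_prev"], packet["payload"])

frame0 = b"\x01\x02\x03\x04"
frame1 = b"\x10\x20\x30\x40"

packet0 = build_packet(0, frame0, None)
packet1 = build_packet(1, frame1, frame0)

# Suppose packet0 is lost in transit; packet1 alone restores frame0.
restored = recover_previous(packet1)
```

A receiver that gets `packet1` but not `packet0` can still reconstruct `frame0`, at the cost of one extra frame of delay.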
-
Publication No.: US12040894B1
Publication date: 2024-07-16
Application No.: US18151616
Application date: 2023-01-09
Applicant: Cisco Technology, Inc.
Inventor: Amir Salah Abdelsamie Abdelwahed, Ivana Balic, Yusuf Ziya Isik, Xuehong Mao, Samer Lutfi Hijazi
CPC classification number: H04L1/0041, H04L1/0002, H04L1/0045, H04L69/22
Abstract: In some aspects, the techniques described herein relate to a method including: encoding a current data portion to generate an encoded current data portion for inclusion in a data packet; encoding, based upon content of the current data portion, a forward error correction data portion for a previous data portion to generate an encoded forward error correction data portion; generating the data packet including the encoded current data portion and the encoded forward error correction data portion; and providing the data packet to a receiver.
-
Publication No.: US20240161765A1
Publication date: 2024-05-16
Application No.: US17988376
Application date: 2022-11-16
Applicant: Cisco Technology, Inc.
Inventor: Kamil Krzysztof Wojcicki, Xuehong Mao, David Guoqing Zhang, Samer Hijazi, Raul Alejandro Casas
IPC: G10L21/0208, G06N20/00, G10L25/78
CPC classification number: G10L21/0208, G06N20/00, G10L25/78
Abstract: In one example embodiment, speech signals are received from a user during a communication session. The received speech signals contain noise including speech of other individuals. The received speech signals are transformed by a machine learning model to produce transformed speech signals corresponding to the received speech signals with a reduced amount of the noise. The machine learning model is trained with speech of the user satisfying a noise threshold and collected during one or more communication sessions.
-
Publication No.: US20240371392A1
Publication date: 2024-11-07
Application No.: US18773339
Application date: 2024-07-15
Applicant: Cisco Technology, Inc.
Inventor: Samer Hijazi, Xuehong Mao, Raul Alejandro Casas, Kamil Krzysztof Wojcicki, Dror Maydan, Christopher Rowen
IPC: G10L21/0364, G10L15/22, G10L15/25, G10L17/00, G10L21/055, G10L25/30
Abstract: Systems and methods are disclosed for audio enhancement. For example, methods may include accessing audio data; determining a window of audio samples based on the audio data; inputting the window of audio samples to a classifier to obtain a classification, in which the classifier includes a neural network and the classification takes a value from a set of multiple classes of audio; selecting, based on the classification, an audio enhancement network from a set of multiple audio enhancement networks; applying the selected audio enhancement network to the window of audio samples to obtain an enhanced audio segment, in which the selected audio enhancement network includes a neural network that has been trained using audio signals of a type associated with the classification; and storing, playing, or transmitting an enhanced audio signal based on the enhanced audio segment.
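The classify-then-select flow in this abstract can be sketched as a simple dispatch. The abstract specifies neural networks for both the classifier and the enhancers; the energy threshold and the two toy enhancement functions below are hypothetical stand-ins used only to show the control flow.

```python
def classify(window):
    """Stand-in classifier: label a window by its mean energy.

    Assumption: the real classifier is a neural network; the 0.01
    threshold and the two class names are illustrative only.
    """
    energy = sum(s * s for s in window) / len(window)
    return "speech" if energy > 0.01 else "noise"

def enhance_speech(window):
    """Stand-in speech enhancer: normalize the peak amplitude to 1.0."""
    peak = max(abs(s) for s in window)
    return [s / peak for s in window] if peak > 0 else list(window)

def enhance_noise(window):
    """Stand-in noise enhancer: attenuate the window."""
    return [s * 0.1 for s in window]

# One enhancer per audio class, selected by the classifier's output.
ENHANCERS = {"speech": enhance_speech, "noise": enhance_noise}

def enhance(window):
    label = classify(window)
    return label, ENHANCERS[label](window)

loud_label, loud_out = enhance([0.5, -0.5, 0.5, -0.5])
quiet_label, quiet_out = enhance([0.01, -0.01, 0.01, -0.01])
```

Keeping the enhancers in a mapping keyed by class means a new audio class only needs a new entry, not a change to the dispatch logic.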
-
Publication No.: US20240195438A1
Publication date: 2024-06-13
Application No.: US18079441
Application date: 2022-12-12
Applicant: Cisco Technology, Inc.
Inventor: Yusuf Ziya Isik, Amir Salah Abdelsamie Abdelwahed, Xuehong Mao, Ivana M. Balic, Samer Lutfi Hijazi
IPC: H03M13/00
CPC classification number: H03M13/6312, H03M13/6577
Abstract: In some aspects, the techniques described herein relate to a method including: obtaining data to be compressed; determining a distance between the data to be compressed and each codeword of a plurality of codewords; selecting a predetermined number of codewords of the plurality of codewords based on the distance between the data to be compressed and each of the predetermined number of codewords; and generating compressed data, where the compressed data includes an indication of the predetermined number of codewords of the plurality of codewords.
-
Publication No.: US20250131933A1
Publication date: 2025-04-24
Application No.: US18539804
Application date: 2023-12-14
Applicant: Cisco Technology, Inc.
Inventor: Amir Salah Abdelsamie Abdelwahed, Yusuf Ziya Isik, Xuehong Mao, Samir Ouelha, Samer Lutfi Hijazi
Abstract: A method of performing packet loss concealment in a neural audio encoder/decoder (codec) system. The method includes receiving an indication of a lost audio packet at a receive side of a neural network audio codec system that includes an audio encoder and an audio decoder, wherein the lost audio packet comprises an index of a codeword that is representative of a portion of speech audio presented to the audio encoder, predicting the index of the codeword in the lost packet to obtain a predicted index, deriving a predicted embedding vector from the predicted index, and decoding, by the audio decoder, the embedding vector to generate an audio output.
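The concealment steps in this abstract (predict the lost index, look up its embedding, hand it to the decoder) can be sketched as below. The abstract does not describe the predictor, so a bigram frequency table over previously received indices is used here as a hypothetical stand-in, and the decoder itself is left out.

```python
from collections import Counter, defaultdict

def train_bigram(index_history):
    """Count which codeword index tends to follow each index."""
    successors = defaultdict(Counter)
    for prev, nxt in zip(index_history, index_history[1:]):
        successors[prev][nxt] += 1
    return successors

def predict_index(successors, last_index):
    """Predict the lost index from the last index that was received.

    Assumption: a most-frequent-successor rule stands in for whatever
    predictor the codec actually uses. Falls back to repeating the last
    index when no history exists for it.
    """
    counts = successors.get(last_index)
    if not counts:
        return last_index
    return counts.most_common(1)[0][0]

def conceal(successors, codebook, last_index):
    """Return the predicted index and its embedding for the decoder."""
    predicted = predict_index(successors, last_index)
    return predicted, codebook[predicted]

# Toy codebook of three 2-D embedding vectors (illustrative values only).
codebook = [[0.0, 0.1], [0.5, 0.4], [0.9, 0.8]]
received = [0, 1, 2, 0, 1, 2, 0, 1]

model = train_bigram(received)
# The packet after the final index 1 is lost; predict what it carried.
predicted_index, predicted_embedding = conceal(model, codebook, received[-1])
```

In this toy history index 1 is always followed by index 2, so the predicted embedding is the third codebook entry, which would then be passed to the audio decoder in place of the missing one.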
-
Publication No.: US20250131919A1
Publication date: 2025-04-24
Application No.: US18539791
Application date: 2023-12-14
Applicant: Cisco Technology, Inc.
Inventor: Xuehong Mao, Samer Lutfi Hijazi, Christopher Rowen, Mathew Shaji Kavalekalam, Ivana Balic, Mengjun Leng, Yusuf Ziya Isik, Adam Ali Sabra, Amir Salah Abdelsamie Abdelwahed, Samir Ouelha, Mihailo Kolundzija
Abstract: A neural network audio codec system and related methods are provided. In one example, a method is provided comprising: obtaining speech audio to be encoded; applying the speech audio to an audio encoder that is part of a neural network audio codec system that includes the audio encoder and an audio decoder. The audio encoder and the audio decoder have been trained in an end-to-end manner. The speech audio is encoded with the audio encoder to generate embedding vectors that represent a snapshot of speech audio attributes over successive timeframes of the raw speech audio, and from the embedding vectors, codeword indices are generated to entries in a codebook. The codeword indices are then transmitted or stored for later retrieval and processing by the audio decoder.
-
Publication No.: US20240235727A1
Publication date: 2024-07-11
Application No.: US18151616
Application date: 2023-01-09
Applicant: Cisco Technology, Inc.
Inventor: Amir Salah Abdelsamie Abdelwahed, Ivana Balic, Yusuf Ziya Isik, Xuehong Mao, Samer Lutfi Hijazi
CPC classification number: H04L1/0041, H04L1/0002, H04L1/0045, H04L69/22
Abstract: In some aspects, the techniques described herein relate to a method including: encoding a current data portion to generate an encoded current data portion for inclusion in a data packet; encoding, based upon content of the current data portion, a forward error correction data portion for a previous data portion to generate an encoded forward error correction data portion; generating the data packet including the encoded current data portion and the encoded forward error correction data portion; and providing the data packet to a receiver.
-