-
Publication number: US20230117129A1
Publication date: 2023-04-20
Application number: US17504726
Filing date: 2021-10-19
Applicant: Cisco Technology, Inc.
Inventor: Ali Mouline, Christopher Rowen, David Guoqing Zhang, Francis Anthony Kurupacheril
Abstract: Presented herein are techniques in which a device detects a phrase spoken in an online collaboration session between a plurality of users, the phrase being spoken by a first user to one or more second users. The device determines that the phrase indicates an issue with a quality of user experience of the online collaboration session; labels a log of metrics associated with the online collaboration session with a time stamp corresponding to a time when the phrase was spoken, to provide a labeled log of metrics; and performs one or more actions to improve the user experience based on detecting the phrase.
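The detect, label, and act sequence in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the phrase list, the `MetricSample` fields, the nearest-sample labeling rule, and the `flag_for_review` action are all hypothetical choices made only for illustration, and plain substring matching stands in for whatever speech recognition the device would actually use.

```python
from dataclasses import dataclass, field

# Hypothetical trigger phrases; the abstract does not list any.
ISSUE_PHRASES = ("you're breaking up", "can you hear me", "you are frozen")


@dataclass
class MetricSample:
    timestamp: float                 # seconds since session start
    packet_loss: float               # fraction between 0 and 1 (assumed metric)
    labels: list = field(default_factory=list)


def detect_issue_phrase(utterance):
    """Return the first known issue phrase found in the utterance, else None."""
    text = utterance.lower()
    for phrase in ISSUE_PHRASES:
        if phrase in text:
            return phrase
    return None


def label_log(log, spoken_at, phrase):
    """Attach (spoken_at, phrase) to the metric sample closest in time."""
    nearest = min(log, key=lambda s: abs(s.timestamp - spoken_at))
    nearest.labels.append((spoken_at, phrase))
    return nearest


def handle_utterance(log, utterance, spoken_at):
    """Detect an issue phrase, label the log, and return a placeholder action."""
    phrase = detect_issue_phrase(utterance)
    if phrase is None:
        return None
    sample = label_log(log, spoken_at, phrase)
    # Placeholder for "one or more actions"; the real actions are not specified here.
    return {"action": "flag_for_review", "phrase": phrase,
            "sample_time": sample.timestamp}
```

Calling `handle_utterance(log, "Sorry, you're breaking up again", 5.8)` on a log sampled every five seconds would label the sample at 5.0 s with the time the phrase was spoken, which is the kind of labeled log the abstract describes.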
-
Publication number: US20250131940A1
Publication date: 2025-04-24
Application number: US18539764
Filing date: 2023-12-14
Applicant: Cisco Technology, Inc.
Inventor: Rafal Pilarczyk, Amir Salah Abdelsamie Abdelwahed, Hui-Ling Lu, Ivana Balic, Yusuf Ziya Isik, David Guoqing Zhang, Xuehong Mao, Samer Lutfi Hijazi
IPC: G10L21/043, G10L19/00
Abstract: A data-driven audio codec system that involves producing multiple compressed streams comprising encoded information (e.g., codeword indices) at different time scales (time intervals or frequency). This may allow for separation of different properties of speech, such as content and aspects of style (prosody), into the different compressed streams without explicitly enforcing it, i.e., in an unsupervised manner. Speech audio is encoded to produce a plurality of encoded streams comprising encoded information for the speech audio at different time scales. The plurality of encoded streams are decoded to generate output audio.
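The idea of encoding one signal into several streams of codeword indices at different time scales can be shown with a toy Python sketch. The abstract describes a data-driven codec, so the hand-written codebooks, the stride of four frames, and the scalar "frames" below are assumptions chosen only to keep the example small; a real system would learn its codebooks and operate on feature vectors.

```python
# Hand-written codebooks (assumption); a data-driven codec would learn these.
COARSE_CODEBOOK = [-1.0, -0.5, 0.0, 0.5, 1.0]
FINE_CODEBOOK = [-0.25, -0.125, 0.0, 0.125, 0.25]
COARSE_STRIDE = 4  # one coarse index per 4 frames (assumed time scale)


def nearest_index(value, codebook):
    """Index of the codeword closest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))


def encode(frames):
    """Return (coarse, fine) index streams at two different time scales."""
    coarse, fine = [], []
    for start in range(0, len(frames), COARSE_STRIDE):
        block = frames[start:start + COARSE_STRIDE]
        # Slow stream: one index per block, quantizing the block average.
        c = nearest_index(sum(block) / len(block), COARSE_CODEBOOK)
        coarse.append(c)
        # Fast stream: one index per frame, quantizing what the slow stream missed.
        for x in block:
            fine.append(nearest_index(x - COARSE_CODEBOOK[c], FINE_CODEBOOK))
    return coarse, fine


def decode(coarse, fine):
    """Rebuild frames by adding each fine codeword to its block's coarse codeword."""
    return [COARSE_CODEBOOK[coarse[i // COARSE_STRIDE]] + FINE_CODEBOOK[f]
            for i, f in enumerate(fine)]
```

The slow stream carries one index per block and the fast stream one index per frame, so the two streams are free to capture different properties of the signal, loosely mirroring the separation of content and prosody that the abstract mentions.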
-
Publication number: US12230262B2
Publication date: 2025-02-18
Application number: US17504726
Filing date: 2021-10-19
Applicant: Cisco Technology, Inc.
Inventor: Ali Mouline, Christopher Rowen, David Guoqing Zhang, Francis Anthony Kurupacheril
IPC: G10L15/22, G10L15/08, G10L15/30, H04L65/401, G06F40/30, H04L65/403, H04L65/80
Abstract: Presented herein are techniques in which a device detects a phrase spoken in an online collaboration session between a plurality of users, the phrase being spoken by a first user to one or more second users. The device determines that the phrase indicates an issue with a quality of user experience of the online collaboration session; labels a log of metrics associated with the online collaboration session with a time stamp corresponding to a time when the phrase was spoken, to provide a labeled log of metrics; and performs one or more actions to improve the user experience based on detecting the phrase.
-
Publication number: US20250087215A1
Publication date: 2025-03-13
Application number: US18960064
Filing date: 2024-11-26
Applicant: Cisco Technology, Inc.
Inventor: Ali Mouline, Christopher Rowen, David Guoqing Zhang, Francis Anthony Kurupacheril
IPC: G10L15/22, G10L15/08, G10L15/30, H04L65/401
Abstract: Presented herein are techniques in which a device detects a phrase spoken in an online collaboration session between a plurality of users, the phrase being spoken by a first user to one or more second users. The device determines that the phrase indicates an issue with a quality of user experience of the online collaboration session; labels a log of metrics associated with the online collaboration session with a time stamp corresponding to a time when the phrase was spoken; and performs one or more actions to improve the user experience based on detecting the phrase.
-
Publication number: US20240161765A1
Publication date: 2024-05-16
Application number: US17988376
Filing date: 2022-11-16
Applicant: Cisco Technology, Inc.
Inventor: Kamil Krzysztof Wojcicki, Xuehong Mao, David Guoqing Zhang, Samer Hijazi, Raul Alejandro Casas
IPC: G10L21/0208, G06N20/00, G10L25/78
CPC classification number: G10L21/0208, G06N20/00, G10L25/78
Abstract: In one example embodiment, speech signals are received from a user during a communication session. The received speech signals contain noise including speech of other individuals. The received speech signals are transformed by a machine learning model to produce transformed speech signals corresponding to the received speech signals with a reduced amount of the noise. The machine learning model is trained with speech of the user satisfying a noise threshold and collected during one or more communication sessions.
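The step of collecting only user speech that satisfies a noise threshold can be sketched in Python as a simple filter over recorded clips. The abstract does not say how the threshold is defined, so treating it as a minimum signal-to-noise ratio in decibels is an assumption, as are the clip dictionary fields and the 20 dB default.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two (positive) power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)


def select_training_clips(clips, min_snr_db=20.0):
    """Keep clips clean enough to serve as training speech for this user.

    Each clip is a dict with hypothetical keys "id", "signal_power",
    and "noise_power"; min_snr_db is an assumed form of the noise threshold.
    """
    return [
        clip for clip in clips
        if snr_db(clip["signal_power"], clip["noise_power"]) >= min_snr_db
    ]
```

Clips gathered across several communication sessions could be passed through `select_training_clips` before being used to train the per-user model, so that noisy recordings do not end up as training targets.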