-
Publication No.: US20240256706A1
Publication date: 2024-08-01
Application No.: US18102636
Filing date: 2023-01-27
Inventors: Qutubuddin SAIFEE, Raghuram ANNADANA, Sunil Kashinath SHRIPAD, Manoj SINGHAL, Stephen Ray PALM
IPC classification: G06F21/62, G10L15/22, G10L15/30, G10L21/007, G10L21/0216, G10L21/028, G10L21/034
CPC classification: G06F21/6254, G10L15/22, G10L15/30, G10L21/007, G10L21/0216, G10L21/028, G10L21/034, G10L2021/02082
Abstract: A system includes a first module and a second module. The first module may be configured to perform operations including generating voice data based on an input audio, anonymizing the voice data by applying a first audio transformation, and transmitting the anonymized voice data to a first remote ASR module for generating speech recognition data. The second module may be configured to perform operations including separating the input audio into a first data and a second data, anonymizing the first data by applying a second audio transformation to the first data, generating an anonymized audio data by combining the anonymized first data and the second data, and transmitting the anonymized audio data to a second remote ASR module for generating speech recognition data.
-
Publication No.: US12015655B2
Publication date: 2024-06-18
Application No.: US18039026
Filing date: 2021-01-18
Inventors: Tommy Arngren, Tommy Falk, Andreas Kristensson, Peter Ökvist
IPC classification: H04L65/403, G10L15/02, G10L15/22, G10L21/034, G10L21/0364
CPC classification: H04L65/403, G10L15/02, G10L15/22, G10L21/034, G10L21/0364
Abstract: A method performed by a system of a communication network includes obtaining digital representations of speech detected from communication devices connected to a teleconference, and receiving a request for a parallel discussion from a first of the communication devices with a subgroup of the communication devices. Further, the system sets up a parallel discussion group for the first communication device and the subgroup of communication devices, provides the digital representations of speech of the first communication device and the subgroup of communication devices only to the devices of the parallel discussion group so that each device of the parallel discussion group is able to play back the digital representations of speech of the other devices of the parallel discussion group, and provides the digital representations of speech of the plurality of communication devices except the first communication device and the subgroup of communication devices.
-
Publication No.: US11924367B1
Publication date: 2024-03-05
Application No.: US17668297
Filing date: 2022-02-09
Inventors: Jean-Marc Valin, Karim Helwani, Srikanth Venkata Tenneti, Erfan Soltanmohammadi, Mehmet Umut Isik, Richard Newman, Michael Mark Goodwin, Arvindh Krishnaswamy
IPC classification: H04M3/00, G10L21/0232, G10L21/034, G10L25/18, H04S3/00, G10L21/0208
CPC classification: H04M3/002, G10L21/0232, G10L21/034, G10L25/18, H04S3/008, G10L2021/02082, H04S2400/01, H04S2400/03
Abstract: Joint noise and echo suppression may be performed for enhancing two-way audio communications. Audio data captured at a communication device and audio data transmitted to the communication device from another communication device are used as input features to a trained machine learning model that uses the transmitted audio data as a reference signal to eliminate residual echo in the captured audio data when also suppressing noise in the captured audio data.
-
Publication No.: US20240046950A1
Publication date: 2024-02-08
Application No.: US17881355
Filing date: 2022-08-04
CPC classification: G10L21/034, G06T7/70, G06F3/167, G10L15/063, G10L25/60, G10L15/25, G06T2207/30201
Abstract: An electronic device includes an imager capturing one or more images of a subject engaging the electronic device and an audio input receiving acoustic signals having audible frequencies from the mouth of the subject engaging the electronic device. One or more processors determine from the one or more images of the subject whether the mouth of the subject is oriented on-axis relative to the audio input or off-axis relative to the audio input. The one or more processors adjust a gain of the audio input associated with a subset of the audible frequencies when the mouth of the subject is oriented off-axis relative to the audio input.
-
Publication No.: US11887618B2
Publication date: 2024-01-30
Application No.: US17723316
Filing date: 2022-04-18
Inventors: Junbin Liang
IPC classification: G10L21/0364, G10L21/0316, H04M3/56, G10L19/16, G10L21/034, G10L25/18, G10L25/21, G10L25/51, G10L25/84
CPC classification: G10L21/0364, G10L19/167, G10L21/034, G10L21/0316, G10L25/18, G10L25/21, G10L25/51, G10L25/84, H04M3/568
Abstract: A call audio mixing processing method is provided. In the method, call audio streams from terminals of call members participating in a call are obtained. Voice analysis is performed on the call audio streams to determine voice activity corresponding to each of the terminals. The voice activity of the terminals indicates activity levels of the call members participating in the call. According to the voice activity of the terminals, respective voice adjustment parameters corresponding to the terminals are determined. According to the respective voice adjustment parameters corresponding to the terminals, the call audio streams of the terminals are adjusted. Further, mixing processing is performed on the adjusted call audio streams to obtain a mixed audio stream.
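The flow in the abstract above (per-terminal voice activity, then a per-terminal adjustment parameter, then mixing) can be sketched roughly as follows. This is only an illustrative sketch and not the patented method: the energy-threshold activity measure, the linear mapping from activity to gain, and the names `voice_activity`, `mix_call`, `energy_threshold`, and `floor` are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one non-empty frame."""
    return sum(s * s for s in frame) / len(frame)


def voice_activity(stream: Sequence[float], frame_len: int,
                   energy_threshold: float) -> float:
    """Fraction of frames whose energy exceeds the threshold.

    Assumption: a crude energy detector stands in for the abstract's
    unspecified "voice analysis".
    """
    frames = [stream[i:i + frame_len] for i in range(0, len(stream), frame_len)]
    if not frames:
        return 0.0
    active = sum(1 for f in frames if frame_energy(f) > energy_threshold)
    return active / len(frames)


def mix_call(streams: Sequence[Sequence[float]], frame_len: int = 4,
             energy_threshold: float = 0.01,
             floor: float = 0.2) -> Tuple[List[float], List[float]]:
    """Scale each stream by an activity-derived gain, then sum the streams.

    Assumption: the adjustment parameter is a gain that grows linearly
    from `floor` (least active) to 1.0 (most active terminal).
    """
    if not streams:
        return [], []
    activities = [voice_activity(s, frame_len, energy_threshold) for s in streams]
    peak = max(activities)
    if peak > 0:
        gains = [floor + (1.0 - floor) * (a / peak) for a in activities]
    else:
        gains = [1.0] * len(streams)  # nobody is talking: leave levels alone
    mixed = [0.0] * max(len(s) for s in streams)
    for stream, gain in zip(streams, gains):
        for i, sample in enumerate(stream):
            mixed[i] += gain * sample
    return mixed, gains
```

For example, `mix_call([talker, quiet])` with one active and one silent stream leaves the talker at full level and scales the silent terminal's stream by `floor`. The sketch does not clip or limit the summed signal, which a real mixer would need to handle.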
-
Publication No.: US20240031489A1
Publication date: 2024-01-25
Application No.: US17871513
Filing date: 2022-07-22
Applicant: Google LLC
Inventors: Henrik Fahlberg Lundin, Alessio Bazzica, Esbjörn Dominique, Per Erik Daniel Johansson, Tomas Gunnarsson, Markus Lindroth, Karl Allan Tore Rudberg
IPC classification: H04M3/56, G10L21/0364, G10L25/51, G10L25/84, G10L21/028, G10L21/034, G10L17/06
CPC classification: H04M3/568, G10L21/0364, G10L25/51, G10L25/84, G10L21/028, G10L21/034, G10L17/06
Abstract: Methods, systems, and apparatus for normalizing audio transmissions from multiple endpoints within a teleconference. A first audio transmission from a first participant of a teleconference can be received for presentation at the teleconference. The first audio transmission can be analyzed to classify one or more audio signatures of the first audio transmission as speech. A difference can be determined between the audio level of the one or more audio signatures and an audio level of second transmissions. Based on the difference, the first audio transmission can be normalized to adjust a gain of the first transmission. The transmission can be output to the teleconference.
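A minimal sketch of the level-difference normalization described in the abstract above might look like the following. It is not the claimed implementation: the RMS-in-dBFS level measure, the energy-floor speech classifier, the gain clamp, and the names `rms_dbfs`, `normalize_to_reference`, `speech_floor_db`, and `max_gain_db` are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale (1.0); -inf for silence."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def normalize_to_reference(first: Sequence[float], others_level_db: float,
                           frame_len: int = 4,
                           speech_floor_db: float = -50.0,
                           max_gain_db: float = 12.0) -> Tuple[List[float], float]:
    """Adjust the gain of `first` toward the level of the other transmissions.

    Assumption: frames louder than `speech_floor_db` count as speech; the
    abstract does not say how speech is classified.
    """
    frames = [first[i:i + frame_len] for i in range(0, len(first), frame_len)]
    speech = [s for f in frames if rms_dbfs(f) > speech_floor_db for s in f]
    level_db = rms_dbfs(speech)
    if level_db == float("-inf"):
        return list(first), 0.0  # no speech found: leave the signal untouched
    # Difference between the reference level and this participant's speech level.
    diff_db = others_level_db - level_db
    gain_db = max(-max_gain_db, min(max_gain_db, diff_db))
    gain = 10.0 ** (gain_db / 20.0)
    return [s * gain for s in first], gain_db
```

Measuring the level only over frames classified as speech keeps pauses and background noise from dragging the estimate down, which would otherwise cause the quiet participant to be over-amplified.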
-
Publication No.: US20230421620A1
Publication date: 2023-12-28
Application No.: US18039026
Filing date: 2021-01-18
Inventors: Tommy ARNGREN, Tommy FALK, Andreas KRISTENSSON, Peter ÖKVIST
IPC classification: H04L65/403, G10L21/034, G10L15/02, G10L15/22, G10L21/0364
CPC classification: H04L65/403, G10L21/034, G10L15/02, G10L15/22, G10L21/0364
Abstract: A method performed by a system of a communication network includes obtaining digital representations of speech detected from communication devices connected to a teleconference, and receiving a request for a parallel discussion from a first of the communication devices with a subgroup of the communication devices. Further, the system sets up a parallel discussion group for the first communication device and the subgroup of communication devices, provides the digital representations of speech of the first communication device and the subgroup of communication devices only to the devices of the parallel discussion group so that each device of the parallel discussion group is able to play back the digital representations of speech of the other devices of the parallel discussion group, and provides the digital representations of speech of the plurality of communication devices except the first communication device and the subgroup of communication devices.
-
Publication No.: US20230419984A1
Publication date: 2023-12-28
Application No.: US18465070
Filing date: 2023-09-11
IPC classification: G10L21/0364, G10L21/034, G10L25/30, G10L25/21, G10L25/18
CPC classification: G10L21/0364, G10L21/034, G10L25/30, G10L25/21, G10L25/18
Abstract: An apparatus for providing an estimate of a loudness of signal components of interest of an audio signal is provided. The apparatus has an input interface configured to receive a plurality of samples of the audio signal. Moreover, the apparatus has a neural network configured to receive as input values the plurality of samples of the audio signal or a plurality of derived values being derived from the plurality of samples of the audio signal, and configured to determine at least one output value from the plurality of input values, such that the at least one output value indicates the estimate of the loudness of the signal components of interest of the audio signal.
-
Publication No.: US20230410828A1
Publication date: 2023-12-21
Application No.: US17845655
Filing date: 2022-06-21
Applicant: Apple Inc.
Inventors: Ramin Pishehvar, Mehrez Souden, Sean A. Ramprashad, Jason Wung, Ante Jukic, Joshua D. Atkins
IPC classification: G10L21/0232, G06V40/16, G10L25/84, G10L21/034, G10L21/0364, G10L15/25, G10L15/06, G10L15/22
CPC classification: G10L21/0232, G06V40/161, G10L25/84, G10L21/034, G10L21/0364, G10L15/25, G10L15/063, G10L15/22
Abstract: Disclosed is a reference-less echo mitigation or cancellation technique. The technique enables suppression of echoes from an interference signal when a reference version of the interference signal conventionally used for echo mitigation may not be available. A first stage of the technique may use a machine learning model to model a target audio area surrounding a device so that a target audio signal estimated as originating from within the target audio area may be accepted. In contrast, audio signals such as playback of media content on a TV or other interfering signals estimated as originating from outside the target audio area may be suppressed. A second stage of the technique may be a level-based suppressor that further attenuates the residual echo from the output of the first stage based on an audio level threshold. Side information may be provided to adjust the target audio area or the audio level threshold.
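The second stage mentioned in the abstract above, a level-based suppressor driven by an audio level threshold, could be sketched along these lines. This is a hypothetical frame-wise gate and not the disclosed design: the framing, the fixed attenuation, and the names `level_based_suppress`, `threshold_db`, and `attenuation_db` are assumptions, and the machine-learning first stage is not modeled at all.

```python
import math
from typing import List, Sequence


def frame_level_db(frame: Sequence[float]) -> float:
    """RMS level of a frame in dB relative to full scale; -inf for silence."""
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def level_based_suppress(samples: Sequence[float], frame_len: int = 4,
                         threshold_db: float = -30.0,
                         attenuation_db: float = -20.0) -> List[float]:
    """Attenuate frames whose level falls below the threshold.

    Assumption: residual echo left by the first stage is quieter than the
    accepted target speech, so low-level frames are treated as residue.
    """
    attenuation = 10.0 ** (attenuation_db / 20.0)
    out: List[float] = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        gain = attenuation if frame_level_db(frame) < threshold_db else 1.0
        out.extend(s * gain for s in frame)
    return out
```

In this sketch, `threshold_db` plays the role of the adjustable audio level threshold; side information, as the abstract puts it, would simply change that parameter between calls.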
-
Publication No.: US11843916B2
Publication date: 2023-12-12
Application No.: US17167153
Filing date: 2021-02-04
Inventors: Yonatan Wexler, Amnon Shashua, Tal Rosenwein, Roi Nathan
IPC classification: H04R25/00, G06F3/16, G10L17/04, G10L17/06, G10L17/18, G10L21/003, G10L21/034, G10L25/51, H04R1/08, G03B31/00, G06F1/16, G10L21/0272, H04N7/18, G10L17/00, H04N5/38, G10L15/26, G06V20/10, G06V40/10, G06V40/16, G06V40/20, G06F18/21, G06F18/25, H04N23/51, G06V10/80, G06F18/00
CPC classification: H04R25/407, G03B31/00, G06F1/163, G06F1/1686, G06F3/165, G06F3/167, G06F18/21, G06F18/251, G06V10/803, G06V20/10, G06V40/10, G06V40/16, G06V40/165, G06V40/171, G06V40/172, G06V40/20, G10L15/26, G10L17/00, G10L17/04, G10L17/06, G10L17/18, G10L21/003, G10L21/0272, G10L21/034, G10L25/51, H04N5/38, H04N7/185, H04N23/51, H04R1/08, H04R25/405, H04R25/45, H04R25/505, H04R25/554, H04R25/558, H04R25/60, H04R25/606, H04R25/65, G06F18/00, H04R2225/025, H04R2225/41, H04R2225/43, H04R2225/55, H04R2460/01, H04R2460/13
Abstract: A system for selectively amplifying audio signals may include a wearable camera configured to capture a plurality of images from an environment of a user and a microphone configured to capture sounds from an environment of the user. The system may also include a processor programmed to: receive the plurality of images captured by the camera; identify a representation of at least one recognized individual in at least one of the plurality of images; receive audio signals representative of the sounds captured by the microphone; cause selective conditioning of at least one audio signal received by the microphone from a region associated with the at least one recognized individual; and cause transmission of the at least one conditioned audio signal to a hearing interface device configured to provide sound to an ear of the user.