Exploiting visual information for enhancing audio signals via source separation and beamforming

    Publication No.: US10402651B2

    Publication date: 2019-09-03

    Application No.: US15905442

    Filing date: 2018-02-26

    Abstract: A system for exploiting visual information for enhancing audio signals via source separation and beamforming is disclosed. The system may obtain visual content associated with an environment of a user, and may extract, from the visual content, metadata associated with the environment. The system may determine a location of the user based on the extracted metadata. Additionally, the system may load, based on the location, an audio profile corresponding to the location of the user. The system may also load a user profile of the user that includes audio data associated with the user. Furthermore, the system may cancel, based on the audio profile and user profile, noise from the environment of the user. Moreover, the system may adjust, based on the audio profile and user profile, an audio signal generated by the user so as to enhance the audio signal during a communications session of the user.

    Acoustic enhancement by leveraging metadata to mitigate the impact of noisy environments

    Publication No.: US10170133B2

    Publication date: 2019-01-01

    Application No.: US15683067

    Filing date: 2017-08-22

    Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may interfere with user communications. To do so, the system may receive an audio stream including an audio signal associated with a user, and determine whether the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.

    Acoustic environment recognizer for optimal speech processing

    Publication No.: US20180197558A1

    Publication date: 2018-07-12

    Application No.: US15912230

    Filing date: 2018-03-05

    Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. To do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine whether the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.

    Sensor enhanced speech recognition
    Patent application

    Publication No.: US20180137348A1

    Publication date: 2018-05-17

    Application No.: US15868546

    Filing date: 2018-01-11

    IPC classification: G06K9/00

    Abstract: A system for sensor enhanced speech recognition is disclosed. The system may obtain visual content or other content associated with a user and an environment of the user. Additionally, the system may obtain, from the visual content, metadata associated with the user and the environment of the user. The system may also determine, based on the visual content and metadata, whether the user is speaking. If the user is determined to be speaking, the system may obtain audio content associated with the user and the environment. The system may then adapt, based on the visual content, audio content, and metadata, one or more acoustic models that match the user and the environment. Once the one or more acoustic models are adapted and loaded, the system may enhance a speech recognition process or other process associated with the user.

    Pre-distortion system for cancellation of nonlinear distortion in mobile devices

    Publication No.: US09973633B2

    Publication date: 2018-05-15

    Application No.: US14543261

    Filing date: 2014-11-17

    IPC classification: H04M9/08, G10L21/0208

    CPC classification: H04M9/082, G10L2021/02082

    Abstract: A pre-distortion system for improved mobile device communications via cancellation of nonlinear distortion is disclosed. The pre-distortion system may transmit an acoustic signal from a network to a device, wherein the acoustic signal includes a linear signal and a nonlinear cancellation signal that cancels at least a portion of nonlinear distortions created once a loudspeaker in the device emits the linear signal. Thus, when a loudspeaker of a mobile device is operating and nonlinear distortions are generated by the loudspeaker or by components of the mobile device in close proximity to the loudspeaker, the pre-distortion system may create one or more nonlinear cancellation signals in the network. The nonlinear cancellation signal may be combined with the linear signal sent to the loudspeaker to cancel the nonlinear distortion signal created by the loudspeaker emitting acoustic sounds from the linear signal. In this way, the nonlinear cancellation signal becomes a pre-distortion signal.

    Sensor enhanced speech recognition
    Granted patent

    Publication No.: US09870500B2

    Publication date: 2018-01-16

    Application No.: US14302137

    Filing date: 2014-06-11

    Abstract: A system for sensor enhanced speech recognition is disclosed. The system may obtain visual content or other content associated with a user and an environment of the user. Additionally, the system may obtain, from the visual content, metadata associated with the user and the environment of the user. The system may also determine, based on the visual content and metadata, whether the user is speaking. If the user is determined to be speaking, the system may obtain audio content associated with the user and the environment. The system may then adapt, based on the visual content, audio content, and metadata, one or more acoustic models that match the user and the environment. Once the one or more acoustic models are adapted and loaded, the system may enhance a speech recognition process or other process associated with the user.

    Acoustic environment recognizer for optimal speech processing
    Patent application (legal status: in force)

    Publication No.: US20170076736A1

    Publication date: 2017-03-16

    Application No.: US15362372

    Filing date: 2016-11-28

    Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. To do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine whether the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.