METHOD AND APPARATUS FOR AUDIO SUMMARIZATION

    Publication number: US20170199934A1

    Publication date: 2017-07-13

    Application number: US14992700

    Filing date: 2016-01-11

    Applicant: Google Inc.

    CPC classification number: G06F16/638 G06F3/165 G10L25/51 G10L25/78

    Abstract: Summaries of audio or audio-video events are created from audio or audio-video recordings based on the needs of a particular user. The summarized events may have shorter timespans than the actual timespans of audio or audio-video recordings. Audio or audio-video recordings may be provided by one or more recording devices or sensors to a network, such as a cloud. A summarizer is provided in the network, and may include an audio marker, an audio enhancer, and an audio compiler. The audio marker tags segments of an audio or audio-video stream using one or more audio detectors based on user preferences. The audio enhancer may enhance the quality of tagged audio segments by enhancing desired sound features and suppressing undesired sound features. The audio compiler compiles the tagged audio segments based on event scores and generates audio or audio-video summaries for the user.
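
The compile step described in this abstract can be illustrated with a minimal sketch. It assumes segments have already been tagged and scored by the audio marker; the `Segment` fields, the greedy selection rule, and the `max_seconds` budget are hypothetical choices made for illustration, not details taken from the application. The audio enhancer stage is omitted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds from the start of the recording
    end: float
    label: str     # tag assigned by a detector, e.g. "speech" (hypothetical)
    score: float   # event score; higher means more relevant to the user

    @property
    def duration(self) -> float:
        return self.end - self.start


def compile_summary(segments, preferred_labels, max_seconds):
    """Pick the highest-scoring tagged segments that fit in max_seconds."""
    # Keep only segments whose tag matches the user's preferences.
    candidates = [s for s in segments if s.label in preferred_labels]
    # Greedily take the highest-scoring segments while the time budget allows.
    chosen, used = [], 0.0
    for seg in sorted(candidates, key=lambda s: s.score, reverse=True):
        if used + seg.duration <= max_seconds:
            chosen.append(seg)
            used += seg.duration
    # Return the summary in chronological order for playback.
    return sorted(chosen, key=lambda s: s.start)


recording = [
    Segment(0, 10, "speech", 0.4),
    Segment(30, 35, "dog_bark", 0.9),
    Segment(50, 58, "speech", 0.8),
    Segment(70, 75, "music", 0.95),
]
summary = compile_summary(recording, {"speech", "dog_bark"}, max_seconds=15)
```

Here the 75-second recording is reduced to two segments totalling 13 seconds, which matches the abstract's point that a summary may span less time than the recording it is built from.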

Method and System for Detecting an Audio Event for Smart Home Devices

    Type: Invention application

    Legal status: In force

    Publication number: US20160364963A1

    Publication date: 2016-12-15

    Application number: US14737678

    Filing date: 2015-06-12

    Applicant: Google Inc.

    Abstract: This application discloses a method implemented by an electronic device to detect a signature event (e.g., a baby cry event) associated with an audio feature (e.g., baby sound). The electronic device obtains a classifier model from a remote server. The classifier model is determined according to predetermined capabilities of the electronic device and ambient sound characteristics of the electronic device, and distinguishes the audio feature from a plurality of alternative features and ambient noises. When the electronic device obtains audio data, it splits the audio data into a plurality of sound components, each associated with a respective frequency or frequency band and including a series of time windows. The electronic device further extracts a feature vector from the sound components, classifies the extracted feature vector to obtain a probability value according to the classifier model, and detects the signature event based on the probability value.
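
The detection steps in this abstract (split into frequency bands over time windows, extract a feature vector, classify to a probability, detect) can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the band edges, the window length, the mean-over-time feature, and the logistic classifier with `weights` and `bias` are all hypothetical stand-ins for the classifier model that the device would obtain from the remote server.

```python
import numpy as np


def band_energies(audio, sample_rate, window=256,
                  bands=((0, 500), (500, 2000), (2000, 4000))):
    """Return a (num_windows, num_bands) array of log band energies."""
    n_windows = len(audio) // window
    # Cut the signal into non-overlapping time windows.
    frames = audio[: n_windows * window].reshape(n_windows, window)
    # Power spectrum of each Hann-weighted window.
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(window), axis=1)) ** 2
    freqs = np.fft.rfftfreq(window, d=1.0 / sample_rate)
    # Sum the power that falls inside each (assumed) frequency band.
    cols = [spectrum[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
            for lo, hi in bands]
    return np.log1p(np.stack(cols, axis=1))


def feature_vector(energies):
    # Hypothetical feature: average log energy of each band over time.
    return energies.mean(axis=0)


def event_probability(features, weights, bias):
    # Logistic model standing in for the server-provided classifier.
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))


def detect_event(audio, sample_rate, weights, bias, threshold=0.5):
    features = feature_vector(band_energies(audio, sample_rate))
    p = event_probability(features, weights, bias)
    return float(p), bool(p >= threshold)
```

With weights that reward energy in the middle band, for example `np.array([-1.0, 1.0, -1.0])`, a 1 kHz tone sampled at 8 kHz is flagged as an event while a 100 Hz tone is not.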

    MULTI-MICROPHONE NEURAL NETWORK FOR SOUND RECOGNITION

    Publication number: US20170325023A1

    Publication date: 2017-11-09

    Application number: US14988047

    Filing date: 2016-01-05

    Applicant: Google Inc.

    Abstract: A neural network is provided for recognition and enhancement of multi-channel sound signals received by multiple microphones, which need not be aligned in a linear array in a given environment. Directions and distances of sound sources may also be detected by the neural network without the need for a beamformer connected to the microphones. The neural network may be trained by knowledge gained from free-field array impulse responses obtained in an anechoic chamber, array impulse responses that model simulated environments of different reverberation times, and array impulse responses obtained in actual environments.

    Device specific multi-channel data compression

    Publication number: US09875747B1

    Publication date: 2018-01-23

    Application number: US15211417

    Filing date: 2016-07-15

    Applicant: Google Inc.

    CPC classification number: G10L19/008 G10L19/0017 G10L25/30 G10L25/72

    Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
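
The reference-plus-differences idea in this abstract can be sketched in a few lines. Note the substitution: the patent uses a neural network to extract and combine the signal differences, whereas this sketch swaps in plain subtraction and uniform quantization purely to show the data flow. The function names, the `step` size, and the 8-bit residual format are assumptions.

```python
import numpy as np


def compress(channels, step=0.05):
    """channels: (num_mics, num_samples) floats; row 0 is the reference mic."""
    reference = channels[0].copy()          # transmitted without loss
    residuals = channels[1:] - reference    # differences against the reference
    # Lossy step: round each difference to a multiple of `step`, stored in 8 bits.
    quantized = np.clip(np.round(residuals / step), -128, 127).astype(np.int8)
    return reference, quantized


def decompress(reference, quantized, step=0.05):
    # Rebuild the other channels from the reference plus the coarse differences.
    others = reference + quantized.astype(np.float64) * step
    return np.vstack([reference, others])
```

As long as no difference is clipped, the reference channel is recovered exactly and every other channel is recovered to within half a quantization step, which mirrors the abstract's split between a lossless reference signal and a lossy compressed signal.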
