Wakeword and acoustic event detection

    Publication No.: US11043218B1

    Publication date: 2021-06-22

    Application No.: US16452964

    Filing date: 2019-06-26

    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
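
    The wakeword post-processing chain named in this abstract (softmax, smoothing, spike detection) can be illustrated with a minimal sketch. It is not the patented implementation: the two-class logits, the trailing moving average, the local-maximum spike rule, and every name and default value below are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moving_average(values, window):
    """Trailing moving average over up to `window` most recent values."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def detect_spikes(smoothed, threshold):
    """Return indices of local maxima whose value reaches the threshold."""
    spikes = []
    for i in range(1, len(smoothed) - 1):
        is_peak = smoothed[i] > smoothed[i - 1] and smoothed[i] >= smoothed[i + 1]
        if is_peak and smoothed[i] >= threshold:
            spikes.append(i)
    return spikes


def wakeword_spikes(frame_logits, wakeword_index=1, window=3, threshold=0.6):
    """Hypothetical pipeline: per-frame softmax, smoothing, spike detection."""
    posteriors = [softmax(logits)[wakeword_index] for logits in frame_logits]
    return detect_spikes(moving_average(posteriors, window), threshold)
```

    Smoothing before spike detection means a single noisy frame is less likely to trigger a detection than a sustained run of high wakeword posteriors.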

    Optimizing background tasks based on forecast data

    Publication No.: US11003491B1

    Publication date: 2021-05-11

    Application No.: US16138413

    Filing date: 2018-09-21

    Abstract: Techniques for optimizing background tasks based on forecast data are described. Customer workloads may be monitored by a local monitor in a first time period. Future customer workloads in a second time period following the first time period may be forecast based at least on the customer workloads using a local model. A background availability may be determined based at least on the future customer workloads. Execution of at least one background workload may be scheduled to use the background availability during the second time period. The local monitor may then cause the execution of the at least one background workload.
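
    The monitor, forecast, and schedule sequence in this abstract can be sketched roughly as follows. The abstract does not specify the local model or the scheduling policy, so the moving-average forecast, the fixed headroom fraction, the smallest-first greedy packing, and all function and task names are assumptions chosen only to make the idea concrete.

```python
def forecast_next(observed, window=3):
    """Assumed local model: mean of the most recent `window` observations."""
    recent = observed[-window:]
    return sum(recent) / len(recent)


def background_availability(capacity, forecast, headroom=0.1):
    """Capacity left for background work after reserving a safety margin."""
    return max(0.0, capacity * (1 - headroom) - forecast)


def schedule_background(tasks, availability):
    """Greedily pick (name, cost) tasks, cheapest first, that fit the budget."""
    scheduled = []
    remaining = availability
    for name, cost in sorted(tasks, key=lambda task: task[1]):
        if cost <= remaining:
            scheduled.append(name)
            remaining -= cost
    return scheduled


# Customer workload observed in the first time period (arbitrary units).
observed_load = [40, 50, 60]
predicted = forecast_next(observed_load)
budget = background_availability(capacity=100, forecast=predicted)
plan = schedule_background(
    [("compaction", 25), ("scrub", 10), ("rebalance", 30)], budget
)
```

    In this sketch the background work is sized against the forecast for the second period rather than against current load, which is the point the abstract makes.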

    Emotion detection using speaker baseline

    Publication No.: US11545174B2

    Publication date: 2023-01-03

    Application No.: US17178844

    Filing date: 2021-02-18

    Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.

    Emotion detection using speaker baseline

    Publication No.: US20210249035A1

    Publication date: 2021-08-12

    Application No.: US17178844

    Filing date: 2021-02-18

    Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.

    User presence detection
    Invention application

    Publication No.: US20210027798A1

    Publication date: 2021-01-28

    Application No.: US17022197

    Filing date: 2020-09-16

    Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present near the device, even if no wakeword is spoken. Audio such as speech, human-originating sounds (e.g., coughing, sneezing), or other human-related noises (e.g., footsteps, doors closing) can be used to detect presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
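
    The score, smooth, decide, and report steps in this abstract might look like the following sketch. The per-frame scores are taken as given; the centered averaging window, the 0.5 threshold, the "any positive frame in the period" heartbeat rule, and all names are assumptions for illustration rather than details from the patent.

```python
def smooth_scores(scores, radius=1):
    """Average each frame score with its neighbours within `radius` frames."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def frame_decisions(scores, radius=1, threshold=0.5):
    """Per-frame presence decision made on the smoothed scores."""
    return [s >= threshold for s in smooth_scores(scores, radius)]


def heartbeat(decisions, frames_per_beat):
    """One presence flag per reporting period: True if any frame is positive."""
    return [
        any(decisions[i:i + frames_per_beat])
        for i in range(0, len(decisions), frames_per_beat)
    ]


# Hypothetical per-frame presence scores: one isolated blip, then a sustained run.
scores = [0.0, 0.9, 0.0, 0.0, 0.8, 0.9, 0.7, 0.0]
decisions = frame_decisions(scores)
beats = heartbeat(decisions, frames_per_beat=4)
```

    With these example scores the isolated high frame is averaged away, while the sustained run survives smoothing and marks the second reporting period as present.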

    Audio event detection
    Invention grant

    Publication No.: US10803885B1

    Publication date: 2020-10-13

    Application No.: US16023923

    Filing date: 2018-06-29

    Abstract: An audio event detection system processes audio data into audio feature data, then processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.
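
    A rough sketch of the candidate-region flow described in this abstract follows. The abstract does not say how candidate regions are ranked or what the classifier is, so ranking windows by mean feature value, passing the classifier in as a plain callable, and every name below are assumptions for illustration.

```python
def candidate_regions(features, lengths, top_k=3):
    """Slide windows of each pre-configured length; keep the top_k by mean value."""
    candidates = []
    for length in lengths:
        for start in range(0, len(features) - length + 1):
            window = features[start:start + length]
            candidates.append((sum(window) / length, start, length))
    # Stable sort: ties keep the order in which windows were generated.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(start, length) for _, start, length in candidates[:top_k]]


def detect_event(features, lengths, classifier, threshold, top_k=3):
    """Score top candidate regions; return (score, start, length) or None."""
    best = None
    for start, length in candidate_regions(features, lengths, top_k):
        score = classifier(features[start:start + length])
        if best is None or score > best[0]:
            best = (score, start, length)
    if best is not None and best[0] >= threshold:
        return best
    return None


# Hypothetical 1-D feature track and a stand-in classifier (minimum of the region).
features = [0.1, 0.1, 0.9, 0.8, 0.9, 0.1, 0.1]
result = detect_event(features, lengths=[2, 3], classifier=min, threshold=0.5)
```

    Only the top candidate regions reach the classifier, so the more expensive scoring step runs on a handful of windows instead of every possible interval.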
