-
Publication number: US12165644B2
Publication date: 2024-12-10
Application number: US18459982
Application date: 2023-09-01
Applicant: Sonos, Inc.
Inventor: Joachim Fainberg, Daniele Giacobello, Klaus Hartung
Abstract: Systems and methods for media playback via a media playback system include capturing sound data via a network microphone device and identifying a candidate wake word in the sound data. Based on identification of the candidate wake word in the sound data, the system selects a first wake-word engine from a plurality of wake-word engines. Via the first wake-word engine, the system analyzes the sound data to detect a confirmed wake word, and, in response to detecting the confirmed wake word, transmits a voice utterance of the sound data to one or more remote computing devices associated with a voice assistant service.
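The two-pass flow in this abstract (cheap candidate detection, engine selection, confirmation, then transmission) can be sketched as follows. This is a minimal illustration only, not the patented implementation: sound data is stood in for by a text string, and `find_candidate`, `handle_sound`, the `engines` mapping, and the `send_to_service` callback are all hypothetical names introduced here.

```python
def find_candidate(transcript, keywords):
    """Cheap first pass: return the first keyword found in the text, else None."""
    lowered = transcript.lower()
    for keyword in keywords:
        if keyword in lowered:
            return keyword
    return None


def handle_sound(transcript, engines, send_to_service):
    """Select an engine from the candidate wake word, confirm, then transmit.

    engines: dict mapping a candidate wake word to a confirmation function
             (hypothetical stand-in for a wake-word engine).
    send_to_service: callback standing in for transmission to a remote service.
    Returns True only if a confirmed wake word led to a transmission.
    """
    candidate = find_candidate(transcript, engines)
    if candidate is None:
        return False                  # no candidate: no engine is run
    engine = engines[candidate]       # select one engine out of several
    if not engine(transcript):
        return False                  # candidate was not confirmed
    send_to_service(candidate, transcript)
    return True


# Usage with two toy engines that only confirm a wake word at the start.
sent = []
engines = {
    "alexa": lambda t: t.lower().startswith("alexa"),
    "hey sonos": lambda t: t.lower().startswith("hey sonos"),
}
handle_sound("Alexa, play jazz", engines, lambda k, t: sent.append((k, t)))
```

The point of the structure is that the more expensive confirmation step runs only for the one engine matched by the candidate, rather than for every engine on every input.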
-
Publication number: US20230395088A1
Publication date: 2023-12-07
Application number: US18313013
Application date: 2023-05-05
Applicant: Sonos, Inc.
Inventor: Daniele Giacobello
IPC: G10L21/02, G10K11/178, H04M9/08, G10L21/0208
CPC classification number: G10L21/02, G10K11/178, H04M9/082, G10L21/0208, H04R27/00
Abstract: Example techniques involve noise-robust acoustic echo cancellation. An example implementation may involve causing one or more speakers of the playback device to play back audio content and while the audio content is playing back, capturing, via the one or more microphones, audio within an acoustic environment that includes the audio playback. The example implementation may involve determining measured and reference signals in the STFT domain. During each nth iteration of an acoustic echo canceller (AEC): the implementation may involve determining a frame of an output signal by generating a frame of a model signal by passing a frame of the reference signal through an instance of an adaptive filter and then redacting the nth frame of the model signal from an nth frame of the measured signal. The implementation may further involve determining an instance of the adaptive filter for a next iteration of the AEC.
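The per-iteration loop described in this abstract (filter the reference frame, remove the resulting model frame from the measured frame, then derive the filter for the next iteration) can be sketched with a textbook normalized LMS update. This is an assumption made for illustration: the abstract does not say which adaptation rule is used, and the noise-robust part is not modelled here. The function name `aec_stft`, the single complex tap per frequency bin, and the parameters `mu` and `eps` are hypothetical.

```python
def aec_stft(measured, reference, mu=0.5, eps=1e-8):
    """Single-tap-per-bin NLMS echo canceller over STFT frames.

    measured, reference: lists of frames; each frame is a list of complex bins.
    Returns the list of output frames (measured minus modelled echo).
    """
    num_bins = len(reference[0])
    weights = [0j] * num_bins              # adaptive filter instance for n = 0
    output = []
    for d_frame, x_frame in zip(measured, reference):
        out_frame = []
        next_weights = []
        for w, d, x in zip(weights, d_frame, x_frame):
            model = w * x                  # nth frame of the model signal
            e = d - model                  # remove the model from the measured frame
            # NLMS step: filter instance for the next iteration
            w_next = w + mu * x.conjugate() * e / (abs(x) ** 2 + eps)
            out_frame.append(e)
            next_weights.append(w_next)
        output.append(out_frame)
        weights = next_weights
    return output


# Usage: the "room" is a fixed complex gain per bin, with no near-end speech.
ref = [[1 + 1j, 2 - 1j] for _ in range(30)]
room = [0.5 - 0.25j, -0.3 + 0.1j]
mic = [[h * x for h, x in zip(room, frame)] for frame in ref]
out = aec_stft(mic, ref)
```

In this toy setup the residual echo magnitude shrinks by roughly a factor of `1 - mu` per frame, so the last output frame is close to zero.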
-
Publication number: US20230186923A1
Publication date: 2023-06-15
Application number: US18060176
Application date: 2022-11-30
Applicant: Sonos, Inc.
Inventor: Aaron Jones, Saeed Bagheri Sereshki, Daniele Giacobello
Abstract: Systems and methods for audio processing include capturing sound data via at least one microphone of a network microphone device (NMD) and determining whether the captured sound includes voice activity. While in a first stage, the NMD forgoes spatial processing of the captured sound data. If the NMD determines that the detected sound includes voice activity, the NMD transitions to a second stage. In this second stage, the NMD spatially processes the detected sound to produce filtered sound data and detects a wake word. After detecting the wake word, the NMD may determine an action to be performed based on the captured sound data.
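The two-stage gating in this abstract can be sketched as a small state machine. The sketch makes several assumptions not found in the source: voice activity is approximated by a frame-energy threshold, spatial processing is replaced by a plain channel average, and the wake-word detector is a caller-supplied function. The names `frame_energy`, `average_channels`, `process_stream`, and `vad_threshold` are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude across all channels of one multi-channel frame."""
    samples = [s for channel in frame for s in channel]
    return sum(s * s for s in samples) / len(samples)


def average_channels(frame):
    """Placeholder spatial processing: average the channels sample by sample."""
    return [sum(column) / len(column) for column in zip(*frame)]


def process_stream(frames, detect_wake_word, vad_threshold=0.01):
    """Stage 1 skips spatial processing; stage 2 filters and checks for a wake word.

    frames: list of frames, each a list of channels, each a list of samples.
    Returns a list of (frame_index, event) tuples.
    """
    stage = 1
    events = []
    for index, frame in enumerate(frames):
        if stage == 1:
            if frame_energy(frame) < vad_threshold:
                events.append((index, "idle"))   # no voice: forgo spatial processing
                continue
            stage = 2                            # voice activity: switch stages
        filtered = average_channels(frame)
        if detect_wake_word(filtered):
            events.append((index, "wake_word"))
        else:
            events.append((index, "filtered"))
    return events


# Usage: one silent frame, one quiet voiced frame, one loud frame (2 channels each).
frames = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[1.0, 1.0], [1.0, 1.0]],
]
events = process_stream(frames, lambda f: max(f) > 0.9)
```

For simplicity the sketch never returns to the first stage once voice activity has been seen; a real device would presumably need some rule for dropping back.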
-
Publication number: US20200321021A1
Publication date: 2020-10-08
Application number: US16907953
Application date: 2020-06-22
Applicant: Sonos, Inc.
Inventor: Saeed Bagheri Sereshki, Daniele Giacobello
Abstract: Systems and methods for suppressing noise and detecting voice input in a multi-channel audio signal captured by two or more network microphone devices include receiving an instruction to process one or more audio signals captured by a first network microphone device and after receiving the instruction (i) disabling at least a first microphone of a plurality of microphones of a second network microphone device, (ii) capturing a first audio signal via a second microphone of the plurality of microphones, (iii) receiving over a network interface of the second network microphone device a second audio signal captured via at least a third microphone of the first network microphone device, (iv) using estimated noise content to suppress first and second noise content in the first and second audio signals, (v) combining the suppressed first and second audio signals into a third audio signal, and (vi) determining that the third audio signal includes a voice input comprising a wake word.
-
Publication number: US10692518B2
Publication date: 2020-06-23
Application number: US16147710
Application date: 2018-09-29
Applicant: Sonos, Inc.
Inventor: Saeed Bagheri Sereshki, Daniele Giacobello
IPC: G10L21/0208, G10L25/84, G10L21/0232, G10L15/22, H04R1/40, H04R3/00, G10L15/08
Abstract: Systems and methods for suppressing noise and detecting voice input in a multi-channel audio signal captured by two or more network microphone devices include receiving an instruction to process one or more audio signals captured by a first network microphone device and after receiving the instruction (i) disabling at least a first microphone of a plurality of microphones of a second network microphone device, (ii) capturing a first audio signal via a second microphone of the plurality of microphones, (iii) receiving over a network interface of the second network microphone device a second audio signal captured via at least a third microphone of the first network microphone device, (iv) using estimated noise content to suppress first and second noise content in the first and second audio signals, (v) combining the suppressed first and second audio signals into a third audio signal, and (vi) determining that the third audio signal includes a voice input comprising a wake word.
-
Publication number: US20190096419A1
Publication date: 2019-03-28
Application number: US15717621
Application date: 2017-09-27
Applicant: Sonos, Inc.
Inventor: Daniele Giacobello
IPC: G10L21/02, G10K11/178, H04R27/00
CPC classification number: G10L21/02, G10K11/178, G10K2210/3012, G10K2210/3028, G10K2210/505, G10L21/0208, G10L21/0232, G10L2021/02087, H04M9/082, H04R3/005, H04R3/12, H04R27/00, H04R29/007, H04R2227/003, H04R2227/005, H04R2420/03, H04R2420/07, H04R2430/23
Abstract: Example techniques involve noise-robust acoustic echo cancellation. An example implementation may involve causing one or more speakers of the playback device to play back audio content and while the audio content is playing back, capturing, via the one or more microphones, audio within an acoustic environment that includes the audio playback. The example implementation may involve determining measured and reference signals in the STFT domain. During each nth iteration of an acoustic echo canceller (AEC): the implementation may involve determining a frame of an output signal by generating a frame of a model signal by passing a frame of the reference signal through an instance of an adaptive filter and then redacting the nth frame of the model signal from an nth frame of the measured signal. The implementation may further involve determining an instance of the adaptive filter for a next iteration of the AEC.
-
Publication number: US12217765B2
Publication date: 2025-02-04
Application number: US18313013
Application date: 2023-05-05
Applicant: Sonos, Inc.
Inventor: Daniele Giacobello
IPC: G10L21/02, G10K11/178, G10L21/0208, G10L21/0232, H04M9/08, H04R3/00, H04R3/12, H04R27/00, H04R29/00
Abstract: Example techniques involve noise-robust acoustic echo cancellation. An example implementation may involve causing one or more speakers of the playback device to play back audio content and while the audio content is playing back, capturing, via the one or more microphones, audio within an acoustic environment that includes the audio playback. The example implementation may involve determining measured and reference signals in the STFT domain. During each nth iteration of an acoustic echo canceller (AEC): the implementation may involve determining a frame of an output signal by generating a frame of a model signal by passing a frame of the reference signal through an instance of an adaptive filter and then redacting the nth frame of the model signal from an nth frame of the measured signal. The implementation may further involve determining an instance of the adaptive filter for a next iteration of the AEC.
-
Publication number: US12217748B2
Publication date: 2025-02-04
Application number: US17532744
Application date: 2021-11-22
Applicant: Sonos, Inc.
Inventor: Klaus Hartung, Daniele Giacobello
Abstract: Disclosed herein are example techniques to identify a voice service to process a voice input. An example implementation may involve a network microphone device (NMD) receiving, via a microphone, voice data indicating a voice input. The NMD may identify, from among multiple voice services registered to a media playback system, a voice service to process the voice input and cause, via a network interface, the identified voice service to process the voice input.
-
Publication number: US11769511B2
Publication date: 2023-09-26
Application number: US18060176
Application date: 2022-11-30
Applicant: Sonos, Inc.
Inventor: Aaron Jones, Saeed Bagheri Sereshki, Daniele Giacobello
Abstract: Systems and methods for audio processing include capturing first sound data via at least one microphone of a network microphone device (NMD) and determining, via a voice activity detection process, that the first sound data does not include voice activity. The first sound data is stored in a buffer, and the NMD forgoes spatial processing of the first sound data. The NMD can capture second sound data and determine, via the voice activity process, that the second sound data includes voice activity. The NMD spatially processes the sound data to produce filtered sound data. The NMD detects a wake word based on data in the buffer. After detecting the wake word, the NMD may determine an action to be performed based on the data in the buffer.
-
Publication number: US11688419B2
Publication date: 2023-06-27
Application number: US18045360
Application date: 2022-10-10
Applicant: Sonos, Inc.
Inventor: Saeed Bagheri Sereshki, Daniele Giacobello
IPC: G10L21/0208, G10L25/84, G10L15/08, G10L15/22, G10L21/0232, H04R1/40, H04R3/00
CPC classification number: G10L25/84, G10L15/08, G10L15/22, G10L21/0232, H04R1/406, H04R3/005, G10L21/0208, G10L2015/088
Abstract: Systems and methods for suppressing noise and detecting voice input in a multi-channel audio signal captured by two or more network microphone devices include receiving an instruction to process one or more audio signals captured by a first network microphone device and after receiving the instruction (i) disabling at least a first microphone of a plurality of microphones of a second network microphone device, (ii) capturing a first audio signal via a second microphone of the plurality of microphones, (iii) receiving over a network interface of the second network microphone device a second audio signal captured via at least a third microphone of the first network microphone device, (iv) using estimated noise content to suppress first and second noise content in the first and second audio signals, (v) combining the suppressed first and second audio signals into a third audio signal, and (vi) determining that the third audio signal includes a voice input comprising a wake word.