-
Publication No.: US20200312315A1
Publication Date: 2020-10-01
Application No.: US16368403
Application Date: 2019-03-28
Applicant: Apple Inc.
Inventor: Feipeng Li, Mehrez Souden, Joshua D. Atkins, John Bridle, Charles P. Clark, Stephen H. Shum, Sachin S. Kajarekar, Haiying Xia, Erik Marchi
IPC: G10L15/20
Abstract: An acoustic environment aware method for selecting a high quality audio stream during multi-stream speech recognition. A number of input audio streams are processed to determine if a voice trigger is detected, and if so a voice trigger score is calculated for each stream. An acoustic environment measurement is also calculated for each audio stream. The trigger score and acoustic environment measurement are combined for each audio stream, to select as a preferred audio stream the audio stream with the highest combined score. The preferred audio stream is output to an automatic speech recognizer. Other aspects are also described and claimed.
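Illustrative sketch (not taken from the patent): the abstract above does not say how the trigger score and the acoustic environment measurement are combined, so the example below assumes a simple weighted sum of two values normalized to [0, 1]. The names `StreamScore`, `select_preferred_stream`, `trigger_weight`, and `trigger_threshold` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StreamScore:
    name: str
    trigger_score: float  # voice-trigger confidence, assumed to lie in [0, 1]
    env_measure: float    # acoustic environment quality, assumed to lie in [0, 1]


def select_preferred_stream(
    streams: Sequence[StreamScore],
    trigger_weight: float = 0.5,     # assumption: linear weighting of the two scores
    trigger_threshold: float = 0.5,  # assumption: a fixed detection threshold
) -> Optional[StreamScore]:
    """Return the stream with the highest combined score, or None if no trigger is detected."""
    # Step 1: a voice trigger must be detected on at least one stream.
    if not any(s.trigger_score >= trigger_threshold for s in streams):
        return None

    # Step 2: combine the two per-stream scores (weighted sum is an assumption).
    def combined(s: StreamScore) -> float:
        return trigger_weight * s.trigger_score + (1.0 - trigger_weight) * s.env_measure

    # Step 3: the preferred stream is the one with the highest combined score.
    return max(streams, key=combined)


streams = [
    StreamScore("raw", trigger_score=0.9, env_measure=0.2),
    StreamScore("beamformed", trigger_score=0.8, env_measure=0.7),
    StreamScore("denoised", trigger_score=0.6, env_measure=0.6),
]
preferred = select_preferred_stream(streams)
print(preferred.name if preferred else "no trigger detected")
```

In this sketch the stream with the strongest trigger score is not necessarily selected: a stream with a slightly weaker trigger score but a better acoustic environment measurement can win, which is the point of combining the two values before forwarding a stream to the speech recognizer.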
-
Publication No.: US11532306B2
Publication Date: 2022-12-20
Application No.: US17111132
Application Date: 2020-12-03
Applicant: Apple Inc.
Inventor: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
IPC: G10L15/22, H04R1/40, G10L15/08, G10L15/04, H04R3/00, G10L15/30, G10L15/18, G10L15/28, G10L21/0216, G10L25/51, H04R27/00
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.
-
Publication No.: US10789041B2
Publication Date: 2020-09-29
Application No.: US14834194
Application Date: 2015-08-24
Applicant: Apple Inc.
Inventor: Yoon Kim, Thomas R. Gruber, John Bridle
Abstract: Systems and processes are disclosed for dynamically adjusting a speech trigger threshold, which can be used in triggering a virtual assistant. Audio input can be received via a microphone. The received audio input can be sampled, and a confidence level can be determined of whether the sampled audio input includes a portion of a spoken trigger. In response to the confidence level exceeding a threshold, a virtual assistant can be triggered to receive a user command from the audio input. The threshold can be dynamically adjusted in response to perceived events (e.g., events indicating a user may be more or less likely to initiate speech interactions, events indicating a trigger may be difficult to detect, events indicating a trigger was missed, etc.), thereby minimizing both missed triggers and false positive triggering events.
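Illustrative sketch (not taken from the patent): the abstract above describes a trigger threshold that moves in response to perceived events, without giving numeric values or an update rule. The example below assumes an additive adjustment clamped to a fixed range. The class `TriggerThreshold`, its default values, and the size of each adjustment are hypothetical.

```python
class TriggerThreshold:
    """Speech-trigger threshold that can be raised or lowered at run time."""

    def __init__(self, base: float = 0.6, lower: float = 0.3, upper: float = 0.9):
        # All three values are assumptions chosen for illustration.
        self.base = base
        self.lower = lower
        self.upper = upper
        self.value = base

    def adjust(self, delta: float) -> None:
        # A negative delta makes triggering easier (e.g. the user is likely to speak);
        # a positive delta makes it harder (e.g. to reduce false positives).
        # The result is clamped so the threshold never leaves [lower, upper].
        self.value = min(self.upper, max(self.lower, self.value + delta))

    def reset(self) -> None:
        # Return to the base threshold once the perceived event has passed.
        self.value = self.base

    def should_trigger(self, confidence: float) -> bool:
        # The assistant is triggered only when the confidence exceeds the threshold.
        return confidence > self.value


threshold = TriggerThreshold()
print(threshold.should_trigger(0.55))  # below the base threshold
threshold.adjust(-0.1)                 # hypothetical event: a trigger was just missed
print(threshold.should_trigger(0.55))  # the same confidence now exceeds the threshold
```

Clamping keeps a long run of events from driving the threshold to an extreme, which is one way to balance missed triggers against false positives; the patent may use a different mechanism.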
-