-
Publication No.: US11069336B2
Publication date: 2021-07-20
Application No.: US16048043
Filing date: 2018-07-27
Applicant: Apple Inc.
Inventor: Devang K. Naik
IPC: G10L13/08 , G10L15/187
Abstract: Systems and methods are provided for associating a phonetic pronunciation with a name by receiving the name, mapping the name to a plurality of monosyllabic components that are combinable to construct the phonetic pronunciation of the name, receiving a user input to select one or more of the plurality, and combining the selected one or more of the plurality of monosyllabic components to construct the phonetic pronunciation of the name.
-
Publication No.: US20200327887A1
Publication date: 2020-10-15
Application No.: US16380504
Filing date: 2019-04-10
Applicant: Apple Inc.
Inventor: Sarmad Aziz Malik , Charles P. Clark , Devang K. Naik , Srikanth Vishnubhotla
Abstract: Audio signals produced by microphones can be processed to remove echo and reverberation. The processed signals can be mapped to each other with adaptively estimated impulse responses. One or more of the processed signals, one or more of the mapped signals, and one or more of the impulse responses can be fed to an automatic speech recognizer (ASR) having a deep neural network (DNN), to train the DNN or recognize speech in the input audio signals. Other aspects are described and claimed.
-
Publication No.: US10296160B2
Publication date: 2019-05-21
Application No.: US14099776
Filing date: 2013-12-06
Applicant: Apple Inc.
Inventor: Rushin N. Shah , Devang K. Naik
IPC: G06F17/27 , G06F3/0481 , G06F15/18
Abstract: Systems and processes are disclosed for virtual assistant request recognition using live usage data and data relating to future events. User requests that are received but not recognized can be used to generate candidate request templates. A count can be associated with each candidate request template and can be incremented each time a matching candidate request template is received. When a count reaches a threshold level, the corresponding candidate request template can be used to train a virtual assistant to recognize and respond to similar user requests in the future. In addition, data relating to future events can be mined to extract relevant information that can be used to populate both recognized user request templates and candidate user request templates. Populated user request templates (e.g., whole expected utterances) can then be used to recognize user requests and disambiguate user intent as future events become relevant.
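The counting step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name, the threshold value, and the idea of "training" by adding the template to a set are all assumptions made for illustration.

```python
from collections import Counter


class CandidateTemplateTracker:
    """Counts unrecognized request templates and promotes frequent ones.

    Hypothetical illustration only; names and threshold are assumptions.
    """

    def __init__(self, threshold: int = 3):
        self.threshold = threshold   # assumed promotion threshold
        self.counts = Counter()      # candidate template -> times seen
        self.recognized = set()      # templates the assistant now handles

    def observe(self, template: str) -> bool:
        """Record one request; return True if the template is recognized."""
        if template in self.recognized:
            return True
        self.counts[template] += 1
        if self.counts[template] >= self.threshold:
            # Stand-in for "use the template to train the assistant".
            self.recognized.add(template)
            del self.counts[template]
            return True
        return False


tracker = CandidateTemplateTracker(threshold=3)
results = [tracker.observe("when does <team> play next") for _ in range(4)]
```

Here the first two requests are only counted, the third reaches the assumed threshold and promotes the template, and the fourth is recognized directly.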
-
Publication No.: US10079014B2
Publication date: 2018-09-18
Application No.: US15643741
Filing date: 2017-07-07
Applicant: Apple Inc.
Inventor: Devang K. Naik
IPC: G10L15/00 , G10L15/04 , G10L15/26 , G10L15/06 , G10L15/18 , G10L21/00 , G10L25/00 , G06F17/27 , G06F17/21 , G10L15/187 , G10L15/30 , G10L15/02
CPC classification number: G10L15/187 , G10L15/30 , G10L2015/025 , G10L2015/0633
Abstract: A speech recognition system uses, in one embodiment, an extended phonetic dictionary that is obtained by processing words in a user's set of databases, such as a user's contacts database, with a set of pronunciation guessers. The speech recognition system can use a conventional phonetic dictionary and the extended phonetic dictionary to recognize speech inputs that are user requests to use the contacts database, for example, to make a phone call, etc. The extended phonetic dictionary can be updated in response to changes in the contacts database, and the set of pronunciation guessers can include pronunciation guessers for a plurality of locales, each locale having its own pronunciation guesser.
-
Publication No.: US20170365249A1
Publication date: 2017-12-21
Application No.: US15188861
Filing date: 2016-06-21
Applicant: Apple Inc.
Inventor: Sorin V. Dusan , Devang K. Naik , Sachin S. Kajarekar
IPC: G10L15/05 , G10L21/0208 , G10L15/30 , G10L25/21 , H04R1/10
CPC classification number: G10L15/05 , G10L15/30 , G10L25/21 , G10L25/78 , H04R1/1016 , H04R3/005 , H04R2201/403 , H04R2410/01 , H04R2420/07 , H04R2430/20
Abstract: A method of performing automatic speech recognition (ASR) using end-pointing markers generated using an accelerometer-based voice activity detector starts with a voice activity detector (VAD) generating an accelerometer VAD output (VADa) based on data output by at least one accelerometer that is included in at least one earbud. The at least one accelerometer detects vibration of the user's vocal cords. A voice processor detects a speech signal based on acoustic signals from at least one microphone. An end-pointer generates the end-pointing markers based on the VADa output, and an ASR engine performs ASR on the speech signal based on the end-pointing markers. Other embodiments are also described.
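One way to picture the end-pointer is as a pass over a per-frame VADa sequence that emits a start and end time for each run of voiced frames. The sketch below assumes a binary per-frame VADa output and a fixed frame length; the function name, the 10 ms frame size, and the absence of any smoothing or hangover logic are assumptions, not details from the abstract.

```python
from typing import List, Sequence, Tuple


def end_point_markers(vad_a: Sequence[int], frame_ms: int = 10) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) markers for each run of active VADa frames.

    Hypothetical helper: assumes vad_a holds one 0/1 decision per frame.
    """
    markers = []
    start = None  # index of the first frame in the current voiced run
    for i, active in enumerate(vad_a):
        if active and start is None:
            start = i
        elif not active and start is not None:
            markers.append((start * frame_ms, i * frame_ms))
            start = None
    if start is not None:
        # The sequence ended while speech was still active.
        markers.append((start * frame_ms, len(vad_a) * frame_ms))
    return markers


markers = end_point_markers([0, 1, 1, 1, 0, 0, 1, 1], frame_ms=10)
```

An ASR engine could then restrict decoding to the audio between each pair of markers.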
-
Publication No.: US09697822B1
Publication date: 2017-07-04
Application No.: US14263869
Filing date: 2014-04-28
Applicant: Apple Inc.
Inventor: Devang K. Naik , Onur E. Tackin
IPC: G10L15/06 , G10L15/065 , G10L15/07
CPC classification number: G10L15/063 , G10L15/065 , G10L15/07 , G10L15/22 , G10L15/26 , G10L15/30 , G10L2015/223
Abstract: A method for updating an adaptive speech recognition model is provided. In some implementations, the method is performed at a communications device including one or more processors and memory storing instructions for execution by the one or more processors. The method includes determining that a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model. The method also includes analyzing an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device and updating the adaptive speech recognition model with training data derived from the call audio signal.
-
Publication No.: US09668121B2
Publication date: 2017-05-30
Application No.: US14835540
Filing date: 2015-08-25
Applicant: Apple Inc.
Inventor: Devang K. Naik , Philippe P. Piernot
Abstract: Techniques for providing reminders based on social interactions between users of electronic devices are described. Social reminders can be set to trigger based on social interactions of users. For example, a user may request to be reminded to discuss a certain discussion topic with a particular phonebook contact, when the user next encounters the contact.
-
Publication No.: US11620999B2
Publication date: 2023-04-04
Application No.: US17123428
Filing date: 2020-12-16
Applicant: Apple Inc.
Inventor: Pranay Dighe , Erik Marchi , Srikanth Vishnubhotla , Sachin Kajarekar , Devang K. Naik
Abstract: An example process includes: receiving an audio stream; determining a plurality of acoustic representations of the audio stream, where each acoustic representation of the plurality of acoustic representations corresponds to a respective frame of the audio stream; obtaining a respective plurality of scores indicating whether each respective frame of the audio stream is directed to an electronic device, where the obtaining includes: determining, using a triggering model operating on the electronic device, for each acoustic representation, a score indicating whether the respective frame of the audio stream is directed to the electronic device; determining, based on the respective plurality of scores, a likelihood that the audio stream is directed to the electronic device; determining whether the likelihood is above or below a threshold; and in response to determining that the likelihood is below the threshold, ceasing to process the audio stream.
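The frame-scoring flow in this abstract can be sketched as follows. The abstract does not say how per-frame scores are combined into a likelihood, so averaging is an assumption here, as are the function names, the 0.5 threshold, and the toy scoring function that stands in for the on-device triggering model.

```python
from typing import Callable, Sequence


def should_keep_processing(
    frames: Sequence[Sequence[float]],
    score_frame: Callable[[Sequence[float]], float],
    threshold: float = 0.5,
) -> bool:
    """Return False when the stream looks undirected and processing should cease.

    Hypothetical sketch: the likelihood is assumed to be the mean frame score.
    """
    if not frames:
        return False
    # One score per acoustic representation (one per frame).
    scores = [score_frame(frame) for frame in frames]
    likelihood = sum(scores) / len(scores)
    return likelihood >= threshold


def toy_score(frame: Sequence[float]) -> float:
    """Placeholder for a triggering model: the mean of the frame's features."""
    return sum(frame) / len(frame)


directed = should_keep_processing([[0.9, 0.8], [0.7, 0.9]], toy_score)
background = should_keep_processing([[0.1, 0.2], [0.0, 0.3]], toy_score)
```

With these toy inputs the first stream scores above the assumed threshold and continues to be processed, while the second falls below it and is dropped.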
-
Publication No.: US11290834B2
Publication date: 2022-03-29
Application No.: US16880249
Filing date: 2020-05-21
Applicant: Apple Inc.
Inventor: Sarmad Aziz Malik , Sreeneel Maddika , Devang K. Naik
Abstract: Systems and processes for operating an intelligent automated assistant are provided. An example process of operating an intelligent automated assistant includes, at an electronic device with one or more processors and memory, receiving audio input, determining a direct-to-reverberant energy ratio based on the audio input, and determining a head pose of a user based on the direct-to-reverberant energy ratio.
-
Publication No.: US10187440B2
Publication date: 2019-01-22
Application No.: US15167898
Filing date: 2016-05-27
Applicant: Apple Inc.
Inventor: Devang K. Naik , Justin G. Binder
Abstract: In some implementations, a user device can personalize a media stream by converting notifications into audio speech data and presenting the audio speech data at locations within the media stream that do not interrupt the enjoyment of the media stream by the user. In some implementations, the user device can receive notifications from various communication services, applications installed on the user device, and/or other sources, determine information describing the notifications, and present the information to the user using the audio speech data. In some implementations, the user device can generate personalized notifications based on the media stream and/or media items selected by the user. The user device can generate personalized notifications based on the user's context (e.g., environment, location, activity, etc.). The personalized notifications can then be presented to the user using audio speech data at appropriate locations in the media stream.