-
Publication No.: US10431204B2
Publication Date: 2019-10-01
Application No.: US15803584
Filing Date: 2017-11-03
Applicant: Apple Inc.
Inventor: Matthias Paulik, Gunnar Evermann, Laurence S. Gillick
IPC: G06F17/27, G10L15/06, G10L15/02, G10L25/33, G10L15/197, G06F16/9535, G10L15/18, G10L15/183
Abstract: Systems and processes are disclosed for discovering trending terms in automatic speech recognition. Candidate terms (e.g., words, phrases, etc.) not yet found in a speech recognizer vocabulary or having low language model probability can be identified based on trending usage in a variety of electronic data sources (e.g., social network feeds, news sources, search queries, etc.). When candidate terms are identified, archives of live or recent speech traffic can be searched to determine whether users are uttering the candidate terms in dictation or speech requests. Such searching can be done using open vocabulary spoken term detection to find phonetic matches in the audio archives. As the candidate terms are found in the speech traffic, notifications can be generated that identify the candidate terms, provide relevant usage statistics, identify the context in which the terms are used, and the like.
-
Publication No.: US10410637B2
Publication Date: 2019-09-10
Application No.: US15713276
Filing Date: 2017-09-22
Applicant: Apple Inc.
Inventor: Matthias Paulik, Henry G. Mason, Jason A. Skinder
Abstract: Systems and processes for providing user-specific acoustic models are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a plurality of speech inputs, each of the speech inputs associated with a same user of the electronic device; providing each of the plurality of speech inputs to a user-independent acoustic model, the user-independent acoustic model providing a plurality of speech results based on the plurality of speech inputs; initiating a user-specific acoustic model on the electronic device; and adjusting the user-specific acoustic model based on the plurality of speech inputs and the plurality of speech results.
-
Publication No.: US10332518B2
Publication Date: 2019-06-25
Application No.: US15677886
Filing Date: 2017-08-15
Applicant: Apple Inc.
Inventor: Ashish Garg, Harry J. Saddler, Shweta Grampurohit, Robert A. Walker, Rushin N. Shah, Matthew S. Seigel, Matthias Paulik
Abstract: Speech recognition is performed on a received utterance to determine a plurality of candidate text representations of the utterance, including a primary text representation and one or more alternative text representations. Natural language processing is performed on the primary text representation to determine a plurality of candidate actionable intents, including a primary actionable intent and one or more alternative actionable intents. A result is determined based on the primary actionable intent. The result is provided to the user. A recognition correction trigger is detected. In response to detecting the recognition correction trigger, a set of alternative intent affordances and a set of alternative text affordances are concurrently displayed.
-
Publication No.: US10255907B2
Publication Date: 2019-04-09
Application No.: US14846650
Filing Date: 2015-09-04
Applicant: Apple Inc.
Inventor: Udhyakumar Nallasamy, Sachin S. Kajarekar, Matthias Paulik, Matthew Seigel
Abstract: Systems and processes for automatic accent detection are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a user input, determining a first similarity between a representation of the user input and a first acoustic model of a plurality of acoustic models, and determining a second similarity between the representation of the user input and a second acoustic model of the plurality of acoustic models. The method further includes determining whether the first similarity is greater than the second similarity. In accordance with a determination that the first similarity is greater than the second similarity, the first acoustic model may be selected; and in accordance with a determination that the first similarity is not greater than the second similarity, the second acoustic model may be selected.
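The selection rule in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `AcousticModel` class, its `mean` vector, and the negative squared distance used as a similarity measure are all hypothetical stand-ins, since the abstract does not say how similarity is computed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AcousticModel:
    """Hypothetical stand-in for an accent-specific acoustic model."""
    name: str
    mean: Sequence[float]  # assumed reference feature vector

    def similarity(self, features: Sequence[float]) -> float:
        # Assumption: similarity is the negative squared Euclidean distance,
        # so a larger value means a closer match.
        return -sum((f - m) ** 2 for f, m in zip(features, self.mean))


def select_model(features: Sequence[float],
                 first: AcousticModel,
                 second: AcousticModel) -> AcousticModel:
    """Pick the first model only if it is strictly more similar."""
    if first.similarity(features) > second.similarity(features):
        return first
    # "Not greater than" covers ties, which therefore go to the second model.
    return second
```

As in the abstract, the comparison is strict, so a tie selects the second model.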
-
Publication No.: US10186254B2
Publication Date: 2019-01-22
Application No.: US14846667
Filing Date: 2015-09-04
Applicant: Apple Inc.
Inventor: Shaun E. Williams, Henry G. Mason, Mahesh Krishnamoorthy, Matthias Paulik, Neha Agrawal, Sachin S. Kajarekar, Selen Uguroglu, Ali S. Mohamed
Abstract: The present disclosure generally relates to context-based endpoint detection in user speech input. A method for identifying an endpoint of a spoken request by a user may include receiving user input of natural language speech including one or more words; identifying at least one context associated with the user input; generating a probability, based on the at least one context associated with the user input, that a location in the user input is an endpoint; determining whether the probability is greater than a threshold; and in accordance with a determination that the probability is greater than the threshold, identifying the location in the user input as the endpoint.
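The thresholding step described here can be sketched in Python as follows. This is only an illustration under stated assumptions: the pause-length input, the `CONTEXT_BIAS` table, and the logistic scoring formula are hypothetical, because the abstract does not specify how the context-based probability is generated.

```python
import math
from typing import Optional, Sequence

# Hypothetical per-context bias: a command is assumed to end sooner
# than free-form dictation for the same pause length.
CONTEXT_BIAS = {"dictation": -1.0, "command": 1.0}


def endpoint_probability(pause_ms: float, context: str) -> float:
    """Assumed model: logistic function of pause length plus a context bias."""
    score = pause_ms / 200.0 - 3.0 + CONTEXT_BIAS.get(context, 0.0)
    return 1.0 / (1.0 + math.exp(-score))


def find_endpoint(pauses_ms: Sequence[float],
                  context: str,
                  threshold: float = 0.5) -> Optional[int]:
    """Return the first location whose probability exceeds the threshold."""
    for location, pause in enumerate(pauses_ms):
        if endpoint_probability(pause, context) > threshold:
            return location
    return None  # no location qualified as an endpoint
```

With these assumed numbers, the same 500 ms pause ends a command but not a dictation, which is the kind of context dependence the abstract describes.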
-
Publication No.: US11837237B2
Publication Date: 2023-12-05
Application No.: US18107289
Filing Date: 2023-02-08
Applicant: Apple Inc.
Inventor: Matthias Paulik, Henry G. Mason, Jason A. Skinder
CPC classification number: G10L17/04, G10L15/063, G10L15/07, G10L15/30, G10L15/02, G10L15/187, G10L2015/0635, G10L2015/0636
Abstract: Systems and processes for providing user-specific acoustic models are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a plurality of speech inputs, each of the speech inputs associated with a same user of the electronic device; providing each of the plurality of speech inputs to a user-independent acoustic model, the user-independent acoustic model providing a plurality of speech results based on the plurality of speech inputs; initiating a user-specific acoustic model on the electronic device; and adjusting the user-specific acoustic model based on the plurality of speech inputs and the plurality of speech results.
-
Publication No.: US10755703B2
Publication Date: 2020-08-25
Application No.: US15713503
Filing Date: 2017-09-22
Applicant: Apple Inc.
Inventor: Nicolas Zeitlin, Matthias Paulik, Henry G. Mason, Karric Kwong, Sinan Akay, Saravana Kumar Rathinam, Anumita Biswas
Abstract: Systems and processes for performing a task with a digital assistant are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a natural-language input; determining, based on the natural-language input, a first task and first usefulness score associated with the first task; receiving, from another electronic device, a second task and second usefulness score associated with the second task; determining whether the first usefulness score is higher than the second usefulness score; in accordance with a determination that the first usefulness score is higher than the second usefulness score: performing the first task determined by the electronic device; and providing an output indicating whether the first task has been performed; and in accordance with a determination that the second usefulness score is higher than the first usefulness score: performing the second task received from the another electronic device; and providing an output indicating whether the second task has been performed.
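The comparison of usefulness scores in this abstract can be sketched in Python as below. The `Candidate` type, the `source` labels, and the output strings are hypothetical names chosen for illustration. The abstract does not say what happens when the two scores are equal, so this sketch simply returns `None` in that case as an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    """Hypothetical pairing of a task with its usefulness score."""
    source: str                # e.g. "local" or "remote" (assumed labels)
    task: Callable[[], bool]   # returns True if the task was performed
    usefulness: float


def arbitrate(local: Candidate, remote: Candidate) -> Optional[str]:
    """Perform whichever task has the strictly higher usefulness score."""
    if local.usefulness > remote.usefulness:
        chosen = local
    elif remote.usefulness > local.usefulness:
        chosen = remote
    else:
        return None  # assumption: tie handling is not covered by the abstract
    performed = chosen.task()
    # Output indicating whether the chosen task has been performed.
    return f"{chosen.source} task performed: {performed}"
```

For example, a local candidate scoring 0.8 is performed in preference to a remote candidate scoring 0.3, and the returned string reports whether it succeeded.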
-
Publication No.: US10741181B2
Publication Date: 2020-08-11
Application No.: US16412137
Filing Date: 2019-05-14
Applicant: Apple Inc.
Inventor: Ashish Garg, Harry J. Saddler, Shweta Grampurohit, Robert A. Walker, Rushin N. Shah, Matthew S. Seigel, Matthias Paulik
Abstract: Speech recognition is performed on a received utterance to determine a plurality of candidate text representations of the utterance, including a primary text representation and one or more alternative text representations. Natural language processing is performed on the primary text representation to determine a plurality of candidate actionable intents, including a primary actionable intent and one or more alternative actionable intents. A result is determined based on the primary actionable intent. The result is provided to the user. A recognition correction trigger is detected. In response to detecting the recognition correction trigger, a set of alternative intent affordances and a set of alternative text affordances are concurrently displayed.
-
Publication No.: US09972304B2
Publication Date: 2018-05-15
Application No.: US15266949
Filing Date: 2016-09-15
Applicant: Apple Inc.
Inventor: Matthias Paulik, Henry G. Mason, Matthew S. Seigel
Abstract: Systems and processes for evaluating embedded personalized systems are provided. In one example process, instructions that define an experiment associated with a personalized speech recognition system can be received. The instructions can define one or more experimental parameters. In accordance with the received instructions, a second personalized speech recognition system can be generated based on the personalized speech recognition system and the one or more experimental parameters. Additionally, the plurality of user speech samples can be processed using the second personalized speech recognition system to generate a plurality of speech recognition results and a plurality of accuracy scores corresponding to the plurality of speech recognition results. Second instructions can be received based on the plurality of accuracy scores. In accordance with the second instructions, the second speech recognition system can be activated.
-
Publication No.: US09818400B2
Publication Date: 2017-11-14
Application No.: US14839835
Filing Date: 2015-08-28
Applicant: Apple Inc.
Inventor: Matthias Paulik, Gunnar Evermann, Laurence S. Gillick
CPC classification number: G10L15/063, G06F17/30867, G10L15/02, G10L15/1815, G10L15/183, G10L15/197, G10L25/33
Abstract: Systems and processes are disclosed for discovering trending terms in automatic speech recognition. Candidate terms (e.g., words, phrases, etc.) not yet found in a speech recognizer vocabulary or having low language model probability can be identified based on trending usage in a variety of electronic data sources (e.g., social network feeds, news sources, search queries, etc.). When candidate terms are identified, archives of live or recent speech traffic can be searched to determine whether users are uttering the candidate terms in dictation or speech requests. Such searching can be done using open vocabulary spoken term detection to find phonetic matches in the audio archives. As the candidate terms are found in the speech traffic, notifications can be generated that identify the candidate terms, provide relevant usage statistics, identify the context in which the terms are used, and the like.