-
Publication No.: US20180196683A1
Publication Date: 2018-07-12
Application No.: US15863523
Filing Date: 2018-01-05
Applicant: Apple Inc.
Inventors: Carey E. RADEBAUGH, Brandon J. NEWENDORP, Corey J. PETERSON, Rohit DASARI, Trungtin TRAN, Vineet KHOSLA
CPC Classification: G06F9/453, G06F3/0482, G06F3/0488, G06F3/167, G06F16/245, G06F16/3329, G06F16/951, G10L15/1815, G10L15/187, G10L15/265
Abstract: Systems and processes for application integration with a digital assistant are provided. In accordance with one example, a method includes receiving an audio input including a natural-language user input and identifying an intent object of a set of intent objects. The intent object may be derived from the natural-language user input. The method further includes identifying a software application associated with the intent object of the set of intent objects, providing the intent object to the software application to cause the software application to perform a task associated with the intent object, receiving a result response indicating whether the task was successfully performed, and providing an output indicating whether the task was performed.
-
Publication No.: US10008197B2
Publication Date: 2018-06-26
Application No.: US15332000
Filing Date: 2016-10-24
Applicant: FUJITSU LIMITED
Inventors: Shoji Hayakawa
IPC Classification: G10L15/16, G10L15/14, G10L15/187, G10L15/02, G10L15/08
CPC Classification: G10L15/02, G10L15/142, G10L15/16, G10L15/187, G10L2015/022, G10L2015/025, G10L2015/088
Abstract: A keyword detector includes a processor configured to calculate a feature vector for each frame from a speech signal, input the feature vector for each frame to a DNN to calculate a first output probability for each triphone according to a sequence of phonemes contained in a predetermined keyword and a second output probability for each monophone, for each of at least one state of an HMM, calculate a first likelihood representing the probability that the predetermined keyword is uttered in the speech signal by applying the first output probability to the HMM, calculate a second likelihood for the most probable phoneme string in the speech signal by applying the second output probability to the HMM, and determine whether the keyword is to be detected on the basis of the first likelihood and the second likelihood.
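The abstract above states only that detection is decided "on the basis of" the two likelihoods; it does not give the rule. As an illustration, one common way to combine a keyword likelihood with a free-phoneme likelihood is to threshold their log-likelihood difference. The sketch below assumes that rule; the function name `detect_keyword`, the log-domain inputs, and the default threshold are all hypothetical and are not taken from the patent.

```python
def detect_keyword(keyword_log_likelihood: float,
                   best_phoneme_log_likelihood: float,
                   threshold: float = -5.0) -> bool:
    """Hypothetical decision rule (an assumption, not the patented method).

    keyword_log_likelihood: log-likelihood that the keyword was uttered
        (the "first likelihood" in the abstract).
    best_phoneme_log_likelihood: log-likelihood of the most probable
        phoneme string (the "second likelihood" in the abstract).
    threshold: assumed tuning parameter; the keyword is accepted when the
        keyword path is not much worse than the unconstrained path.
    """
    difference = keyword_log_likelihood - best_phoneme_log_likelihood
    return difference >= threshold


# The keyword path scores close to the unconstrained path: accept.
accepted = detect_keyword(-100.0, -98.0)
# The keyword path scores far below the unconstrained path: reject.
rejected = detect_keyword(-120.0, -98.0)
```

Normalizing against the unconstrained phoneme path makes the decision less sensitive to overall signal quality than thresholding the keyword likelihood alone, which is why this form is often used as a sketch.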
-
Publication No.: US09972342B2
Publication Date: 2018-05-15
Application No.: US15352641
Filing Date: 2016-11-16
Inventors: Hiroshi Furuta, Eiiti Hosono
IPC Classification: G10L25/72, G10L15/187, G10L15/30, G10L25/51
CPC Classification: G10L25/72, G10L15/187, G10L15/30, G10L25/51
Abstract: A reception unit receives a speech signal from another terminal device. A reproduction unit reproduces the speech signal received in the reception unit. A processing unit performs a speech recognition process on the speech signal reproduced in the reception unit, based on a speech recognition model of a user using the terminal device. A transmission unit transmits a result of the speech recognition process in the processing unit to another terminal device.
-
Publication No.: US09922643B2
Publication Date: 2018-03-20
Application No.: US14580331
Filing Date: 2014-12-23
Applicant: NICE-SYSTEMS LTD
Inventors: Maor Nissan, Ronny Bretter
CPC Classification: G10L15/063, G10L15/187, G10L2015/025, G10L2015/0638
Abstract: A method for adapting a phonetic dictionary for peculiarities of a speech of an at least one speaker, comprising generating search pronunciations for a search term, retrieving audio sections from an audio database for each search pronunciation, audibly presenting to a person the audio sections of the speech of the at least one speaker, and updating the phonetic dictionary based on acceptability of the audio sections determined from judgments by the person regarding intelligibility of the audio sections in audibly pronouncing the provided at least one word, wherein the method is performed on an at least one computerized apparatus configured to perform the method.
-
Publication No.: US20180068653A1
Publication Date: 2018-03-08
Application No.: US15260021
Filing Date: 2016-09-08
Applicant: Intel IP Corporation
IPC Classification: G10L15/14, G10L15/02, G10L25/84, G10L15/187
CPC Classification: G10L15/142, G10L15/02, G10L15/08, G10L15/14, G10L15/16, G10L15/187, G10L15/22, G10L25/84, G10L2015/025
Abstract: A system, article, and method include techniques of automatic speech recognition using posterior confidence scores.
-
Publication No.: US09911409B2
Publication Date: 2018-03-06
Application No.: US15216121
Filing Date: 2016-07-21
Inventors: Seokjin Hong
CPC Classification: G10L15/063, G10L15/16, G10L15/187, G10L15/19, G10L2015/027
Abstract: A speech recognition apparatus includes a processor configured to recognize a user's speech using any one or combination of two or more of an acoustic model, a pronunciation dictionary including primitive words, and a language model including primitive words; and correct word spacing in a result of speech recognition based on a word-spacing model.
-
Publication No.: US20180053510A1
Publication Date: 2018-02-22
Application No.: US15557897
Filing Date: 2016-03-11
Applicant: TRINT LIMITED
Inventors: JEFFREY KOFMAN, MARK BOAS, MARK PANAGHISTON, LAURIAN GRIDINOC
IPC Classification: G10L15/26, G06F17/30, G06F17/27, G10L15/187
CPC Classification: G10L15/265, G06F16/685, G06F17/2765, G10L15/187, G11B27/031, G11B27/322, G11B27/34
Abstract: A media generating and editing system that generates audio playback in alignment with text that has been automatically transcribed from the audio. A transcript data file that includes a plurality of text words transcribed from audio words included in the audio data is stored. Timing data is paired with the text words indicating locations in the audio data of the corresponding audio words from which the text words are transcribed. The audio data is provided for playback at a user device. The text words are displayed on a display screen at a user device and a visual marker is displayed on the display screen to indicate the text words on the display screen in time alignment with the audio playback of the corresponding audio words at the user device. The text words in the transcript data file are amended in response to inputs from the user device.
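The pairing of timing data with transcript words described above can be illustrated with a small data structure. This is a minimal sketch under assumptions: the `TimedWord` record, the seconds-based `start`/`end` fields, and the `word_index_at` helper are hypothetical names chosen for illustration, and the patent does not specify a storage format or lookup method.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedWord:
    text: str     # transcribed word; may be amended by the user
    start: float  # assumed: start time of the audio word, in seconds
    end: float    # assumed: end time of the audio word, in seconds


def word_index_at(words: List[TimedWord], playback_time: float) -> Optional[int]:
    """Return the index of the word being spoken at playback_time.

    Assumes words are sorted by start time and do not overlap.
    Returns None when playback_time falls in a gap between words.
    """
    starts = [w.start for w in words]
    i = bisect_right(starts, playback_time) - 1
    if i >= 0 and playback_time < words[i].end:
        return i
    return None


transcript = [TimedWord("hello", 0.0, 0.4), TimedWord("world", 0.5, 0.9)]

# The visual marker would highlight the word at the current playback position.
marker = word_index_at(transcript, 0.6)

# Amending a word changes its text but leaves its timing data untouched.
transcript[1].text = "word"
```

Keeping the timing fields separate from the text is what lets an edit to the transcript preserve alignment with the audio.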
-
Publication No.: US09899040B2
Publication Date: 2018-02-20
Application No.: US13662125
Filing Date: 2012-10-26
Applicant: Elwha LLC
IPC Classification: G10L15/06, G10L99/00, G06F17/28, G10L15/065, G10L15/22, G10L15/187
CPC Classification: G10L99/00, G06F17/28, G06F17/289, G10L15/06, G10L15/063, G10L15/065, G10L15/187, G10L2015/227, G10L2015/228
Abstract: Computationally implemented methods and systems include managing adaptation data, wherein the adaptation data is correlated to at least one aspect of speech of a particular party, facilitating transmission of the adaptation data to a target device, in response to an indicator related to a speech-facilitated transaction of a particular party, wherein the adaptation data is correlated to at least one aspect of speech of the particular party, and determining whether to update the adaptation data, said determination at least partly based on a result of at least a portion of the speech-facilitated transaction. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
-
Publication No.: US09899024B1
Publication Date: 2018-02-20
Application No.: US15393770
Filing Date: 2016-12-29
Applicant: Google Inc.
Inventors: Dimitri Kanevsky, Golan Pundak
IPC Classification: G10L15/04, G10L15/00, G10L21/00, G10L17/00, G10L15/22, G10L15/02, G10L15/187, G10L25/63, G09B19/04
CPC Classification: G10L15/22, G09B5/04, G09B19/04, G10L15/02, G10L15/187, G10L25/63, G10L2015/025, G10L2015/223
Abstract: Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic representation of a term spoken by the child, and can determine whether the phonetic representation matches a particular canonical pronunciation of the particular term that is deemed age-appropriate for the child. Upon determining that the particular canonical pronunciation that matches the phonetic representation of the term spoken by the child is not age-appropriate, the speech recognition system can select and implement a variety of remediation strategies for inducing the child to repeat the term using a pronunciation that is considered age-appropriate.
-
Publication No.: US09875738B2
Publication Date: 2018-01-23
Application No.: US15614239
Filing Date: 2017-06-05
Applicant: Google Inc.
IPC Classification: G10L15/00, G10L15/22, G10L15/187, G10L15/01
CPC Classification: G10L15/22, G06F17/271, G06F17/2765, G06F17/30654, G06F17/30663, G06F17/30746, G10L15/01, G10L15/08, G10L15/187, G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice query; generating a first recognition output; receiving a second voice query; determining from a recognition of the second voice query that the second voice query triggers a correction request; using the first recognition output and the second recognition to determine a plurality of candidate corrections; scoring each candidate correction; and generating a corrected recognition output for a particular candidate correction having a score that satisfies a threshold value.
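The final two steps of the abstract above (scoring each candidate correction, then emitting one whose score satisfies a threshold) can be sketched as a simple selection function. This is only an illustration: the abstract does not say how scores are computed or how ties among qualifying candidates are resolved, so the sketch assumes scores are already available, that higher is better, and that the best qualifying candidate is chosen. The name `pick_correction` is hypothetical.

```python
from typing import List, Optional, Tuple


def pick_correction(scored_candidates: List[Tuple[str, float]],
                    threshold: float) -> Optional[str]:
    """Return the highest-scoring candidate whose score meets the threshold.

    scored_candidates: (corrected_text, score) pairs; how the scores are
        produced is outside this sketch.
    threshold: assumed minimum acceptable score (higher is better).
    Returns None when no candidate qualifies, so the caller can keep the
    original recognition output.
    """
    best = max(scored_candidates, key=lambda pair: pair[1], default=None)
    if best is not None and best[1] >= threshold:
        return best[0]
    return None


candidates = [("call Anne", 0.82), ("call Ann", 0.64), ("call and", 0.10)]
chosen = pick_correction(candidates, threshold=0.7)
```

Returning `None` rather than the best-but-unqualified candidate keeps the threshold meaningful: a weak correction is not applied.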