-
Publication No.: US20170161268A1
Publication date: 2017-06-08
Application No.: US15383986
Application date: 2016-12-19
Applicant: Apple Inc.
Inventor: Sameer BADASKAR
CPC classification number: G06F17/3005 , G06F17/30023 , G06F17/30026 , G06F17/30038 , G06F17/30265 , G06F17/30684 , G10L15/22 , G10L15/26 , G10L15/265
Abstract: Methods and systems for searching for media items using a voice-based digital assistant are described. Natural language text strings corresponding to search queries are provided. The search queries include query terms. The text strings may correspond to speech inputs input by a user into an electronic device. At least one information source is searched to identify at least one parameter associated with at least one of the query terms. The parameters include at least one of a time parameter, a date parameter, or a geo-code parameter. The parameters are compared to tags of media items to identify matches. In some implementations, media items whose tags match the parameter are presented to the user.
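The flow in this abstract (query term, then parameter lookup, then tag comparison) can be illustrated with a minimal sketch. It is not the patented implementation: the `MediaItem` class, the `CALENDAR` dictionary standing in for an information source, and the sample dates are all hypothetical, and only a date parameter is handled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MediaItem:
    name: str
    date_tag: date  # date tag stored with the media item


# Hypothetical information source (for example, a calendar) that maps
# a query term to a date parameter. The entries are invented.
CALENDAR = {
    "birthday": date(2016, 5, 14),
    "graduation": date(2016, 6, 3),
}


def find_media(query, items, source):
    """Return items whose date tag matches a parameter found for a query term."""
    terms = query.lower().split()
    # Look up each query term in the information source.
    params = {source[term] for term in terms if term in source}
    # Compare the parameters to each item's tag.
    return [item for item in items if item.date_tag in params]


photos = [
    MediaItem("cake.jpg", date(2016, 5, 14)),
    MediaItem("beach.jpg", date(2016, 7, 2)),
]
matches = find_media("show photos from my birthday", photos, CALENDAR)
print([item.name for item in matches])
```

A time or geo-code parameter could be handled the same way by adding a matching tag field and comparison, though the abstract does not say how matches are scored.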
-
Publication No.: US20240371378A1
Publication date: 2024-11-07
Application No.: US18777427
Application date: 2024-07-18
Applicant: Apple Inc.
Inventor: Saurabh ADYA , Sameer BADASKAR , Akanksha BINDAL , Ahmed S. HUSSEN ABDELAZIZ , Xiaochuan NIU , Alkeshkumar M. PATEL , Srikanth VISHNUBHOTLA
Abstract: Systems and processes for operating a digital assistant are provided. An example method for processing an image includes receiving an image, generating, based on the image, a question corresponding to a first object in the image, generating, based on the image, a caption corresponding to a second object of the image, receiving an utterance from a user, and determining a plurality of speech recognition results from the utterance based on the question and the caption.
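One way to picture recognition results that depend on a generated question and caption is to rerank candidate transcriptions by word overlap with that text. The sketch below assumes this reranking approach, which the abstract does not specify; the `rerank` function, the `boost` weight, and the sample scores are hypothetical.

```python
import re


def words(text):
    """Lowercase a string and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def rerank(hypotheses, context_texts, boost=0.5):
    """Sort (text, base_score) pairs, favoring words found in the context.

    The context is the question and caption generated from the image.
    The additive boost is an assumption made for illustration.
    """
    context = {w for text in context_texts for w in words(text)}

    def score(hypothesis):
        text, base = hypothesis
        overlap = sum(1 for w in words(text) if w in context)
        return base + boost * overlap

    return sorted(hypotheses, key=score, reverse=True)


question = "What kind of dog is this?"
caption = "A red ball on the grass"
candidates = [
    ("what kind of dock is this", -1.0),
    ("what kind of dog is this", -1.2),
]
results = rerank(candidates, [question, caption])
print(results[0][0])
```

The function returns every candidate in ranked order, so the caller still receives a plurality of results rather than a single best guess.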
-
Publication No.: US20240404515A1
Publication date: 2024-12-05
Application No.: US18589118
Application date: 2024-02-27
Applicant: Apple Inc.
Inventor: David L. SALIM , Sameer BADASKAR , Andreas GARKUSCHA
IPC: G10L15/183 , G10L15/06 , G10L15/18 , G10L15/22
Abstract: An example process includes: receiving a first audio input; after receiving the first audio input, receiving a second audio input; displaying word(s) transcribed from the first audio input, where the word(s) are transcribed using a language model; in accordance with a determination that the first audio input satisfies a predetermined condition: generating a textual representation of the first audio input; and updating the language model based on the textual representation; in accordance with a determination, based on the updated language model, that the second audio input includes a valid command for the word(s): executing the valid command to modify the display of the word(s); and in accordance with a determination that the second audio input does not include a valid command for the word(s): forgoing executing, based on the second audio input, a command for the word(s).
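The branching in this abstract can be sketched with a toy model. Here the language model is reduced to a set of known words, the only command is a hypothetical `delete <word>`, and the predetermined condition is passed in as a boolean; none of these choices come from the patent.

```python
def apply_second_input(displayed, vocabulary, first_text, condition_met, second_text):
    """Update a toy language model, then run a command only if it is valid.

    displayed: list of words shown from the first audio input.
    vocabulary: set of words standing in for the language model.
    condition_met: whether the first input satisfied the condition.
    """
    vocabulary = set(vocabulary)
    if condition_met:
        # Update the model with the textual representation of the first input.
        vocabulary |= set(first_text.lower().split())

    parts = second_text.lower().split()
    shown = {w.lower() for w in displayed}
    # A command is valid only if its target is known to the model and displayed.
    is_valid = (
        len(parts) == 2
        and parts[0] == "delete"
        and parts[1] in vocabulary
        and parts[1] in shown
    )
    if is_valid:
        displayed = [w for w in displayed if w.lower() != parts[1]]
    # Otherwise forgo executing any command and leave the display as is.
    return displayed, vocabulary


shown_words = ["meet", "Saoirse", "tomorrow"]
model = {"meet", "tomorrow"}
after, updated = apply_second_input(
    shown_words, model, "meet Saoirse tomorrow", True, "delete saoirse"
)
print(after)
```

With `condition_met` set to `False`, the name never enters the vocabulary, so the same second input is not treated as a valid command and the display is unchanged.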
-