-
Publication No.: US20240371378A1
Publication Date: 2024-11-07
Application No.: US18777427
Filing Date: 2024-07-18
Applicant: Apple Inc.
Inventor: Saurabh ADYA, Sameer BADASKAR, Akanksha BINDAL, Ahmed S. HUSSEN ABDELAZIZ, Xiaochuan NIU, Alkeshkumar M. PATEL, Srikanth VISHNUBHOTLA
Abstract: Systems and processes for operating a digital assistant are provided. An example method for processing an image includes receiving an image, generating, based on the image, a question corresponding to a first object in the image, generating, based on the image, a caption corresponding to a second object of the image, receiving an utterance from a user, and determining a plurality of speech recognition results from the utterance based on the question and the caption.
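Illustrative sketch (not from the patent): one simple way to let an image-derived question and caption influence speech recognition results is to rerank an n-best list by word overlap with that text. The abstract does not say how the question and caption are used, so the overlap heuristic, the function names `tokenize` and `rescore`, the `boost` weight, and the sample scores are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rescore(hypotheses, question, caption, boost=1.0):
    """Rerank (transcript, base_score) pairs; higher scores are better.

    Each distinct word a transcript shares with the question or caption
    adds `boost` to its score. This heuristic is an assumption, not the
    method described in the patent.
    """
    context = set(tokenize(question)) | set(tokenize(caption))
    rescored = []
    for text, score in hypotheses:
        overlap = len(set(tokenize(text)) & context)
        rescored.append((text, score + boost * overlap))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical n-best list: the acoustically preferred transcript is wrong.
hypotheses = [
    ("what bread is that dog", -4.0),
    ("what breed is that dog", -4.5),
]
ranked = rescore(
    hypotheses,
    question="What breed is this dog?",
    caption="a dog sitting on a beach",
)
print(ranked[0][0])
```

In this toy case the word "breed" appears in the generated question, so the second hypothesis overtakes the first after rescoring.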
-
Publication No.: US20180182376A1
Publication Date: 2018-06-28
Application No.: US15459481
Filing Date: 2017-03-15
Applicant: Apple Inc.
Inventor: Christophe J. VAN GYSEL, Yi SU, Xiaochuan NIU, Ilya OPARIN
Abstract: The present disclosure generally relates to processing speech or text using rank-reduced token representation. In one example process, speech input is received. A sequence of candidate words corresponding to the speech input is determined. The sequence of candidate words includes a current word and one or more previous words. A vector representation of the current word is determined from a set of trained parameters. A number of parameters in the set of trained parameters varies as a function of one or more linguistic characteristics of the current word. Using the vector representation of the current word, a probability of a next word given the current word and the one or more previous words is determined. A text representation of the speech input is displayed based on the determined probability.
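Illustrative sketch (not from the patent): the idea that the number of trained parameters per word can vary with a linguistic characteristic might look like the following, where word frequency is assumed to be that characteristic. Frequent words store a full-size vector, while rare words store a shorter one that a shared projection expands to the common size. The class name, the dimensions, the frequency cutoff, and the random initialization standing in for training are all hypothetical.

```python
import random

FULL_DIM = 8        # size of the final vector for every word
RARE_DIM = 2        # reduced number of stored parameters for rare words (assumed)
FREQ_CUTOFF = 100   # assumed corpus count separating frequent from rare words


class RankReducedEmbedding:
    def __init__(self, counts, seed=0):
        rng = random.Random(seed)
        # Per-word parameters: the vector length depends on word frequency.
        self.table = {}
        for word, count in counts.items():
            dim = FULL_DIM if count >= FREQ_CUTOFF else RARE_DIM
            self.table[word] = [rng.uniform(-1, 1) for _ in range(dim)]
        # Shared RARE_DIM x FULL_DIM projection used to expand rare-word vectors.
        self.projection = [
            [rng.uniform(-1, 1) for _ in range(FULL_DIM)]
            for _ in range(RARE_DIM)
        ]

    def num_parameters(self, word):
        """Number of word-specific parameters stored for this word."""
        return len(self.table[word])

    def vector(self, word):
        """Return a FULL_DIM vector, projecting rare-word vectors upward."""
        stored = self.table[word]
        if len(stored) == FULL_DIM:
            return list(stored)
        return [
            sum(stored[i] * self.projection[i][j] for i in range(RARE_DIM))
            for j in range(FULL_DIM)
        ]


emb = RankReducedEmbedding({"the": 50000, "zymurgy": 3})
print(emb.num_parameters("the"), emb.num_parameters("zymurgy"))
print(len(emb.vector("the")), len(emb.vector("zymurgy")))
```

Both words yield vectors of the same length, so a downstream language model could consume them uniformly when estimating the probability of the next word, while the rare word contributes fewer word-specific parameters.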
-