Patent search ap:("Apple Inc.") AND inv:"Xiaochuan NIU"

1.

Invention application
USING VISUAL CONTEXT TO IMPROVE A VIRTUAL ASSISTANT (Granted)

Publication (Announcement) No.: US20240371378A1

Publication (Announcement) Date: 2024-11-07

Application No.: US18777427

Application Date: 2024-07-18

Applicant: Apple Inc.

Inventor: Saurabh ADYA, Sameer BADASKAR, Akanksha BINDAL, Ahmed S. HUSSEN ABDELAZIZ, Xiaochuan NIU, Alkeshkumar M. PATEL, Srikanth VISHNUBHOTLA

IPC: G10L15/22, G06F18/214, G06V10/82, G06V20/50, G10L15/06, G10L15/16, G10L15/18, G10L15/24

Abstract: Systems and processes for operating a digital assistant are provided. An example method for processing an image includes receiving an image, generating, based on the image, a question corresponding to a first object in the image, generating, based on the image, a caption corresponding to a second object of the image, receiving an utterance from a user, and determining a plurality of speech recognition results from the utterance based on the question and the caption.
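
Illustrative sketch (not taken from the application): one simple way to let a generated question and caption influence speech recognition results is to re-rank an N-best list, boosting hypotheses that share words with that visual context. The function name `rescore`, the `bonus` weight, and the word-overlap scoring are all assumptions made for illustration; the abstract does not say how the question and caption are actually used.

```python
def rescore(hypotheses, question, caption, bonus=0.5):
    """Re-rank (text, score) hypotheses using words from the visual context.

    Hypothetical scoring: each hypothesis word that also appears in the
    generated question or caption adds `bonus` to the recognizer score.
    """
    context = set(question.lower().split()) | set(caption.lower().split())
    rescored = []
    for text, score in hypotheses:
        overlap = sum(1 for word in text.lower().split() if word in context)
        rescored.append((text, score + bonus * overlap))
    # Highest adjusted score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up example: the acoustically preferred hypothesis contains "bread",
# but the image-derived question mentions "breed".
n_best = [("what bread is that dog", -1.0), ("what breed is that dog", -1.2)]
ranked = rescore(n_best,
                 question="what breed is the dog",
                 caption="a dog sitting on a couch")
best_text = ranked[0][0]
```

With these invented scores the hypothesis containing "breed" moves to the top, because it shares one more word with the context than its competitor.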

2.

Invention application
RANK-REDUCED TOKEN REPRESENTATION FOR AUTOMATIC SPEECH RECOGNITION (Under examination, published)

Publication (Announcement) No.: US20180182376A1

Publication (Announcement) Date: 2018-06-28

Application No.: US15459481

Application Date: 2017-03-15

Applicant: Apple Inc.

Inventor: Christophe J. VAN GYSEL, Yi SU, Xiaochuan NIU, Ilya OPARIN

IPC: G10L15/02, G10L15/18, G10L15/06, G10L15/16, G10L15/183, G10L15/14, G10L21/10

Abstract: The present disclosure generally relates to processing speech or text using rank-reduced token representation. In one example process, speech input is received. A sequence of candidate words corresponding to the speech input is determined. The sequence of candidate words includes a current word and one or more previous words. A vector representation of the current word is determined from a set of trained parameters. A number of parameters in the set of trained parameters varies as a function of one or more linguistic characteristics of the current word. Using the vector representation of the current word, a probability of a next word given the current word and the one or more previous words is determined. A text representation of the speech input is displayed based on the determined probability.
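
Illustrative sketch (not taken from the application): the idea of a word vector whose parameter count depends on a property of the word can be shown with an embedding table in which frequent words store a full-length vector, while infrequent words store a short vector that a shared projection expands to full length. Using word frequency as the linguistic characteristic, the class name `RankReducedEmbedding`, the dimensions, and the threshold are all assumptions; random values stand in for trained parameters.

```python
import random

FULL_DIM = 8  # length of every vector handed to the language model
LOW_DIM = 2   # hypothetical reduced size for infrequent words


class RankReducedEmbedding:
    def __init__(self, vocab_counts, freq_threshold=100, seed=0):
        rng = random.Random(seed)
        self.full = {}  # frequent words: FULL_DIM parameters each
        self.low = {}   # infrequent words: LOW_DIM parameters each
        for word, count in vocab_counts.items():
            size = FULL_DIM if count >= freq_threshold else LOW_DIM
            table = self.full if count >= freq_threshold else self.low
            table[word] = [rng.gauss(0, 1) for _ in range(size)]
        # Shared LOW_DIM x FULL_DIM projection used by all infrequent words.
        self.projection = [[rng.gauss(0, 1) for _ in range(FULL_DIM)]
                           for _ in range(LOW_DIM)]

    def num_parameters(self, word):
        """Per-word parameter count (excluding the shared projection)."""
        return FULL_DIM if word in self.full else LOW_DIM

    def vector(self, word):
        """Return a FULL_DIM vector regardless of how the word is stored."""
        if word in self.full:
            return self.full[word]
        low = self.low[word]
        return [sum(low[i] * self.projection[i][j] for i in range(LOW_DIM))
                for j in range(FULL_DIM)]


# Made-up counts: one frequent word and one rare word.
emb = RankReducedEmbedding({"the": 5000, "zyzzyva": 2})
frequent_vec = emb.vector("the")
rare_vec = emb.vector("zyzzyva")
```

Both words yield vectors of the same length, so a downstream next-word model can treat them uniformly, while the rare word contributes fewer word-specific parameters.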
