-
Publication number: US11775617B1
Publication date: 2023-10-03
Application number: US17201358
Application date: 2021-03-15
Applicant: Amazon Technologies, Inc.
Inventor: Ayush Jaiswal, Yue Wu, Pradeep Natarajan, Premkumar Natarajan
IPC: G06K9/00, G06F18/2413, G06F16/53, G06F40/20, G06V10/40, G06F18/22, G06F18/2132
CPC classification number: G06F18/2413, G06F16/53, G06F18/2132, G06F18/22, G06F40/20, G06V10/40
Abstract: Devices and techniques are generally described for class-agnostic object detection. In some examples, a first frame of image data comprising a first plurality of pixels may be received. First class-agnostic feature data representing the first plurality of pixels may be generated. A first object detection component may be used to determine that the first plurality of pixels corresponds to an arbitrary object represented in the first frame of image data based at least in part on the first class-agnostic feature data. Class-agnostic data indicating that the first plurality of pixels in the first frame of image data corresponds to the arbitrary object may be generated.
-
Publication number: US12112752B1
Publication date: 2024-10-08
Application number: US17688279
Application date: 2022-03-07
Applicant: Amazon Technologies, Inc.
Inventor: Rahul Gupta, Jwala Dhamala, Apurv Verma, Qingwen Ye, Mayur Himmatbhai Dabhi, Srinivasan Rengarajan Veeravanallur, Spyridon Matsoukas, Melanie C B Gens, Seyed Omid Razavi, Avni Khatri, Premkumar Natarajan
CPC classification number: G10L15/22, G10L15/01, G10L15/063, G10L15/08, G10L2015/0631, G10L2015/223
Abstract: Devices and techniques are generally described for cohort determination in natural language processing. In various examples, a first natural language input to a natural language processing system may be determined. The first natural language input may be associated with a first account identifier. A first machine learning model may determine first data representing one or more words of the first natural language input. A second machine learning model may determine second data representing one or more acoustic characteristics of the first natural language input. Third data may be determined, the third data including a predicted performance for processing the first natural language input by the natural language processing system. The third data may be determined based on the first data and the second data.
-
Publication number: US11574637B1
Publication date: 2023-02-07
Application number: US17014042
Application date: 2020-09-08
Applicant: Amazon Technologies, Inc.
Inventor: Anoop Kumar, Anil K Ramakrishna, Sriram Venkatapathy, Rahul Gupta, Sankaranarayanan Ananthakrishnan, Premkumar Natarajan
Abstract: Techniques for using a federated learning framework to update machine learning models for a spoken language understanding (SLU) system are described. The system determines which labeled data is needed to update the models based on the models generating an undesired response to an input. The system identifies users to solicit labeled data from, and sends a request to a user device to speak an input. The device generates labeled data using the spoken input, and updates the on-device models using the spoken input and the labeled data. The updated model data is provided to the system to enable the system to update the system-level (global) models.
-
Publication number: US20220093093A1
Publication date: 2022-03-24
Application number: US17112227
Application date: 2020-12-04
Applicant: Amazon Technologies, Inc.
Inventor: Prakash Krishnan, Arindam Mandal, Nikko Strom, Pradeep Natarajan, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, David Chi-Wai Tang, Aaron Challenner, Xu Zhang, Krishna Anisetty, Josey Diego Sandoval, Rohit Prasad, Premkumar Natarajan
Abstract: A system can operate a speech-controlled device in a mode where the speech-controlled device determines that an utterance is directed at the speech-controlled device using image data showing the user speaking the utterance. If the user is directing the user's gaze at the speech-controlled device while speaking, the system may determine the utterance is system directed and thus may perform further speech processing based on the utterance. If the user's gaze is directed elsewhere, the system may determine the utterance is not system directed (for example, directed at another user) and thus the system may not perform further speech processing based on the utterance and may take other actions, for example, discarding audio data of the utterance.
-
Publication number: US11978437B1
Publication date: 2024-05-07
Application number: US17119099
Application date: 2020-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Govindarajan Sundaram Thattai, Qing Ping, Feiyang Niu, Joel Joseph Chengottusseriyil, Prashanth Rajagopal, Qiaozi Gao, Aishwarya Naresh Reganti, Gokhan Tur, Dilek Hakkani-Tur, Rohit Prasad, Premkumar Natarajan
CPC classification number: G10L15/1815, G06F16/22, G06F21/6218, G10L15/22, G10L15/30, G10L15/1822, G10L15/183, G10L15/19, G10L2015/223
Abstract: Devices and techniques are generally described for learning personalized concepts for natural language processing. In various examples, a first natural language input may be received. In some examples, a determination may be made that the first natural language input comprises non-actionable slot data. A dialog session may be initiated with the user. In some examples, first slot data that is indicated by the user during the dialog session may be determined. In various examples, data representing the first slot data may be stored in a database in association with the first natural language input.