-
Publication number: US11676575B2
Publication date: 2023-06-13
Application number: US17386078
Application date: 2021-07-27
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow , Rohit Prasad , Nikko Strom
CPC classification number: G10L15/063 , G10L15/18 , G10L15/30 , G10L2015/0635
Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.
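The difference-and-aggregate step described in this abstract can be illustrated with a minimal sketch. It assumes models are plain dictionaries of named float weights and that the remote system averages the per-device differences. The function names (`model_delta`, `aggregate_deltas`, `apply_delta`) and the averaging rule are illustrative assumptions; the abstract does not say how differences are encoded or combined.

```python
from typing import Dict, List

Model = Dict[str, float]  # hypothetical representation: weight name -> value


def model_delta(original: Model, updated: Model) -> Model:
    """Device side: per-weight difference between the updated and original model."""
    return {name: updated[name] - original[name] for name in original}


def aggregate_deltas(deltas: List[Model]) -> Model:
    """Remote side: average the differences sent by several devices (assumed rule)."""
    if not deltas:
        raise ValueError("no deltas to aggregate")
    names = deltas[0].keys()
    return {name: sum(d[name] for d in deltas) / len(deltas) for name in names}


def apply_delta(model: Model, delta: Model) -> Model:
    """Remote side: produce an improved model from the original and the aggregate."""
    return {name: model[name] + delta[name] for name in model}


# Two devices start from the same original model and update it locally.
original = {"w0": 1.0, "w1": -2.0}
device_a = {"w0": 1.5, "w1": -2.0}
device_b = {"w0": 0.5, "w1": -1.0}

deltas = [model_delta(original, device_a), model_delta(original, device_b)]
improved = apply_delta(original, aggregate_deltas(deltas))
```

Sending only the differences, rather than the full updated model, is the property the abstract emphasizes; the averaging shown here is just one plausible way to combine them.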
-
Publication number: US20220020357A1
Publication date: 2022-01-20
Application number: US17386078
Application date: 2021-07-27
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow , Rohit Prasad , Nikko Strom
Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.
-
Publication number: US11184412B1
Publication date: 2021-11-23
Application number: US16869031
Application date: 2020-05-07
Applicant: Amazon Technologies, Inc.
Inventor: Michael Martin George , Maria Christine Renz , Jeffrey P. Bezos , Gregory Michael Hart , Rohit Prasad , Brian Oliver , Jae Pum Park
Abstract: Described are systems, methods, and apparatus that enable constraint based communications between two or more devices. For example, a first user of a first device may submit a communication request to establish a communication session with a second user and provide a constraint for that communication session, such as a time-limit (e.g., limit the communication session to five minutes). In such an example, if the second user accepts the communication request with the constraint, a communication session is established and the system monitors the communication session to determine when a condition corresponding to the constraint has been satisfied. When the condition is satisfied, the communication session is terminated by the system.
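The time-limit example in this abstract can be sketched as follows. The `Session` class, its fields, and the explicit `now` timestamps are hypothetical and chosen only for illustration; the abstract does not describe an implementation.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical communication session carrying a time-limit constraint."""
    limit_seconds: float
    started_at: float = 0.0
    active: bool = False

    def accept(self, now: float) -> None:
        """The second user accepts the request, so the session is established."""
        self.started_at = now
        self.active = True

    def monitor(self, now: float) -> bool:
        """Terminate the session once the time limit is reached; return whether it is still active."""
        if self.active and now - self.started_at >= self.limit_seconds:
            self.active = False
        return self.active


# A five-minute limit, checked at two minutes and again at five minutes.
session = Session(limit_seconds=300.0)
session.accept(now=0.0)
still_active_midway = session.monitor(now=120.0)
still_active_after_limit = session.monitor(now=300.0)
```

Passing the current time in as an argument keeps the sketch deterministic; a real system would presumably read a clock and could support constraints other than elapsed time.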
-
Publication number: US11004454B1
Publication date: 2021-05-11
Application number: US16182021
Application date: 2018-11-06
Applicant: Amazon Technologies, Inc.
Inventor: Sundararajan Srinivasan , Arindam Mandal , Krishna Subramanian , Spyridon Matsoukas , Aparna Khare , Rohit Prasad
Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
-
Publication number: US10832662B2
Publication date: 2020-11-10
Application number: US15641169
Application date: 2017-07-03
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
-
Publication number: US20200251104A1
Publication date: 2020-08-06
Application number: US16786629
Application date: 2020-02-10
Applicant: Amazon Technologies, Inc.
Inventor: Andrew Smith , Christopher Schindler , Karthik Ramakrishnan , Rohit Prasad , Michael George , Rafal Kuklinski
IPC: G10L15/20 , G10L15/18 , G10L13/10 , G10L13/033
Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
-
Publication number: US20200152195A1
Publication date: 2020-05-14
Application number: US16693826
Application date: 2019-11-25
Applicant: Amazon Technologies, Inc.
Inventor: Ruhi Sarikaya , Rohit Prasad , Kerry Hammil , Spyridon Matsoukas , Nikko Strom , Frédéric Johan Georges Deramat , Stephen Frederick Potter , Young-Bum Kim
IPC: G10L15/22 , G06F40/295 , G10L15/26 , G10L15/08
Abstract: Techniques for limiting natural language processing performed on input data are described. A system receives input data from a device. The input data corresponds to a command to be executed by the system. The system determines applications likely configured to execute the command. The system performs named entity recognition and intent classification with respect to only the applications likely configured to execute the command.
-
Publication number: US10600419B1
Publication date: 2020-03-24
Application number: US15712676
Application date: 2017-09-22
Applicant: Amazon Technologies, Inc.
Inventor: Ruhi Sarikaya , Rohit Prasad , Kerry Hammil , Spyridon Matsoukas , Nikko Strom , Frédéric Johan Georges Deramat , Stephen Frederick Potter , Young-Bum Kim
Abstract: Techniques for performing command processing are described. A system receives, from a device, input data corresponding to a command. The system determines NLU processing results associated with multiple applications. The system also determines NLU confidences for the NLU processing results for each application. The system sends NLU processing results to a portion of the multiple applications, and receives output data or instructions from the portion of the applications. The system ranks the portion of the applications based at least in part on the NLU processing results associated with the portion of the applications as well as the output data or instructions provided by the portion of the applications. The system may also rank the portion of the applications using other data. The system causes content corresponding to output data or instructions provided by the highest ranked application to be output to a user.
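A minimal sketch of the shortlist-then-rank flow in this abstract follows. It assumes each application is a callable that returns output content or `None`, that the "portion" of applications is the top `top_k` by NLU confidence, and that applications returning no output are dropped before ranking by confidence. The names `pick_response`, `Handler`, and `top_k`, and the ranking rule itself, are illustrative assumptions rather than the patented method.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical application handler: takes the NLU result, returns output content or None.
Handler = Callable[[str], Optional[str]]


def pick_response(
    nlu_confidence: Dict[str, float],
    handlers: Dict[str, Handler],
    intent: str,
    top_k: int = 2,
) -> Optional[str]:
    # Send the NLU result only to a portion of the applications (top_k by confidence).
    shortlisted = sorted(nlu_confidence, key=nlu_confidence.get, reverse=True)[:top_k]

    # Collect output from the shortlisted applications, dropping those with none.
    ranked: List[Tuple[float, str, str]] = []
    for app in shortlisted:
        output = handlers[app](intent)
        if output is not None:
            ranked.append((nlu_confidence[app], app, output))

    if not ranked:
        return None
    # Output the content from the highest-ranked remaining application.
    return max(ranked)[2]


confidence = {"music": 0.9, "video": 0.7, "books": 0.2}
handlers: Dict[str, Handler] = {
    "music": lambda intent: None,                    # cannot fulfil the request
    "video": lambda intent: f"Playing video for: {intent}",
    "books": lambda intent: f"Reading book for: {intent}",
}
response = pick_response(confidence, handlers, "play the new trailer")
```

Here the most confident application produces no output, so the response comes from the next one; this is the kind of case where ranking on output data as well as NLU confidence changes the result.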
-
Publication number: US10339925B1
Publication date: 2019-07-02
Application number: US15276316
Application date: 2016-09-26
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow , Tony Hardie , Rohit Prasad
Abstract: Systems, methods, and devices for computer-generating responses and sending responses to communications when the recipient of the communication is unavailable are disclosed. An individual may send a message (either audio or text) to a recipient. The recipient may be unavailable to contemporaneously respond to the message (e.g., the recipient may be performing an action that makes it difficult or impractical for the recipient to contemporaneously respond to the audio message). When the recipient is unavailable, a response to the message is generated and sent without receiving an instruction from the recipient to do so. The response may be sent to the message originating individual, and content of the response may thereafter be sent to the recipient to receive feedback regarding the correctness of the response. Alternatively, the response content may first be sent to the recipient to receive the feedback, and thereafter the response may be sent to the message originating individual.
-
Publication number: US09224061B1
Publication date: 2015-12-29
Application number: US14464365
Application date: 2014-08-20
Applicant: Amazon Technologies, Inc.
Inventor: Pradeep Natarajan , Avnish Sikka , Rohit Prasad
CPC classification number: G06K9/3208 , G06K9/3258 , G06K2209/01
Abstract: A system estimates text orientation in images captured using a handheld camera prior to detecting text in the image. Text orientation is estimated based on edges detected within the image, and the image is rotated based on the estimated orientation. Text detection and processing is then performed on the rotated image. Non-text features along a periphery of the image may be sampled to assure that clutter will not undermine the estimation of orientation.