-
Publication No.: US11893999B1
Publication Date: 2024-02-06
Application No.: US16055755
Filing Date: 2018-08-06
Applicant: Amazon Technologies, Inc.
Inventor: Sai Sailesh Kopuri , John Moore , Sundararajan Srinivasan , Aparna Khare , Arindam Mandal , Spyridon Matsoukas , Rohit Prasad
Abstract: Techniques for enrolling a user in a system's user recognition functionality without requiring the user to speak particular speech are described. The system may determine characteristics unique to a user input. The system may generate an implicit voice profile from user inputs having similar characteristics. After an implicit voice profile is generated, the system may receive a user input having speech characteristics similar to those of the implicit voice profile. The system may ask the user if the user wants the system to associate the implicit voice profile with a particular user identifier. If the user responds affirmatively, the system may request an identifier of a user profile (e.g., a user name). In response to receiving the user's name, the system may identify a user profile associated with the name and associate the implicit voice profile with the user profile, thereby converting the implicit voice profile into an explicit voice profile.
-
Publication No.: US20240029730A1
Publication Date: 2024-01-25
Application No.: US18322918
Filing Date: 2023-05-24
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Shiv Naga Prasad Vitaladevuni , Prem Natarajan
IPC: G10L15/22 , H04L67/306 , G10L15/18 , G10L15/06
CPC classification number: G10L15/22 , H04L67/306 , G10L15/1815 , G10L15/063 , G10L2015/223 , G10L2015/088
Abstract: Described are techniques for predicting when data associated with a user input is likely to be selected for deletion. The system may use a trained model to assist with such predictions. The trained model can be configured based on deletions associated with a user profile. An example process can include receiving user input data corresponding to the user profile, and processing the user input data to determine a user command. Based on characteristic data of the user command, the trained model can be used to determine that data corresponding to the user command is likely to be selected for deletion. The trained model can be iteratively updated based on additional user commands, including previously received user commands to delete user input data.
-
Publication No.: US11657804B2
Publication Date: 2023-05-23
Application No.: US17090716
Filing Date: 2020-11-05
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister
CPC classification number: G10L15/18 , G10L15/08 , G10L15/30 , G10L2015/088
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
-
Publication No.: US11527237B1
Publication Date: 2022-12-13
Application No.: US17024959
Filing Date: 2020-09-18
Applicant: Amazon Technologies, Inc.
Inventor: Ruhi Sarikaya , Hung Tuan Pham , Savas Parastatidis , Dean Curtis , Pushpendre Rastogi , Nitin Ashok Jain , John Arland Nave , Abhinav Sethy , Arpit Gupta , Mayank Kumar , Nakul Dahiwade , Arshdeep Singh , Nikhil Reddy Kortha , Rohit Prasad
IPC: G10L13/00 , G10L15/16 , G06F16/9032 , G10L13/08
Abstract: Techniques for recommending a skill experience to a user after a user-system dialog session has ended are described. Upon a dialog session ending, the system uses a first machine learning model to determine potential intents to recommend to a user. The system then uses a second machine learning model to determine a particular skill and intent to recommend. The system then prompts the user to accept the recommended skill and intent. If the user accepts, the system calls the recommended skill to execute. As part of calling the skill, the system sends to the skill at least one entity provided in a natural language user input of the ended dialog session. This enables the skill to skip welcome prompts, and initiate processing to output a response based on the intent and the at least one entity of the ended dialog session.
-
Publication No.: US11496582B2
Publication Date: 2022-11-08
Application No.: US16455604
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow , Tony Hardie , Rohit Prasad
Abstract: Systems, methods, and devices for computer-generating responses and sending responses to communications when the recipient of the communication is unavailable are disclosed. An individual may send a message (either audio or text) to a recipient. The recipient may be unavailable to contemporaneously respond to the message (e.g., the recipient may be performing an action that makes it difficult or impractical for the recipient to contemporaneously respond to the audio message). When the recipient is unavailable, a response to the message is generated and sent without receiving an instruction from the recipient to do so. The response may be sent to the message originating individual, and content of the response may thereafter be sent to the recipient to receive feedback regarding the correctness of the response. Alternatively, the response content may first be sent to the recipient to receive the feedback, and thereafter the response may be sent to the message originating individual.
-
Publication No.: US11200885B1
Publication Date: 2021-12-14
Application No.: US16219228
Filing Date: 2018-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Arindam Mandal , Nikko Strom , Angeliki Metallinou , Tagyoung Chung , Dilek Hakkani-Tur , Suranjit Adhikari , Sridhar Yadav Manoharan , Ankita De , Qing Liu , Raefer Christopher Gabriel , Rohit Prasad
IPC: G10L15/22 , G10L21/00 , G10L15/06 , G10L15/18 , G06F16/332
Abstract: A dialog manager receives text data corresponding to a dialog with a user. Entities represented in the text data are identified. Context data relating to the dialog is maintained, which may include prior dialog, prior API calls, user profile information, or other data. Using the text data and the context data, an N-best list of one or more dialog models is selected to process the text data. After processing the text data, the outputs of the N-best models are ranked and a top-scoring output is selected. The top-scoring output may be an API call and/or an audio prompt.
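Illustrative note: the select-then-rank flow in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The names `select_n_best` and `run_dialog_turn`, the precomputed `relevance` scores, and the convention that each model returns a `(score, output)` pair are all hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model interface: takes the text and context, returns (score, output).
Model = Callable[[str, dict], Tuple[float, str]]


def select_n_best(relevance: Dict[str, float], n: int) -> List[str]:
    """Return the names of the n models with the highest relevance score."""
    return sorted(relevance, key=relevance.get, reverse=True)[:n]


def run_dialog_turn(
    text: str,
    context: dict,
    models: Dict[str, Model],
    relevance: Dict[str, float],
    n: int = 2,
) -> Tuple[float, str, str]:
    """Run the n selected models and return the top-scoring (score, name, output).

    Assumes n >= 1 and that relevance has at least one entry.
    """
    candidates = []
    for name in select_n_best(relevance, n):
        score, output = models[name](text, context)
        candidates.append((score, name, output))
    # Rank the outputs of the selected models by their own scores.
    return max(candidates, key=lambda c: c[0])


# Toy models; the outputs stand in for an API call or an audio prompt.
models = {
    "weather": lambda text, ctx: (0.8, "API:get_weather"),
    "music": lambda text, ctx: (0.3, "PROMPT:Which song?"),
    "timer": lambda text, ctx: (0.99, "API:set_timer"),
}
# Assumed to come from an upstream selector using the text and context data.
relevance = {"weather": 0.9, "music": 0.6, "timer": 0.1}

best = run_dialog_turn("will it rain today", {"prior_calls": []}, models, relevance)
```

In this toy run the "timer" model is never executed because it falls outside the N-best selection, even though it would have produced the highest output score.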
-
Publication No.: US10049656B1
Publication Date: 2018-08-14
Application No.: US14033346
Filing Date: 2013-09-20
Applicant: Amazon Technologies, Inc.
Inventor: William Folwell Barton , Rohit Prasad , Stephen Frederick Potter , Nikko Strom , Yuzo Watanabe , Madan Mohan Rao Jampani , Ariya Rastrow , Arushan Rajasekaram
Abstract: Features are disclosed for generating predictive personal natural language processing models based on user-specific profile information. The predictive personal models can provide broader coverage of the various terms, named entities, and/or intents of an utterance by the user than a personal model, while providing better accuracy than a general model. Profile information may be obtained from various data sources. Predictions regarding the content or subject of future user utterances may be made from the profile information. Predictive personal models may be generated based on the predictions. Future user utterances may be processed using the predictive personal models.
-
Publication No.: US11908467B1
Publication Date: 2024-02-20
Application No.: US17000886
Filing Date: 2020-08-24
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Anna Santos , David Sanchez , Jared Strawderman , Sarah Castle , Kerry Hammil , Christopher Schindler , Timothy Twerdahl , Joseph Tavares , Bartosz Gulik
IPC: G10L21/00 , G10L25/00 , G10L15/22 , H04N21/422 , H04N21/478 , H04N21/482
CPC classification number: G10L15/22 , H04N21/42225 , H04N21/478 , H04N21/4828 , G10L2015/223
Abstract: Systems, methods, and computer-readable media are disclosed for dynamic voice search transitioning. Example methods may include receiving, by a computer system in communication with a display, a first incoming voice data indication, initiating a first user interface theme for presentation at a display, wherein the first user interface theme is a default user interface theme, and receiving first voice data. Example methods may include sending the first voice data to a remote server for processing, receiving an indication from the remote server to initiate a second user interface theme, and initiating the second user interface theme for presentation at the display.
-
Publication No.: US20210304774A1
Publication Date: 2021-09-30
Application No.: US17228950
Filing Date: 2021-04-13
Applicant: Amazon Technologies, Inc.
Inventor: Sundararajan Srinivasan , Arindam Mandal , Krishna Subramanian , Spyridon Matsoukas , Aparna Khare , Rohit Prasad
Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
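Illustrative note: the last step of the abstract above (folding a matching cluster into an existing voice profile) might look roughly like the sketch below. It is not the patented method. It assumes that a voice profile and each stored utterance are fixed-length feature vectors, that "substantially similar" means cosine similarity at or above a chosen threshold, and that the update is a weighted average. The function names, `threshold`, and `weight` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def update_profile(
    profile: Vector,
    cluster: List[Vector],
    threshold: float = 0.9,  # assumed similarity cutoff
    weight: float = 0.5,     # assumed share given to the new cluster
) -> Vector:
    """Blend the cluster centroid into the profile if they are similar enough."""
    center = centroid(cluster)
    if cosine(profile, center) < threshold:
        return profile  # not similar enough: leave the profile unchanged
    return [(1 - weight) * p + weight * c for p, c in zip(profile, center)]


profile = [1.0, 0.0]
updated = update_profile(profile, [[0.9, 0.1], [1.0, 0.0]])
unchanged = update_profile(profile, [[0.0, 1.0]])
```

The clustering of stored audio itself is left out here; the sketch starts from a cluster that has already been identified.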
-
Publication No.: US20210134276A1
Publication Date: 2021-05-06
Application No.: US17090716
Filing Date: 2020-11-05
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.