-
Publication No.: US12067976B2
Publication date: 2024-08-20
Application No.: US17489667
Filing date: 2021-09-29
Applicant: Intuit Inc.
Inventor: Byungkyu Kang , Alexander Zhicharevich , Kate Elizabeth Swift-Spong , Zhewen Fan , Elik Sror
IPC: G10L15/18 , G06F16/9538 , G06F40/284 , G06N20/00 , G10L15/06 , G10L15/16 , G10L15/22
CPC classification number: G10L15/18 , G06F16/9538 , G06F40/284 , G06N20/00 , G10L15/063 , G10L15/16 , G10L15/22
Abstract: A method including transcribing, into digital tokens, utterances from a conversation between an agent and a person. The method also includes embedding the digital tokens into an utterances tensor including sequences of the digital tokens. The method also includes obtaining a metadata tensor by encoding metadata related to the utterances into the metadata tensor. The method also includes executing a machine learning model which takes, as input, the utterances tensor and the metadata tensor, and which outputs a predicted source article predicted to be related to the utterances. The method also includes generating an interactive link to the predicted source article.
-
Publication No.: US20240265915A1
Publication date: 2024-08-08
Application No.: US18639988
Filing date: 2024-04-19
Applicant: RAJIV TREHAN
Inventor: RAJIV TREHAN
CPC classification number: G10L15/18 , A63B24/0006 , A63B24/0062 , A63B24/0075 , A63B24/0087 , A63B71/0622 , G06N20/00 , G06V40/23 , G10L15/22 , A63B2024/0009 , A63B2024/0015 , A63B2024/0071 , A63B2024/0096 , A63B2071/0627 , A63B2220/806 , A63B2225/12
Abstract: The disclosure relates to a system and method for AI-assisted activity training. The method includes presenting a plurality of activity categories to a user and receiving a voice-based input from the user. The method uses a Natural Language Processing model to process the received voice-based input to extract a selected activity and an activity attribute. Contemporaneous to receiving the voice-based input, the method presents multimedia content in conformance with the activity and the activity attribute. In response to initiation of the multimedia content, the method detects initiation of user activity performance. The method captures video of the user activity; overlays, by a smart mirror, a pose skeletal model corresponding to the user activity performance over the reflection of the user on the smart mirror; and processes the video using an AI model to extract user performance parameters. Feedback may be generated based on the overlaid pose skeletal model and the differential between the user performance parameters and a target set of performance parameters.
-
Publication No.: US20240257808A1
Publication date: 2024-08-01
Application No.: US18435024
Filing date: 2024-02-07
Applicant: Amazon Technologies, Inc.
Inventor: Robert John Mars
CPC classification number: G10L15/22 , G06F40/47 , G10L13/02 , G10L15/08 , G10L15/18 , G10L15/30 , G10L2015/088 , G10L2015/223
Abstract: A speech-processing system may provide access to one or more virtual assistants via a voice-controlled device. A user may leverage a first virtual assistant to translate a natural language command from a first language into a second language, which the device can forward to a second virtual assistant for processing. The device may receive a command from a user and send input data representing the command to a first speech-processing system representing the first virtual assistant. The device may receive a response in the form of a first natural language output from the first speech-processing system along with an indication that the first natural language output should be directed to a second speech-processing system representing the second virtual assistant. For example, the command may be in the first language, and the first natural language output may be in the second language, which is understandable by the second speech-processing system.
-
Publication No.: US20240252106A1
Publication date: 2024-08-01
Application No.: US18406418
Filing date: 2024-01-08
Applicant: THE CATHOLIC UNIVERSITY OF KOREA INDUSTRY-ACADEMIC COOPERATION FOUNDATION , UNIVERSITY OF SEOUL INDUSTRY COOPERATION FOUNDATION
Inventor: Seung Ho YANG , Young Il KIM , Ha Jin YU
CPC classification number: A61B5/4803 , A61B5/0042 , A61B5/055 , A61B5/4064 , A61B5/7267 , G10L15/063 , G10L15/18 , G10L25/66 , A61B2576/026 , G10L13/02
Abstract: Disclosed is a language area invasion determination apparatus including a memory unit including a language area invasion determination model, and a processor that controls an operation of the language area invasion determination model included in the memory unit. The processor trains the language area invasion determination model by using one or more training utterance data and outputs language area invasion determination data of an examiner by using test utterance data and the trained language area invasion determination model. The training utterance data and the test utterance data include utterance speech data of a speaker.
-
Publication No.: US20240249725A1
Publication date: 2024-07-25
Application No.: US18425465
Filing date: 2024-01-29
Applicant: Amazon Technologies, Inc.
Inventor: Stanislaw Ignacy Pasko
IPC: G10L15/30 , G10L15/18 , H04L67/5683
CPC classification number: G10L15/30 , G10L15/18 , H04L67/5683
Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as a remote ASR result(s) and a remote NLU result(s). The response data from the remote speech processing system may include one or more cacheable status indicators associated with the NLU result(s) and/or remote directive data, which indicate whether the remote NLU result(s) and/or the remote directive data are individually cacheable. A caching component of the speech interface device allows for caching at least some of this cacheable remote speech processing information, and using the cached information locally on the speech interface device when responding to user speech in the future. This allows for responding to user speech, even when the speech interface device is unable to communicate with a remote speech processing system over a wide area network.
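Illustrative sketch (not taken from the patent): one way the per-item cacheable indicators described in this abstract could gate what a device stores locally. The class and field names (`RemoteResponse`, `LocalSpeechCache`, `nlu_cacheable`, `directive_cacheable`) and the choice of the normalized ASR text as the cache key are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class RemoteResponse:
    """Hypothetical shape of the response data from the remote system."""
    utterance_text: str        # remote ASR result
    nlu_result: dict           # remote NLU result
    directive: dict            # remote directive data
    nlu_cacheable: bool        # cacheable status indicator for the NLU result
    directive_cacheable: bool  # cacheable status indicator for the directive


class LocalSpeechCache:
    """Caches only the items the remote system marked as cacheable."""

    def __init__(self) -> None:
        self._nlu: Dict[str, dict] = {}
        self._directives: Dict[str, dict] = {}

    @staticmethod
    def _key(utterance_text: str) -> str:
        # Assumption: the normalized transcript serves as the cache key.
        return utterance_text.strip().lower()

    def store(self, response: RemoteResponse) -> None:
        key = self._key(response.utterance_text)
        # Each item is checked individually, mirroring "individually cacheable".
        if response.nlu_cacheable:
            self._nlu[key] = response.nlu_result
        if response.directive_cacheable:
            self._directives[key] = response.directive

    def lookup(self, utterance_text: str) -> Tuple[Optional[dict], Optional[dict]]:
        """Return (nlu_result, directive) from the cache, or None for each miss."""
        key = self._key(utterance_text)
        return self._nlu.get(key), self._directives.get(key)
```

When the wide area network is unavailable, a device following this sketch would call `lookup` with its local transcript and fall back to local processing for any item that returns `None`.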
-
Publication No.: US20240242710A1
Publication date: 2024-07-18
Application No.: US18302180
Filing date: 2023-04-18
Applicant: VIA TECHNOLOGIES, INC.
Inventor: Jiah-Hui LUO , Jing-Jing GUO
IPC: G10L15/065 , G10L15/16 , G10L15/18
CPC classification number: G10L15/065 , G10L15/16 , G10L15/18
Abstract: A system for updating language models is provided. The system includes a data-storage module, a data-update module, and a model-building module. The data-storage module is used for storing multiple pieces of corpus data that corresponds to multiple categories. The data-update module is used for storing a piece of new corpus data into the data-storage module. The piece of new corpus data corresponds to one of the categories. The model-building module is used for building a plurality of classified language models, and for updating one of the classified language models based on the piece of new corpus data stored in the data-storage module. The classified language model updated corresponds to the category that corresponds to the piece of new corpus data.
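Illustrative sketch (not taken from the patent): the abstract describes storing corpus data by category and updating only the language model whose category matches the new data. The example below uses simple unigram counts as a stand-in for a language model; the class and method names are hypothetical, and a real system would use a far richer model.

```python
from collections import Counter, defaultdict
from typing import Dict, List


class ClassifiedLanguageModels:
    """Keeps one toy unigram model per corpus category."""

    def __init__(self) -> None:
        # Plays the role of the data-storage module: category -> corpus texts.
        self.corpus: Dict[str, List[str]] = defaultdict(list)
        # One "classified language model" per category.
        self.models: Dict[str, Counter] = {}

    @staticmethod
    def _build(texts: List[str]) -> Counter:
        # Stand-in for model building: count lowercase whitespace tokens.
        counts: Counter = Counter()
        for text in texts:
            counts.update(text.lower().split())
        return counts

    def store(self, category: str, text: str) -> None:
        """Store a piece of corpus data without rebuilding any model."""
        self.corpus[category].append(text)

    def build_all(self) -> None:
        """Build a model for every category currently in storage."""
        for category, texts in self.corpus.items():
            self.models[category] = self._build(texts)

    def update(self, category: str, text: str) -> None:
        """Store new corpus data and rebuild only the matching category's model."""
        self.store(category, text)
        self.models[category] = self._build(self.corpus[category])
```

Rebuilding only the affected category keeps the update cost proportional to that category's corpus rather than to all stored data.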
-
Publication No.: US20240221738A1
Publication date: 2024-07-04
Application No.: US18338749
Filing date: 2023-06-21
Applicant: Wispr AI, Inc.
Inventor: Sahaj Garg , Tanay Kothari , Anthony Leonardo
Abstract: The techniques described herein relate to computerized methods and systems for integrating with a knowledge system. In some embodiments, a user interaction system may include a speech input device wearable on a user and configured to receive an electronic signal indicative of a user's speech muscle activation patterns when the user is speaking. In some embodiments, the electronic signal may include EMG data received from an EMG sensor on the speech input device. The system may include at least one processor configured to use a speech model and the electronic signal as input to the speech model to generate a text prompt. The at least one processor may use a knowledge system to take an action or generate a response based on the text prompt. In some embodiments, the system may provide context to the knowledge system.
-
Publication No.: US12008988B2
Publication date: 2024-06-11
Application No.: US17065027
Filing date: 2020-10-07
Applicant: Samsung Electronics Co., Ltd.
Inventor: Hyeontaek Lim , Sejin Kwak , Youngjin Kim
Abstract: An electronic apparatus and a controlling method thereof are provided. The electronic apparatus includes a microphone, a camera, a memory configured to store at least one command, and at least one processor configured to, based on a first user voice being input from a user, provide a response to the first user voice, based on an audio signal including a voice being input while the response to the first user voice is provided, analyze an image captured by the camera and determine whether there is a second user voice uttered by the user in the audio signal, and based on determining that there is the second user voice uttered by the user in the audio signal, stop providing the response to the first user voice and obtain and provide a response to the second user voice.
-
Publication No.: US20240170109A1
Publication date: 2024-05-23
Application No.: US18430067
Filing date: 2024-02-01
Applicant: Ellipsis Health, Inc.
Inventor: Elizabeth Shriberg , Michael Aratow , Mainul Islam , Amir Hossein Harati , Tomasz Rutowski , David Lin , Yang Lu , Farshid Haque , Robert D. Rogers
IPC: G16H10/20 , A61B5/00 , A61B5/16 , G09B19/00 , G10L25/66 , G16H50/20 , G16H50/30 , G06F16/24 , G10L15/18
CPC classification number: G16H10/20 , A61B5/164 , A61B5/165 , A61B5/4803 , G09B19/00 , G10L25/66 , G16H50/20 , G16H50/30 , A61B5/4088 , A61B5/7275 , G06F16/24 , G10L15/18
Abstract: The present disclosure provides systems and methods for assessing a mental state of a subject in a single session or over multiple different sessions, using for example an automated module to present and/or formulate at least one query based in part on one or more target mental states to be assessed. The query may be configured to elicit at least one response from the subject. The query may be transmitted in an audio, visual, and/or textual format to the subject to elicit the response. Data comprising the response from the subject can be received. The data can be processed using one or more individual, joint, or fused models. One or more assessments of the mental state associated with the subject can be generated for the single session, for each of the multiple different sessions, or upon completion of one or more sessions of the multiple different sessions.
-
Publication No.: US11990127B2
Publication date: 2024-05-21
Application No.: US17946203
Filing date: 2022-09-16
Applicant: Amazon Technologies, Inc.
Inventor: Natalia Vladimirovna Mamkina , Naomi Bancroft , Nishant Kumar , Shamitha Somashekar
CPC classification number: G10L15/22 , G06F3/167 , G06F21/32 , G10L15/01 , G10L15/18 , G10L15/26 , G10L17/06
Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance, and sends audio data corresponding thereto to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine user recognition data that satisfies (i.e., meets or exceeds) a most stringent (i.e., highest) of the user recognition confidence threshold data. Thereafter, the server may send data indicating a user associated with the user recognition data to all of the content sources.
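Illustrative sketch (not taken from the patent): the abstract describes checking per-user recognition confidence against the highest threshold reported by any content source. The function name, the dictionary inputs, and the choice to consider only the highest-scoring user are assumptions made for illustration.

```python
from typing import Dict, Optional


def select_recognized_user(
    user_confidences: Dict[str, float],
    source_thresholds: Dict[str, float],
) -> Optional[str]:
    """Return the user whose confidence meets the strictest source threshold.

    user_confidences: hypothetical map of user id -> recognition confidence.
    source_thresholds: hypothetical map of content source -> required confidence.
    """
    if not user_confidences or not source_thresholds:
        return None
    # The most stringent threshold is the highest one any source requires.
    strictest = max(source_thresholds.values())
    # Assumption: only the best-scoring user is considered.
    best_user, best_confidence = max(user_confidences.items(), key=lambda kv: kv[1])
    # "Satisfies" means meets or exceeds.
    return best_user if best_confidence >= strictest else None
```

Because the selected user clears the highest threshold, the same result can be sent to every content source without re-checking each source's own requirement.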
-