-
Publication No.: US20240304179A1
Publication date: 2024-09-12
Application No.: US18596406
Filing date: 2024-03-05
Applicant: Samsung Electronics Co., Ltd.
Inventor: Euisung Kim, Aditya Jajodia, Cindy Sushen Tseng, Divya Neelagiri, Taeyeon Ki, Vijendra Raj Apsingekar
CPC classification number: G10L15/063, G10L25/30
Abstract: A method includes receiving, by an automatic speech recognition (ASR)-based spoken language understanding (SLU) model, an input utterance using an audio input device. The method also includes, for each token of the input utterance, generating, using a shared ASR encoder of the ASR-based SLU model, an acoustic representation of acoustic features of the token (the shared ASR encoder including a first adapter layer); determining, using an ASR decoder of the ASR-based SLU model, a text representation of the token using the acoustic representation and any previous tokens (the ASR decoder including a second adapter layer); combining, using a fusion model of the ASR-based SLU model, the text representation and the acoustic representation to generate a joint representation, and determining, using an SLU decoder of the ASR-based SLU model, a semantic label associated with the token based on the joint representation and any previous semantic labels.
-
Publication No.: US12170079B2
Publication date: 2024-12-17
Application No.: US17444367
Filing date: 2021-08-03
Applicant: Samsung Electronics Co., Ltd.
Inventor: Divya Neelagiri, Taeyeon Ki, Vijendra Raj Apsingekar
Abstract: A method includes training a set of teacher models. Training the set of teacher models includes, for each individual teacher model of the set of teacher models, training the individual teacher model to transcribe unlabeled audio samples and predict a pseudo labeled dataset having multiple labels. At least some of the unlabeled audio samples contain named entity (NE) audio data. At least some of the labels include transcribed NE labels corresponding to the NE audio data. The method also includes correcting at least some of the transcribed NE labels using user-specific NE textual data. The method further includes retraining the set of teacher models based on the pseudo labeled dataset from a selected one of the teacher models, where the selected one of the teacher models predicts the pseudo labeled dataset more accurately than other teacher models of the set of teacher models.
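The abstract above names two steps that a short sketch can make concrete: correcting transcribed NE labels against user-specific NE text, and selecting the most accurate teacher model. The Python below is only an illustration under stated assumptions. The abstract does not say how labels are corrected or how accuracy is measured, so fuzzy string matching via the standard library's `difflib` and a plain accuracy score per teacher are stand-ins, and the names `correct_entity` and `select_teacher` are hypothetical.

```python
from difflib import get_close_matches
from typing import Dict, List


def correct_entity(transcribed: str, user_entities: List[str], cutoff: float = 0.6) -> str:
    """Replace a transcribed NE label with the closest user-specific entity.

    Assumption: fuzzy string similarity stands in for whatever correction
    procedure the patent actually uses. If nothing is similar enough, the
    transcribed label is returned unchanged.
    """
    # Compare case-insensitively, but return the entity in its original casing.
    lookup = {name.lower(): name for name in user_entities}
    matches = get_close_matches(transcribed.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else transcribed


def select_teacher(accuracy_by_teacher: Dict[str, float]) -> str:
    """Return the name of the teacher with the highest accuracy score.

    Assumption: each teacher has already been scored by some metric;
    the abstract does not specify which one.
    """
    if not accuracy_by_teacher:
        raise ValueError("at least one teacher model is required")
    return max(accuracy_by_teacher, key=accuracy_by_teacher.get)


if __name__ == "__main__":
    contacts = ["John Smith", "Jane Doe"]  # hypothetical user-specific NE text
    print(correct_entity("jon smyth", contacts))
    print(select_teacher({"teacher_a": 0.81, "teacher_b": 0.87}))
```

In this sketch, the pseudo labeled dataset of the selected teacher would then be used to retrain the whole set, as the abstract describes.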
-
Publication No.: US20230040181A1
Publication date: 2023-02-09
Application No.: US17444367
Filing date: 2021-08-03
Applicant: Samsung Electronics Co., Ltd.
Inventor: Divya Neelagiri, Taeyeon Ki, Vijendra Raj Apsingekar
Abstract: A method includes training a set of teacher models. Training the set of teacher models includes, for each individual teacher model of the set of teacher models, training the individual teacher model to transcribe unlabeled audio samples and predict a pseudo labeled dataset having multiple labels. At least some of the unlabeled audio samples contain named entity (NE) audio data. At least some of the labels include transcribed NE labels corresponding to the NE audio data. The method also includes correcting at least some of the transcribed NE labels using user-specific NE textual data. The method further includes retraining the set of teacher models based on the pseudo labeled dataset from a selected one of the teacher models, where the selected one of the teacher models predicts the pseudo labeled dataset more accurately than other teacher models of the set of teacher models.
-
Publication No.: US20250078824A1
Publication date: 2025-03-06
Application No.: US18814275
Filing date: 2024-08-23
Applicant: Samsung Electronics Co., Ltd.
Inventor: Euisung Kim, Yun Tang, Taeyeon Ki, Divya Neelagiri, Vijendra Raj Apsingekar
IPC: G10L15/183, G10L15/06
Abstract: A method includes receiving an utterance from an audio input device. The method also includes determining a context associated with the utterance. The method also includes providing the utterance as an input to a joint model for automatic speech recognition (ASR) and spoken language understanding (SLU), wherein the joint model operates in a single mode to perform both ASR and SLU or a dual mode to perform one of ASR or SLU depending on the context. The method also includes using an output of the joint model to perform an action requested in the utterance. The joint model is trained by training a shared encoder and a shared decoder using a text-to-text task and, after training the shared encoder and the shared decoder, training a speech encoder and the shared encoder using a speech self-supervised learning (SSL) learning task and a text-to-text task with a masked prediction loss.
-
Publication No.: US20230419958A1
Publication date: 2023-12-28
Application No.: US17937692
Filing date: 2022-10-03
Applicant: Samsung Electronics Co., Ltd.
Inventor: Divya Neelagiri, Cindy Sushen Tseng, Vijendra Raj Apsingekar
IPC: G10L15/197, G10L15/00, G10L15/22
CPC classification number: G10L15/197, G10L15/005, G10L15/22
Abstract: A method includes obtaining an audio input of a person speaking, where the audio input is captured by an electronic device. The method also includes, for each of multiple language types, (i) determining a first probability that the person is speaking in the language type by applying a trained spoken language identification model to the audio input, (ii) determining at least one second probability that the person is speaking in the language type based on at least one characteristic of the person or the electronic device, and (iii) determining a score for the language type based on a weighted sum of the first and second probabilities. The method further includes identifying the language type associated with a highest score as a spoken language of the person in the audio input.
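The scoring rule in the abstract above (a weighted sum of a model probability and one or more secondary probabilities, followed by picking the highest score) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. The function name `identify_language`, the weight values, and the example probabilities are all assumptions introduced here.

```python
from typing import Dict, List


def identify_language(
    model_probs: Dict[str, float],
    secondary_probs: List[Dict[str, float]],
    model_weight: float,
    secondary_weights: List[float],
) -> str:
    """Return the language type with the highest weighted-sum score.

    model_probs: first probability per language, from a spoken language
        identification model.
    secondary_probs: one dict per user or device characteristic, each
        giving a second probability per language.
    The weights are hypothetical; the abstract does not state their values.
    """
    if len(secondary_probs) != len(secondary_weights):
        raise ValueError("one weight is required per secondary probability source")

    scores: Dict[str, float] = {}
    for lang, p_model in model_probs.items():
        score = model_weight * p_model
        for weight, probs in zip(secondary_weights, secondary_probs):
            # A language missing from a secondary source contributes nothing.
            score += weight * probs.get(lang, 0.0)
        scores[lang] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical numbers: the model slightly prefers English, but a
    # device-locale prior favours Korean strongly enough to change the result.
    model = {"en": 0.5, "ko": 0.4, "es": 0.1}
    locale_prior = {"en": 0.2, "ko": 0.7, "es": 0.1}
    print(identify_language(model, [locale_prior], 0.6, [0.4]))
```

With these example numbers the scores are 0.38 for `en`, 0.52 for `ko`, and 0.10 for `es`, so the secondary probability overrides the model's top choice.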
-
Publication No.: US20250149031A1
Publication date: 2025-05-08
Application No.: US18816659
Filing date: 2024-08-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Aditya Jajodia, Akash Sahoo, Patrick Hegarty, Divya Neelagiri, Vijendra Raj Apsingekar
IPC: G10L15/197, G10L13/02, G10L15/06
Abstract: A method includes identifying, using an automated speech recognition (ASR) system, at least one named entity hypothesis from at least one audio input. The method also can include providing, using the ASR system, the identified at least one named entity to a large language model (LLM). The method also can include generating a prompt using an automated prompt generator. The method also can include processing, using the LLM, the identified at least one named entity hypothesis and the prompt to generate updated named entity recognition data. The method also can include providing the updated named entity recognition data back to the ASR system.
-
Publication No.: US11676062B2
Publication date: 2023-06-13
Application No.: US16293523
Filing date: 2019-03-05
Applicant: Samsung Electronics Co., Ltd.
Inventor: Anil Sunder Yadav, Gurmeet Singh, Divya Neelagiri
CPC classification number: G06N20/00, G06N5/02, G06N5/04, G10L15/1815, G10L15/22, G10L15/32, G10L2015/223
Abstract: A method, an electronic device, and non-transitory machine-readable medium are provided. The method includes receiving, on an electronic device, a request to perform an action. The method also includes deriving an aggregated predicted confidence level using one or more confidence levels. The one or more confidence levels are based on usage information and context of the electronic device. The method further includes determining an execution engine to process the request based on the aggregated predicted confidence level. The method additionally includes providing at least a portion of the request to the execution engine for processing.
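The routing idea in the abstract above (aggregate several confidence levels, then choose an execution engine from the aggregate) might look like the following Python sketch. Everything specific here is an assumption: the abstract does not say how the levels are aggregated or which engines exist, so an arithmetic mean, a fixed threshold, and the engine names `"local"` and `"remote"` are purely illustrative.

```python
from statistics import mean
from typing import Sequence


def aggregate_confidence(levels: Sequence[float]) -> float:
    """Combine one or more confidence levels into a single value.

    Assumption: a plain arithmetic mean; the abstract does not name the
    aggregation function.
    """
    if not levels:
        raise ValueError("at least one confidence level is required")
    return mean(levels)


def choose_engine(levels: Sequence[float], threshold: float = 0.7) -> str:
    """Pick a hypothetical execution engine from the aggregated confidence.

    A high aggregate routes the request to the "local" engine, a low one
    to the "remote" engine. Both names and the threshold are illustrative.
    """
    return "local" if aggregate_confidence(levels) >= threshold else "remote"


if __name__ == "__main__":
    # Hypothetical confidence levels derived from usage information and
    # device context.
    print(choose_engine([0.9, 0.8]))
    print(choose_engine([0.4, 0.6]))
```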
-
Publication No.: US20190279106A1
Publication date: 2019-09-12
Application No.: US16293523
Filing date: 2019-03-05
Applicant: Samsung Electronics Co., Ltd.
Inventor: Anil Sunder Yadav, Gurmeet Singh, Divya Neelagiri
Abstract: A method, an electronic device, and non-transitory machine-readable medium are provided. The method includes receiving, on an electronic device, a request to perform an action. The method also includes deriving an aggregated predicted confidence level using one or more confidence levels. The one or more confidence levels are based on usage information and context of the electronic device. The method further includes determining an execution engine to process the request based on the aggregated predicted confidence level. The method additionally includes providing at least a portion of the request to the execution engine for processing.