Patent search ap:("Samsung Electronics Co., Ltd.") AND inv:"Sivakumar Balasubramanian"

1.

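Published invention application
SYSTEM AND METHOD FOR COMMAND FULFILLMENT WITHOUT WAKE WORD (status: pending, published)

Publication (announcement) number: US20240029723A1

Publication (announcement) date: 2024-01-25

Application number: US17937198

Filing date: 2022-09-30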

Applicant: Samsung Electronics Co., Ltd.

Inventor: Sivakumar Balasubramanian, Gowtham Srinivasan, Srinivasa Rao Ponakala, Vijendra Raj Apsingekar, Anil Sunder Yadav

IPC: G10L15/197, G10L15/06, G10L15/22

CPC classification number: G10L15/197, G10L15/063, G10L15/22, G10L2015/223

Abstract: A method comprises obtaining an audio input. The method also includes providing at least a portion of the audio input to a frame-level detector model. The method also includes obtaining a first output of the frame-level detector model including frame-level predictions associated with at least the portion of the audio input. The method also includes providing at least one chunked audio frame to a word-level verifier model. The method also includes obtaining a second output of the word-level verifier model including word-level probabilities associated with the at least one chunked audio frame. The method also includes instructing performance of automatic speech recognition on the audio input based on the word-level probabilities associated with the at least one chunked audio frame.
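The two-stage flow in this abstract (frame-level detector, then word-level verifier, then a decision to run speech recognition) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the chunking rule, both thresholds, and the `toy_detector` and `toy_verifier` stand-ins are assumptions introduced here, since the abstract does not specify them.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # hypothetical: one frame of audio features


def chunk_frames(frames: Sequence[Frame], frame_preds: Sequence[float],
                 threshold: float) -> List[List[Frame]]:
    """Group consecutive frames whose frame-level prediction meets the threshold.

    Assumption: the abstract does not say how chunks are formed.
    """
    chunks: List[List[Frame]] = []
    current: List[Frame] = []
    for frame, pred in zip(frames, frame_preds):
        if pred >= threshold:
            current.append(frame)
        elif current:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def should_run_asr(frames: Sequence[Frame],
                   frame_detector: Callable[[Sequence[Frame]], List[float]],
                   word_verifier: Callable[[List[Frame]], float],
                   frame_threshold: float = 0.5,
                   word_threshold: float = 0.8) -> bool:
    """Return True when any chunk's word-level probability clears the threshold."""
    frame_preds = frame_detector(frames)                    # first output
    chunks = chunk_frames(frames, frame_preds, frame_threshold)
    word_probs = [word_verifier(chunk) for chunk in chunks]  # second output
    return any(p >= word_threshold for p in word_probs)


# Toy stand-ins for the two models (hypothetical, for demonstration only).
def toy_detector(frames: Sequence[Frame]) -> List[float]:
    """Score each frame by its mean absolute amplitude, capped at 1.0."""
    return [min(1.0, sum(abs(x) for x in f) / len(f)) for f in frames]


def toy_verifier(chunk: List[Frame]) -> float:
    """Treat longer chunks as more likely to contain a full word."""
    return min(1.0, len(chunk) / 3)
```

Passing the models in as callables keeps the gating logic independent of whichever detector and verifier are actually used.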

2.

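Granted invention patent
System and method for accent-agnostic frame-level wake word detection (status: in force)

Publication (announcement) number: US12272357B2

Publication (announcement) date: 2025-04-08

Application number: US17929280

Filing date: 2022-09-01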

Applicant: Samsung Electronics Co., Ltd.

Inventor: Sivakumar Balasubramanian, Gowtham Srinivasan, Srinivasa Rao Ponakala, Vijendra Raj Apsingekar, Anil Sunder Yadav

IPC: G10L15/22, G10L15/06

Abstract: A method includes accessing, using at least one processor of an electronic device, a machine learning model. The machine learning model is a trained student model that is trained using audio samples in a plurality of accent types. The method also includes receiving, using the at least one processor, an audio input from an audio input device. The method further includes providing, using the at least one processor, the audio input to the trained student model. The method also includes receiving, using the at least one processor, an output from the trained student model including frame-level probabilities associated with the audio input. In addition, the method includes instructing, using the at least one processor, at least one action based on the frame-level probabilities associated with the audio input.
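One way an action could be triggered from frame-level probabilities is to require a short run of consecutive high-probability frames. The sketch below assumes that rule; the function name, the threshold, and the run length are hypothetical and do not come from the patent, which only states that an action is instructed based on the frame-level probabilities.

```python
from typing import Sequence


def wake_word_detected(frame_probs: Sequence[float],
                       threshold: float = 0.6,
                       min_consecutive: int = 3) -> bool:
    """Return True once `min_consecutive` frames in a row reach `threshold`.

    Assumption: a consecutive-run rule is used here only as an example of
    turning per-frame probabilities into a single decision.
    """
    run = 0
    for prob in frame_probs:
        run = run + 1 if prob >= threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

Requiring several frames in a row makes the decision less sensitive to a single spurious high-probability frame.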

3.

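Granted invention patent
Method of generating a trigger word detection model, and an apparatus for the same (status: in force)

Publication (announcement) number: US12236939B2

Publication (announcement) date: 2025-02-25

Application number: US17499072

Filing date: 2021-10-12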

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Sivakumar Balasubramanian, Gowtham Srinivasan, Srinivasa Rao Ponakala, Anil Sunder Yadav, Aditya Jajodia

IPC: G10L15/05, G06N3/096, G10L15/06, G10L15/08, G10L15/16, G10L15/20, G10L15/22, G10L15/30

Abstract: A method of generating a trained trigger word detection model includes training an auxiliary model, based on an auxiliary task, to concentrate on one or more utterances and/or learn context of the one or more utterances using generic single word and/or phrase training data; and obtaining a trigger word detection model by retraining one or more final layers of the auxiliary model, which is weighted based on the auxiliary task, based on a trigger word detection task that detects one or more trigger words. The retraining uses training data specific to the one or more trigger words.
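The retraining step described here (keep the auxiliary model's weights, then retrain only its final layers on trigger-word data) resembles ordinary transfer learning. A minimal framework-free sketch follows, assuming a model is just a list of layers with a `trainable` flag; the `Layer` class, `freeze_all_but_final`, and `sgd_step` are hypothetical names for illustration, and the gradients are supplied by the caller rather than computed.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Layer:
    """Hypothetical layer: a name, a flat weight list, and a trainable flag."""
    name: str
    weights: List[float]
    trainable: bool = True


def freeze_all_but_final(layers: List[Layer], num_final: int = 1) -> List[Layer]:
    """Mark only the last `num_final` layers as trainable."""
    cutoff = len(layers) - num_final
    return [replace(layer, trainable=(i >= cutoff))
            for i, layer in enumerate(layers)]


def sgd_step(layers: List[Layer], grads: List[List[float]],
             lr: float = 0.1) -> List[Layer]:
    """Apply one gradient step, skipping frozen layers."""
    updated = []
    for layer, grad in zip(layers, grads):
        if layer.trainable:
            new_weights = [w - lr * g for w, g in zip(layer.weights, grad)]
            updated.append(replace(layer, weights=new_weights))
        else:
            updated.append(layer)  # frozen: auxiliary-task weights are kept
    return updated
```

In this sketch the frozen layers carry what was learned on the auxiliary task, while only the final layers adapt to the trigger-word data.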

4.

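Published invention application
SYSTEM AND METHOD FOR ACCENT-AGNOSTIC FRAME-LEVEL WAKE WORD DETECTION (status: pending, published)

Publication (announcement) number: US20230368786A1

Publication (announcement) date: 2023-11-16

Application number: US17929280

Filing date: 2022-09-01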

Applicant: Samsung Electronics Co., Ltd.

Inventor: Sivakumar Balasubramanian, Gowtham Srinivasan, Srinivasa Rao Ponakala, Vijendra Raj Apsingekar, Anil Sunder Yadav

IPC: G10L15/22, G10L15/06

CPC classification number: G10L15/22, G10L15/063, G10L2015/223, G10L2015/0631

Abstract: A method includes accessing, using at least one processor of an electronic device, a machine learning model. The machine learning model is a trained student model that is trained using audio samples in a plurality of accent types. The method also includes receiving, using the at least one processor, an audio input from an audio input device. The method further includes providing, using the at least one processor, the audio input to the trained student model. The method also includes receiving, using the at least one processor, an output from the trained student model including frame-level probabilities associated with the audio input. In addition, the method includes instructing, using the at least one processor, at least one action based on the frame-level probabilities associated with the audio input.

5.

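Published invention application
THREE-DIMENSIONAL (3D) SOUND RENDERING WITH MULTI-CHANNEL AUDIO BASED ON MONO AUDIO INPUT (status: pending, published)

Publication (announcement) number: US20240056761A1

Publication (announcement) date: 2024-02-15

Application number: US18335730

Filing date: 2023-06-15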

Applicant: Samsung Electronics Co., Ltd.

Inventor: Vijendra Raj Apsingekar, Akash Sahoo, Anil S. Yadav, Sivakumar Balasubramanian

IPC: H04S7/00, H04S3/00, G06F3/16, G10L19/008

CPC classification number: H04S7/304, H04S3/008, G06F3/165, G10L19/008, H04S2400/11

Abstract: A method includes obtaining video content and associated substantially mono audio content. The method also includes determining at least one of a position or a motion trajectory of each of one or more objects detected in the video content and classifying each of the one or more objects into one of multiple object classes. The method further includes separating audio streams within the audio content based on the video content. Each of the audio streams is associated with one of multiple audio sources. The method also includes classifying each of the audio sources into one of the object classes. In addition, the method includes, for each audio source classified into the same object class as one of the one or more objects, distributing the audio stream associated with that audio source into multiple audio channels based on at least one of the position or the motion trajectory of that object.
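The final step, distributing a separated audio stream across channels according to an object's position or trajectory, can be illustrated with a simplified two-channel case. The sketch below assumes constant-power stereo panning driven by the object's normalized horizontal position in the video frame; this panning law, the two-channel layout, and the function names are assumptions for illustration, as the abstract does not state how the distribution is computed.

```python
import math
from typing import List, Sequence, Tuple


def pan_gains(x_norm: float) -> Tuple[float, float]:
    """Constant-power (left, right) gains for a horizontal position.

    `x_norm` is 0.0 at the left edge of the frame and 1.0 at the right
    edge (assumed convention); values outside that range are clamped.
    """
    x = min(1.0, max(0.0, x_norm))
    angle = x * math.pi / 2
    return math.cos(angle), math.sin(angle)


def distribute_mono(samples: Sequence[float],
                    trajectory: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a mono stream into left/right channels along a trajectory.

    `trajectory` gives one normalized horizontal position per sample.
    """
    left: List[float] = []
    right: List[float] = []
    for sample, x_norm in zip(samples, trajectory):
        gain_l, gain_r = pan_gains(x_norm)
        left.append(sample * gain_l)
        right.append(sample * gain_r)
    return left, right
```

With this law the squared gains always sum to one, so the perceived loudness stays roughly constant as the object moves across the frame.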
