Patent search cpc:"G10L15/06"

1.

Published application
TRAINING AND USING A TRANSCRIPT GENERATION MODEL ON A MULTI-SPEAKER AUDIO STREAM (Status: Under examination, published)

Publication number: US20240257815A1

Publication date: 2024-08-01

Application number: US18632277

Application date: 2024-04-10

Applicant: Microsoft Technology Licensing, LLC

Inventor: Naoyuki KANDA, Takuya YOSHIOKA, Zhuo CHEN, Jinyu LI, Yashesh GAUR, Zhong MENG, Xiaofei WANG, Xiong XIAO

IPC: G10L17/04, G10L15/06, G10L15/26

CPC classification number: G10L17/04, G10L15/06, G10L15/26

Abstract: The disclosure herein describes using a transcript generation model for generating a transcript from a multi-speaker audio stream. Audio data including overlapping speech of a plurality of speakers is obtained and a set of frame embeddings are generated from audio data frames of obtained audio data using an audio data encoder. A set of words and channel change (CC) symbols are generated from the set of frame embeddings using a transcript generation model. The CC symbols are included between pairs of adjacent words that are spoken by different people at the same time. The set of words and CC symbols are transformed into a plurality of transcript lines, wherein words of the set of words are sorted into transcript lines based on CC symbols, and a multi-speaker transcript is generated based on the plurality of transcript lines. The inclusion of CC symbols by the model enables efficient, accurate multi-speaker transcription.
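The step in which words are sorted into transcript lines based on CC symbols can be illustrated with a minimal sketch. It assumes two virtual channels and assumes that each CC symbol toggles the current channel; neither detail is stated in the abstract. The token spelling `<cc>` and the function name `split_by_channel_change` are hypothetical.

```python
from typing import List

CC = "<cc>"  # hypothetical spelling of the channel change (CC) symbol


def split_by_channel_change(tokens: List[str], num_channels: int = 2) -> List[List[str]]:
    """Sort words into per-channel transcript lines.

    Assumption: every CC symbol moves to the next channel, wrapping around,
    so with two channels it simply toggles between them.
    """
    lines: List[List[str]] = [[] for _ in range(num_channels)]
    channel = 0
    for tok in tokens:
        if tok == CC:
            channel = (channel + 1) % num_channels
        else:
            lines[channel].append(tok)
    return lines


# Two overlapping utterances serialized into one token stream.
tokens = ["hello", CC, "hi", CC, "how", "are", CC, "good", CC, "you"]
lines = split_by_channel_change(tokens)
print(" ".join(lines[0]))  # hello how are you
print(" ".join(lines[1]))  # hi good
```

Assigning each resulting line to a named speaker would be a separate step that this sketch does not cover.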

2.

Granted patent
Speech recognition for keywords (Status: In force)

Publication number: US12026753B2

Publication date: 2024-07-02

Application number: US17308624

Application date: 2021-05-05

Applicant: Google LLC

Inventor: Petar Aleksic, Pedro J. Moreno Mengibar

IPC: G06Q30/00, G06Q30/0251, G06Q30/0273, G10L13/00, G10L15/01, G10L15/06, G10L15/18, G10L15/08, G10L15/187, G10L15/26

CPC classification number: G06Q30/0275, G06Q30/0256, G10L13/00, G10L15/01, G10L15/06, G10L15/18, G10L2015/088, G10L15/187, G10L15/26

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition are disclosed. In one aspect, a method includes receiving a candidate adword from an advertiser. The method further includes generating a score for the candidate adword based on a likelihood of a speech recognizer generating, based on an utterance of the candidate adword, a transcription that includes a word that is associated with an expected pronunciation of the candidate adword. The method further includes classifying, based at least on the score, the candidate adword as an appropriate adword for use in a bidding process for advertisements that are selected based on a transcription of a speech query or as not an appropriate adword for use in the bidding process for advertisements that are selected based on the transcription of the speech query.

3.

Granted patent
Description support device and description support method (Status: In force)

Publication number: US11942086B2

Publication date: 2024-03-26

Application number: US17125295

Application date: 2020-12-17

Applicant: Panasonic Intellectual Property Management Co., Ltd.

Inventor: Natsuki Saeki, Shoichi Araki, Masakatsu Hoshimi, Takahiro Kamai

IPC: G10L15/22, G06Q30/016, G10L15/06

CPC classification number: G10L15/22, G10L15/06, G06Q30/016

Abstract: A description support device for displaying information on a topic to be checked in an utterance by a user, the description support device includes: an inputter to acquire input information indicating an utterance sentence corresponding to the utterance; a controller to generate information indicating a check result of the topic for the utterance sentence; and a display to display information generated by the controller, wherein the display is configured to display a checklist indicating whether or not the topic is described in the utterance sentence indicated by the input information sequentially acquired by the inputter, and wherein the display is configured to display, according to a likelihood of each utterance sentence, display information including the utterance sentence, the likelihood defining the check result of the topic in the checklist.

4.

Granted patent
Hotword detection on multiple devices (Status: In force)

Publication number: US11887603B2

Publication date: 2024-01-30

Application number: US17691698

Application date: 2022-03-10

Applicant: GOOGLE LLC

Inventor: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster

IPC: G10L15/30, G10L15/22, G10L25/78, H04L67/10, G10L15/08, G10L15/06

CPC classification number: G10L15/30, G10L15/22, G10L25/78, H04L67/10, G10L15/06, G10L2015/088, G10L2015/223, H05K999/99

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include, in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

5.

Granted patent
Display apparatus and method for registration of user command (Status: In force)

Publication number: US11862166B2

Publication date: 2024-01-02

Application number: US17961848

Application date: 2022-10-07

Applicant: Samsung Electronics Co., Ltd.

Inventor: Nam-yeong Kwon, Kyung-mi Park

IPC: G10L15/22, G10L15/06, H04N21/422, H04N21/439, H04N21/482, G06F3/16, G10L15/02, G10L15/10, G10L15/187, G10L15/08

CPC classification number: G10L15/22, G06F3/167, G10L15/02, G10L15/06, G10L15/10, H04N21/42203, H04N21/4394, H04N21/482, G10L15/187, G10L2015/0638, G10L2015/088, G10L2015/221, G10L2015/223, G10L2015/225

Abstract: A display apparatus includes an input unit configured to receive a user command; an output unit configured to output a registration suitability determination result for the user command; and a processor configured to generate phonetic symbols for the user command, analyze the generated phonetic symbols to determine registration suitability for the user command, and control the output unit to output the registration suitability determination result for the user command. Therefore, the display apparatus may register a user command which is resistant to misrecognition and guarantees high recognition rate among user commands defined by a user.

6.

Granted patent
Building a natural language understanding application using a received electronic record containing programming code including an interpret-block, an interpret-statement, a pattern expression and an action statement (Status: In force)

Publication number: US11776533B2

Publication date: 2023-10-03

Application number: US17225997

Application date: 2021-04-08

Applicant: SoundHound, Inc.

Inventor: Bernard Mont-Reynaud, Seyed M. Emami, Chris Wilson, Keyvan Mohajer

IPC: G10L15/00, G10L15/18, G06F40/205, G06F8/30, G10L15/06, G10L15/22, H04M3/493

CPC classification number: G10L15/18, G06F8/31, G06F40/205, G10L15/06, G10L15/22, H04M3/4938

Abstract: A method of building a natural language understanding application is provided. The method includes receiving at least one electronic record containing programming code and creating executable code from the programming code. Further, the executable code, when executed by a processor, causes the processor to create a parse and an interpretation of a sequence of input tokens, the programming code includes an interpret-block and the interpret-block includes an interpret-statement. Additionally, the interpret-statement includes a pattern expression and the interpret-statement includes an action statement.

7.

Published application
DETECTION OF LIVE SPEECH (Status: Under examination, published)

Publication number: US20230290335A1

Publication date: 2023-09-14

Application number: US18318269

Application date: 2023-05-16

Applicant: Cirrus Logic International Semiconductor Ltd.

Inventor: John Paul LESSO, Toru IDO

IPC: G10L15/06, G10L19/26, G10L25/78

CPC classification number: G10L15/06, G10L19/26, G10L25/78, G10L2025/937

Abstract: A method of detecting live speech comprises: receiving a signal containing speech; obtaining a first component of the received signal in a first frequency band, wherein the first frequency band includes audio frequencies; and obtaining a second component of the received signal in a second frequency band higher than the first frequency band. Then, modulation of the first component of the received signal is detected; modulation of the second component of the received signal is detected; and the modulation of the first component of the received signal and the modulation of the second component of the received signal are compared. It may then be determined that the speech may not be live speech, if the modulation of the first component of the received signal differs from the modulation of the second component of the received signal.
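The comparison step described above can be illustrated with a rough sketch. It assumes the modulation of each band has already been extracted as an amplitude envelope sampled at the same rate, and it swaps in a simple Pearson correlation with a fixed threshold as the comparison. The abstract does not say how the modulations are compared, so the function names and the threshold value of 0.5 are hypothetical.

```python
import math
from typing import Sequence


def envelope_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two envelopes; 0.0 if either is empty or constant."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def may_not_be_live(env_low: Sequence[float], env_high: Sequence[float],
                    threshold: float = 0.5) -> bool:
    """Flag the speech as possibly not live when the two modulations differ.

    Assumption: "differs" is approximated by a correlation below `threshold`.
    """
    return envelope_correlation(env_low, env_high) < threshold


low_band = [0.1, 0.9, 0.2, 0.8]
print(may_not_be_live(low_band, [0.05, 0.45, 0.1, 0.4]))  # False: same modulation pattern
print(may_not_be_live(low_band, [0.2, 0.2, 0.2, 0.2]))    # True: high band is unmodulated
```

Band splitting and envelope extraction are left out here; they would need a signal-processing front end chosen for the actual frequency bands.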

8.

Granted patent
Determining hotword suitability (Status: In force)

Publication number: US11741970B2

Publication date: 2023-08-29

Application number: US17570246

Application date: 2022-01-06

Applicant: Google LLC

Inventor: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin

IPC: G10L15/00, G10L17/24, G06F21/32, G10L25/51, G06F21/46, G10L15/22, G10L15/06, G10L15/08

CPC classification number: G10L17/24, G06F21/32, G06F21/46, G10L15/06, G10L15/08, G10L15/22, G10L25/51, G10L2015/0638, G10L2015/088, G10L2015/225

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, and providing a representation of the hotword suitability score for display to the user.

9.

Published application
AUDIO DATA IDENTIFICATION APPARATUS (Status: Under examination, published)

Publication number: US20230178096A1

Publication date: 2023-06-08

Application number: US17911078

Application date: 2021-02-26

Applicant: COCHL, INC.

Inventor: Ilyoung JEONG, Hyungui LIM, Yoonchang HAN, Subin LEE, Jeongsoo PARK, Donmoon LEE

IPC: G10L25/51, G10L15/06, G10L25/30

CPC classification number: G10L25/51, G10L15/06, G10L25/30

Abstract: Proposed is an audio data identification apparatus for collecting random audio data and identifying an audio resource obtained by extracting any one section of the collected audio data. The audio data identification apparatus includes: a communication unit that collects and transmits the random audio data; and a control unit that identifies the collected audio data. The control unit includes: a parsing unit that parses the collected audio data into predetermined units; an extraction unit that selects, as the audio resource, any one of a plurality of parsed sections of the audio data; a matching unit that matches identification information of the audio resource via a pre-loaded artificial intelligence algorithm; and a verification unit that verifies the identification information matched to the audio resource.

10.

Granted patent
Method of providing voice command and electronic device supporting the same (Status: In force)

Publication number: US11664027B2

Publication date: 2023-05-30

Application number: US17459327

Application date: 2021-08-27

Applicant: Samsung Electronics Co., Ltd.

Inventor: Chakladar Subhojit, Sang Hoon Lee, Ji Min Lee

IPC: G10L15/22, G10L15/065, G10L15/06, G10L15/30, G10L15/183

CPC classification number: G10L15/22, G10L15/06, G10L15/065, G10L15/183, G10L15/30, G10L2015/223

Abstract: Disclosed is a portable communication device, including a display, at least one microphone, a memory, and a processor operably connected to the display, the at least one microphone and the memory, wherein the processor is configured to display guide information, via the display, in response to a user input, the guide information including a first display object related to guide a user voice input for generation of a new voice command and a second display object related to at least one application executed by the new voice command via the portable communication device, receive audio data corresponding to the first display object from a user through the at least one microphone, generate the new voice command corresponding to the audio data, and store, in the memory, the new voice command corresponding to the received audio data and mapping information indicating that the new voice command and the at least one application are mapped.
