Patent search cpc:"G10L15/063", page 8

71.

Granted invention patent
Methods and apparatus for leveraging sentiment values in flagging and/or removal of real time workflows (in force)

Publication No.: US11948557B2

Publication date: 2024-04-02

Application No.: US17539282

Filing date: 2021-12-01

Applicant: Bank of America Corporation

Inventors: Ramakrishna R. Yannam, Isaac Persing, Emad Noorizadeh

IPC classification: G10L15/18, G06F3/04817, G06F40/30, G10L15/06, G10L15/16, G10L15/22, G10L15/30

CPC classification: G10L15/1815, G06F3/04817, G06F40/30, G10L15/063, G10L15/16, G10L15/22, G10L15/30

Abstract: Aspects of the disclosure relate to using an apparatus for flagging and removing real time workflows that produce sub-optimal results. Such an apparatus may include an utterance sentiment classifier. The apparatus stores a hierarchy of rules. Each of the rules is associated with one or more rule signals. In response to receiving the one or more utterance signals, the classifier iterates through the hierarchy of rules in sequential order to identify a first rule for which the one or more utterance signals are a superset of the rule's one or more rule signals. In response to receiving the one or more alternate utterance signals from the signal extractor, the classifier may iterate through the hierarchy of rules in sequential order to identify the first rule in the hierarchy for which the one or more alternate utterance signals are a superset of the first rule's one or more rule signals.
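
The rule lookup described in this abstract can be sketched as a first-match scan over an ordered rule list. This is a minimal illustration, not the patented implementation: the rule names, the signal names, and the function `first_matching_rule` are all hypothetical.

```python
from typing import Optional

# Hypothetical rule hierarchy, highest priority first.
# Each entry is (rule name, set of rule signals the rule requires).
RULES = (
    ("escalate", frozenset({"negation", "profanity"})),
    ("negative", frozenset({"negation"})),
    ("positive", frozenset({"thanks"})),
    ("neutral", frozenset()),  # an empty signal set matches any utterance
)


def first_matching_rule(utterance_signals, rules=RULES) -> Optional[str]:
    """Return the first rule whose rule signals are all among the utterance signals."""
    signals = set(utterance_signals)
    for name, rule_signals in rules:
        # Superset test: the utterance signals must contain every rule signal.
        if signals >= rule_signals:
            return name
    return None


# The same scan serves both the original and the alternate utterance signals.
label = first_matching_rule({"negation", "profanity", "question"})
alternate_label = first_matching_rule({"thanks", "greeting"})
```

Because the scan stops at the first match, the ordering of the hierarchy decides which rule wins when several rules are satisfied.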

72.

Published invention application
SELF-SUPERVISED LEARNING METHOD BASED ON PERMUTATION INVARIANT CROSS ENTROPY AND ELECTRONIC DEVICE THEREOF (under examination, published)

Publication No.: US20240105166A1

Publication date: 2024-03-28

Application No.: US18350111

Filing date: 2023-07-11

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventors: Hoon CHUNG, Byung Ok KANG, Yoonhyung KIM

IPC classification: G10L15/16, G10L15/06, G10L15/065

CPC classification: G10L15/16, G10L15/063, G10L15/065

Abstract: Provided is a self-supervised learning method based on permutation invariant cross entropy. A self-supervised learning method based on permutation invariant cross entropy performed by an electronic device includes: defining a cross entropy loss function for pre-training of an end-to-end speech recognition model; configuring non-transcription speech corpus data composed only of speech as input data of the cross entropy loss function; setting all permutations of classes included in the non-transcription speech corpus data as an output target and calculating cross entropy losses for each class; and determining a minimum cross entropy loss among the calculated cross entropy losses for each class as a final loss.
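
One possible reading of the loss in this abstract is "cross entropy minimized over all relabelings of the classes". The sketch below assumes that reading. It also assumes that per-frame class indices (here `pseudo_labels`, for example cluster ids) are available for the untranscribed speech. The function names and the toy numbers are hypothetical.

```python
import itertools
import math


def cross_entropy(probs, targets):
    """Mean negative log-likelihood of the target class indices.

    probs:   list of per-frame probability vectors (each sums to 1).
    targets: list of class indices, one per frame.
    """
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


def permutation_invariant_ce(probs, pseudo_labels, num_classes):
    """Minimum cross entropy over every permutation of the class indices."""
    best = math.inf
    for perm in itertools.permutations(range(num_classes)):
        # Relabel the targets under this permutation, then score them.
        permuted = [perm[label] for label in pseudo_labels]
        best = min(best, cross_entropy(probs, permuted))
    return best


# Toy example: the model's classes are swapped relative to the pseudo-labels.
probs = [[0.9, 0.1], [0.2, 0.8]]
pseudo_labels = [1, 0]
loss = permutation_invariant_ce(probs, pseudo_labels, num_classes=2)
```

Enumerating permutations costs `num_classes!` evaluations, so this brute-force form is only practical for a handful of classes.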

73.

Published invention application
METHOD FOR TRAINING MODEL, SPEECH RECOGNITION METHOD, APPARATUS, MEDIUM, AND DEVICE (under examination, published)

Publication No.: US20240105162A1

Publication date: 2024-03-28

Application No.: US18257403

Filing date: 2021-11-18

Applicant: BEIJING YOUZHUJU NETWORK TECHNOLOGY CO.

Inventors: Kang WANG

IPC classification: G10L15/06, G10L15/00, G10L15/22

CPC classification: G10L15/063, G10L15/005, G10L15/22

Abstract: The present disclosure relates to a method for training a model, a speech recognition method, an apparatus, a medium, and a device, the method including: acquiring training data, wherein the training data includes labeled data of at least two languages; ranking the languages in a descending order of a quantity of the labeled data of each language to obtain a training order corresponding to the languages; and sequentially acquiring, in accordance with ranking of the languages indicated by the training order, target data corresponding to each language to perform iterative training on a preset model, to obtain a target speech recognition model, wherein the target data is determined in accordance with the labeled data of language(s) from first ranking to current ranking in the training order.
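
The training order in this abstract can be sketched as a staged schedule. The sketch below assumes that the target data for each stage is simply the union of the labeled data of all languages ranked so far. The abstract only says the target data is "determined in accordance with" that data, so sampling or weighting schemes are equally possible. The function name and the toy corpus are hypothetical.

```python
def training_schedule(labeled_data):
    """Build (language, target_data) stages, largest language first.

    labeled_data: dict mapping a language code to its list of labeled samples.
    """
    # Rank languages in descending order of labeled-data quantity.
    order = sorted(labeled_data, key=lambda lang: len(labeled_data[lang]), reverse=True)
    stages = []
    for rank, language in enumerate(order, start=1):
        # Assumption: pool the data of every language from first to current rank.
        target = [sample for lang in order[:rank] for sample in labeled_data[lang]]
        stages.append((language, target))
    return stages


corpus = {"en": ["e1", "e2", "e3"], "de": ["d1"], "fr": ["f1", "f2"]}
stages = training_schedule(corpus)
# Each stage would feed one round of iterative training on the preset model.
```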

74.

Granted invention patent
Electronic device and operating method thereof (in force)

Publication No.: US11942077B2

Publication date: 2024-03-26

Application No.: US17949741

Filing date: 2022-09-21

Applicant: Samsung Electronics Co., Ltd.

Inventors: Kyoungbo Min, Seungdo Choi, Doohwa Hong

IPC classification: G10L15/22, G10L13/00, G10L15/06, G10L15/16

CPC classification: G10L15/063, G10L13/00, G10L15/16

Abstract: An electronic device for providing a text-to-speech (TTS) service and an operating method therefor are provided. The operating method of the electronic device includes obtaining target voice data based on an utterance input of a specific speaker, determining a number of learning steps of the target voice data, based on data features including a data amount of the target voice data, generating a target model by training a pre-trained model pre-trained to convert text into an audio signal, by using the target voice data as training data, based on the determined number of learning steps, generating output data obtained by converting input text into an audio signal, by using the generated target model, and outputting the generated output data.

75.

Granted invention patent
Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models (in force)

Publication No.: US11942076B2

Publication date: 2024-03-26

Application No.: US17651315

Filing date: 2022-02-16

Applicant: Google LLC

Inventors: Ke Hu, Golan Pundak, Rohit Prakash Prabhavalkar, Antoine Jean Bruguier, Tara N. Sainath

IPC classification: G10L15/30, G10L15/02, G10L15/06, G10L15/187, G10L15/193, G10L15/28, G10L15/32, G10L25/30

CPC classification: G10L15/063, G10L15/02, G10L15/187, G10L15/193, G10L15/285, G10L15/32, G10L25/30, G10L2015/025

Abstract: A method includes receiving audio data encoding an utterance spoken by a native speaker of a first language, and receiving a biasing term list including one or more terms in a second language different than the first language. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data to generate speech recognition scores for both wordpieces and corresponding phoneme sequences in the first language. The method also includes rescoring the speech recognition scores for the phoneme sequences based on the one or more terms in the biasing term list, and executing, using the speech recognition scores for the wordpieces and the rescored speech recognition scores for the phoneme sequences, a decoding graph to generate a transcription for the utterance.
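
The rescoring step in this abstract can be illustrated with a deliberately simplified stand-in: add a fixed bonus to any phoneme hypothesis that contains the phoneme sequence of a biasing term. The abstract does not specify the rescoring rule or the decoding graph. The additive bonus, the function name, the `bonus` parameter, and the toy phoneme strings are therefore all assumptions. The sketch also presumes the second-language biasing terms have already been mapped to first-language phonemes.

```python
def rescore_phoneme_hypotheses(hypotheses, biasing_phonemes, bonus=2.0):
    """Boost hypotheses that contain a biasing term's phoneme sequence.

    hypotheses:       list of (phoneme tuple, log-score) pairs.
    biasing_phonemes: list of phoneme tuples, one per biasing term.
    bonus:            hypothetical additive boost per matched term.
    """

    def contains(sequence, sub):
        # True if `sub` occurs as a contiguous run inside `sequence`.
        n = len(sub)
        return any(tuple(sequence[i:i + n]) == tuple(sub)
                   for i in range(len(sequence) - n + 1))

    rescored = []
    for phonemes, score in hypotheses:
        hits = sum(contains(phonemes, term) for term in biasing_phonemes)
        rescored.append((phonemes, score + bonus * hits))
    return rescored


hypotheses = [
    (("k", "r", "e", "t", "e", "y"), -5.0),  # matches the biasing term
    (("k", "r", "i", "e", "t"), -4.0),       # acoustically preferred, no match
]
biasing = [("k", "r", "e", "t", "e", "y")]
rescored = rescore_phoneme_hypotheses(hypotheses, biasing)
best = max(rescored, key=lambda pair: pair[1])
```

In this toy case the biased hypothesis overtakes the one the recognizer originally preferred, which is the effect contextual biasing aims for.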

76.

Published invention application
SELF-LEARNING NEUROMORPHIC ACOUSTIC MODEL FOR SPEECH RECOGNITION (under examination, published)

Publication No.: US20240096313A1

Publication date: 2024-03-21

Application No.: US17946523

Filing date: 2022-09-16

Applicant: Accenture Global Solutions Limited

Inventors: Lavinia Andreea Danielescu, Timothy M. Shea, Kenneth Michael Stewart, Noah Gideon Pacik-Nelson, Eric Michael Gallo

IPC classification: G10L15/16, G10L15/06, G10L15/197, G10L15/22, G10L15/30, G10L25/21

CPC classification: G10L15/16, G10L15/063, G10L15/197, G10L15/22, G10L15/30, G10L25/21, G10L2015/0635, G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using a spiking neural network acoustic model implemented on a neuromorphic processor are described. In one aspect, a method includes receiving, by a trained acoustic model implemented as a spiking neural network (SNN) on a neuromorphic processor of a client device, a set of feature coefficients that represent acoustic energy of input audio received from a microphone communicably coupled to the client device. The acoustic model is trained to predict speech sounds based on input feature coefficients. The acoustic model generates output data indicating predicted speech sounds corresponding to the set of feature coefficients that represent the input audio received from the microphone. The neuromorphic processor updates one or more parameters of the acoustic model using one or more learning rules and the predicted speech sounds of the output data.

77.

Published invention application
SYSTEM FOR REPLY GENERATION (under examination, published)

Publication No.: US20240096236A1

Publication date: 2024-03-21

Application No.: US18038520

Filing date: 2021-11-09

Applicant: ROLLS-ROYCE PLC

Inventors: Stuart Brian MOSS, Muhannad Abdul Rahman ALOMARI, James Frederick Sebastian ARNEY

IPC classification: G09B21/00, G06F3/01, G10L13/033, G10L15/06, G10L15/18, G10L15/22

CPC classification: G09B21/00, G06F3/013, G10L13/033, G10L15/063, G10L15/18, G10L15/22

Abstract: A device for generating conversational replies, including a processor with a memory; a speech input module; a user input module; and a natural language processing module including one or more encoder-decoder modules; the device being configured to: record portions of a conversation through the speech input module, use a speech recognition module to identify words in the conversation, and when one or more words have been recognised: generate one or more responses based on the one or more words using the natural language processing module, select a group of the context-sensitive responses, prompt the user via the user input module to select a response from the group, and output the selected response.

78.

Published invention application
METHOD AND SYSTEM FOR PERSONALIZED MULTIMODAL RESPONSE GENERATION THROUGH VIRTUAL AGENTS (under examination, published)

Publication No.: US20240095491A1

Publication date: 2024-03-21

Application No.: US18527077

Filing date: 2023-12-01

Applicant: Quantiphi, Inc.

Inventors: Dagnachew Birru, Saisubramaniam Gopalakrishnan, Siva Prasad Sompalli, Varun V, Vishal Vaddina

IPC classification: G06N3/006, G06N3/0455, G06N3/0475, G10L15/06, G10L15/183, G10L15/22, H04L51/02

CPC classification: G06N3/006, G06N3/0455, G06N3/0475, G10L15/063, G10L15/183, G10L15/22, H04L51/02

Abstract: A method and system for multimodal response generation through a virtual agent is provided herein. The method comprises retrieving information related to an input received by the virtual agent. The virtual agent employs an Artificial Intelligence (AI) model. The method further comprises generating a response corresponding to the input based on the retrieved information. The method may further comprise generating a plurality of prompts based on user characteristics and the input. The method may further comprise modifying the response based on the plurality of prompts to generate a multimodal response.

79.

Published invention application
AUTOMATICALLY LOCATING RESPONSES TO PREVIOUSLY ASKED QUESTIONS IN A LIVE CHAT TRANSCRIPT USING ARTIFICIAL INTELLIGENCE (AI) (under examination, published)

Publication No.: US20240086639A1

Publication date: 2024-03-14

Application No.: US17931911

Filing date: 2022-09-14

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sanket Jain, Krishnasuri Narayanam, Ratnakar Behera, Avinash Tukaram Mane, ZHENG XIE, JOY PATRA

IPC classification: G06F40/30, G06F40/279, G10L15/06, G10L15/18, G10L15/22

CPC classification: G06F40/30, G06F40/279, G10L15/063, G10L15/1815, G10L15/22

Abstract: A method, computer program product, and computer system are provided. A model is trained in real time to identify likely duplicate questions. A level of duplication is identified between a question and a previously asked question in a meeting transcript. An asker is pointed to where in the meeting transcript the question was previously asked. All duplicate questions are arranged in a single point question by topic. A new meeting transcript is generated and displayed to attendees, including each individual question and each single point question.

80.

Published invention application
Engagement Measurement of Media Consumers Based on the Acoustic Environment (under examination, published)

Publication No.: US20240079026A1

Publication date: 2024-03-07

Application No.: US18349796

Filing date: 2023-07-10

Applicant: The Nielsen Company (US), LLC

Inventors: Meryem Berrada, John Stavropoulos

IPC classification: G10L25/51, G10L15/06, G10L15/08, G10L15/22

CPC classification: G10L25/51, G10L15/063, G10L15/08, G10L15/22, G10L2015/088

Abstract: Methods, apparatus, systems and articles of manufacture to measure engagement of media consumers based on acoustic environment are disclosed. Example apparatus disclosed herein are to identify media device audio data and ambient environment audio data from sensed audio data collected from an environment, and determine classification data for the media device audio data and the ambient environment audio data. Disclosed example apparatus are also to process the classification data with a machine learning model to calculate an engagement metric. Disclosed example apparatus are further to determine whether at least one individual is engaged with media in the environment based on the engagement metric.
