Patent search: ipc:"G10L25/60", page 1

1.

Published application
ACTIVE VOICE LIVENESS DETECTION SYSTEM [pending, published]

Publication No.: US20240363125A1

Publication date: 2024-10-31

Application No.: US18646493

Filing date: 2024-04-25

Applicant: Pindrop Security, Inc.

Inventors: Elie KHOURY, Ganesh SIVARAMAN, Tianxiang CHEN, Nikolay GAUBITCH, David LOONEY, Amit GUPTA, Vijay BALASUBRAMANIYAN, Nicholas KLEIN, Anthony STANKUS

IPC classification: G10L17/26, G10L17/02, G10L17/04, G10L25/60, H04M3/22, H04M3/51

CPC classification: G10L17/26, G10L17/02, G10L17/04, G10L25/60, H04M3/2281, H04M3/5183, H04M2201/405

Abstract: Disclosed are systems and methods including software processes executed by a server that detect audio-based synthetic speech (“deepfakes”) in a call conversation. Embodiments include systems and methods for detecting fraudulent presentation attacks using multiple functional engines that implement various fraud-detection techniques, to produce calibrated scores and/or fused scores. A computer may, for example, evaluate the audio quality of speech signals within audio signals, where speech signals contain the speech portions having speaker utterances.

2.

Granted patent
Methods and systems for audio sample quality control [in force]

Publication No.: US12020724B2

Publication date: 2024-06-25

Application No.: US17841794

Filing date: 2022-06-16

Applicant: Clearspeed Inc.

Inventor: James A. Kane

IPC classification: G10L25/60, G10L25/63, G10L25/84, H04M3/22

CPC classification: G10L25/60, G10L25/63, G10L25/84, H04M3/2227, H04M3/2236

Abstract: The present disclosure provides methods and systems that may be used for providing quality control for audio samples. The audio samples may be speech samples of a user. The user may be participating in an audio interview.

3.

Granted patent
Method and system for automatic detection and correction of sound caused by facial coverings [in force]

Publication No.: US11967332B2

Publication date: 2024-04-23

Application No.: US17477592

Filing date: 2021-09-17

Applicant: International Business Machines Corporation

Inventors: Girmaw Abebe Tadesse, Michael S. Gordon, Komminist Weldemariam

IPC classification: G10L21/0232, G10L25/60, G10L25/75

CPC classification: G10L21/0232, G10L25/60, G10L25/75

Abstract: A computer-implemented method for correcting muffled speech caused by facial coverings is disclosed. The computer-implemented method includes monitoring a user's speech for speech distortion. The computer-implemented method further includes determining that the user's speech is distorted. The computer-implemented method further includes determining that a cause of the user's speech distortion is based, at least in part, on a presence of a particular type of facial covering. The computer-implemented method further includes automatically correcting the speech distortion of the user based, at least in part, on the particular type of facial covering causing the speech distortion.

4.

Published application
QUALITY ESTIMATION MODEL FOR PACKET LOSS CONCEALMENT [pending, published]

Publication No.: US20240127848A1

Publication date: 2024-04-18

Application No.: US18079342

Filing date: 2022-12-12

Applicant: Microsoft Technology Licensing, LLC

Inventor: Carl Lorenz DIENER

IPC classification: G10L25/60, G10L19/005, G10L25/30, G10L25/69

CPC classification: G10L25/60, G10L19/005, G10L25/30, G10L25/69, H04L41/0681

Abstract: This document relates to training and employing a quality estimation model. One example includes a method or technique that can be performed on a computing device. The method or technique can include providing degraded audio signals to one or more packet loss concealment models, and obtaining enhanced audio signals output by the one or more packet loss concealment models. The method or technique can also include obtaining quality labels for the enhanced audio signals and training a quality estimation model to estimate audio signal quality based at least on the enhanced audio signals and the quality labels.

5.

Published application
Signal Processing Coordination Among Digital Voice Assistant Computing Devices [pending, published]

Publication No.: US20240119958A1

Publication date: 2024-04-11

Application No.: US18488623

Filing date: 2023-10-17

Applicant: Google LLC

Inventors: Anshul Kothari, Gaurav Bhaya, Tarun Jain

IPC classification: G10L25/60, G06N20/00, G10L25/03, H04L12/28

CPC classification: G10L25/60, G06N20/00, G10L25/03, H04L12/282, G10L2015/226

Abstract: Coordinating signal processing among computing devices in a voice-driven computing environment is provided. A first and second digital assistant can detect an input audio signal, perform a signal quality check, and provide indications that the first and second digital assistants are operational to process the input audio signal. A system can select the first digital assistant for further processing. The system can receive, from the first digital assistant, data packets including a command. The system can generate, for a network connected device selected from a plurality of network connected devices, an action data structure based on the data packets, and transmit the action data structure to the selected network connected device.

6.

Published application
METHOD AND SYSTEM FOR PERFORMING DATA AUGMENTATION BASED ON MODIFIED SURROGATES, AND, NON-TRANSITORY COMPUTER READABLE MEDIUM [pending, published]

Publication No.: US20240119956A1

Publication date: 2024-04-11

Application No.: US17992473

Filing date: 2022-11-22

Applicant: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA.

Inventors: Douglas DAVID BAPTISTA DE SOUZA, FERNANDA DE SOUZA FERREIRA, Guilherme ZUCATELLI NOSSA

IPC classification: G10L25/18, G10L21/007, G10L21/0232, G10L25/60

CPC classification: G10L25/18, G10L21/007, G10L21/0232, G10L25/60

Abstract: A computer-implemented data augmentation method comprising receiving a dataset to be processed and, upon the received dataset being unclassified into classes, performing a clustering algorithm to partition the dataset whereby clusters formed are interpreted as the signal classes. The method further includes forming a sample dataset by gathering, for each class of a plurality of classes, at least two sample signals, then applying a discrete Fourier transform (DFT) to each sample signal of the sample dataset. The method includes computing frequency parameters of each sample signal to determine, based on a spectral coherence threshold, which frequency bands are relevant bands that characterize a class. The method further includes injecting random noise in a phase spectrum of the non-relevant frequency bands of each sample signal of the sample dataset, to generate a set of augmented sample signals, and applying an inverse DFT to each of the generated augmented sample signals.
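
The phase-perturbation step named in this abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names (dft, idft, augment), the noise_scale parameter, and the uniform noise distribution are assumptions made for illustration, and the spectral-coherence band selection is not implemented. The set of relevant bins is simply passed in as a hypothetical input.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def augment(signal, relevant_bins, noise_scale=0.5, rng=None):
    """Add random phase noise to non-relevant bins of a real signal.

    relevant_bins is a set of positive-frequency bin indices whose phase is
    left untouched (assumed given; the abstract derives the bands from a
    spectral coherence threshold, which this sketch does not compute).
    """
    rng = rng or random.Random()
    n = len(signal)
    spectrum = dft(signal)
    out = list(spectrum)
    # Walk the positive-frequency bins, skipping DC (and Nyquist when n is even).
    for k in range(1, (n + 1) // 2):
        if k in relevant_bins:
            continue
        magnitude, phase = cmath.polar(spectrum[k])
        phase += rng.uniform(-noise_scale, noise_scale)  # assumed noise model
        out[k] = cmath.rect(magnitude, phase)
        # Mirror to the negative-frequency bin so the inverse DFT stays real.
        out[n - k] = out[k].conjugate()
    return idft(out)


# Usage: two sinusoids, with bin 1 treated as the class-relevant band.
n = 16
signal = [math.sin(2 * math.pi * 1 * t / n) + 0.5 * math.sin(2 * math.pi * 3 * t / n)
          for t in range(n)]
augmented = augment(signal, relevant_bins={1}, rng=random.Random(0))
```

Because only phases change, the magnitude spectrum of the augmented signal matches the original, while the waveform in the non-relevant bands differs.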

7.

Granted patent
Data correction apparatus, data correction method, and program [in force]

Publication No.: US11924368B2

Publication date: 2024-03-05

Application No.: US17608823

Filing date: 2019-05-07

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Sachiko Kurihara, Noboru Harada

IPC classification: H04M1/24, G10L21/0232, G10L25/60, G10L25/84, H04M3/08, H04M3/22, H04M3/26, G10L21/0208

CPC classification: H04M3/2236, G10L21/0232, G10L25/60, G10L25/84, H04M3/26, G10L2021/02082

Abstract: The aim is to improve the accuracy of an evaluation in an acoustic quality evaluation test performed by comparing an evaluation target sound and a reference sound. A data correction apparatus 3 compares, in a call performed between a near-end terminal 1 and a far-end terminal 2, an evaluation target sound in which a voice output from the near-end terminal 1 is recorded and a reference sound in which a voice spoken by a call partner using the far-end terminal 2 is recorded, to correct test data used in a listening test for evaluating acoustic quality of the call. A correction target determination unit 31 determines, as a correction target section, a voiced section that does not include the voice of the call partner detected from an acoustic signal representing the reference sound. A correction execution unit 32 updates the correction target section of the acoustic signal representing the reference sound with a predetermined non-voice signal.

8.

Published application
Methods, Systems, and Devices for Spectrally Adjusting Audio Gain in Videoconference and Other Applications [pending, published]

Publication No.: US20240046950A1

Publication date: 2024-02-08

Application No.: US17881355

Filing date: 2022-08-04

Applicant: Motorola Mobility LLC

Inventors: Vivek K Tyagi, Nikhil Ambha Madhusudhana, Kevin Villanueva, Chao Ma, Giles T Davis

IPC classification: G10L21/034, G06T7/70, G06F3/16, G10L15/06, G10L25/60, G10L15/25

CPC classification: G10L21/034, G06T7/70, G06F3/167, G10L15/063, G10L25/60, G10L15/25, G06T2207/30201

Abstract: An electronic device includes an imager capturing one or more images of a subject engaging the electronic device and an audio input receiving acoustic signals having audible frequencies from the mouth of the subject engaging the electronic device. One or more processors determine from the one or more images of the subject whether the mouth of the subject is oriented on-axis relative to the audio input or off-axis relative to the audio input. The one or more processors adjust a gain of the audio input associated with a subset of the audible frequencies when the mouth of the subject is oriented off-axis relative to the audio input.

9.

Granted patent
Electronic device for speech recognition and control method thereof [in force]

Publication No.: US11887617B2

Publication date: 2024-01-30

Application No.: US17260684

Filing date: 2019-05-31

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee

IPC classification: G10L21/0216, G10L21/0272, G10L25/18, G10L25/60

CPC classification: G10L21/0216, G10L21/0272, G10L25/18, G10L25/60, G10L2021/02166

Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves efficiency and performance of speech recognition of the electronic device in a space where noise other than speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones and analyzing the audio signals and obtaining information on directions in which the audio signals are input and information on input times of the audio signals. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained information on the directions in which the plurality of audio signals are input, and the obtained information on the input times of the plurality of audio signals, and an audio signal obtained from the determined target source is processed.

10.

Granted patent
Authenticating a user [in force]

Publication No.: US11869513B2

Publication date: 2024-01-09

Application No.: US17142775

Filing date: 2021-01-06

Applicant: VERIDAS DIGITAL AUTHENTICATION SOLUTIONS, S.L.

Inventors: Iván López Espejo, Santiago Prieto Calero, Ana Iriarte Ruiz, David Roncal Redín, Miguel Ángel Sánchez Yoldi, Eduardo Azanza Ladrón

IPC classification: G10L17/08, G10L17/04, G10L17/22, G06F21/32, H04L9/40, G10L17/12, G10L25/60

CPC classification: G10L17/08, G06F21/32, G10L17/04, G10L17/22, H04L63/0861, G10L17/12, G10L25/60

Abstract: Methods of authenticating a user or speaker are provided. These methods include obtaining an input speech signal and user credentials identifying the user or speaker. The input speech signal includes a single-channel signal or a multi-channel speech signal. The methods further include extracting a speech voiceprint from the input speech signal, and retrieving a reference voiceprint associated with the user credentials. The methods still further include determining a voiceprint correspondence between the speech voiceprint and the reference voiceprint, and authenticating the user or speaker depending on said voiceprint correspondence. The methods yet further include updating the reference voiceprint depending on the speech voiceprint corresponding to the authenticated user or speaker. Computer programs, systems and computing systems are also provided which are suitable for performing said methods of authenticating a user or speaker.
