Abstract:
A data processing method for acoustic events includes: establishing, in software, a simulated acoustic frequency event module, a data capturing module, and a sound application decision module; setting a simulated hardware parameter for the simulated acoustic frequency event module; inputting a sound signal to a frequency filtering module of the simulated acoustic frequency event module and obtaining metadata from a frequency event quantizer of the simulated acoustic frequency event module; dividing the metadata into multiple frames according to a time interval by the data capturing module; accumulating an event count for each frame by the data capturing module; setting a label for each frame according to its event count; storing the frames, event counts, and labels in a database; and training a decision model by the sound application decision module according to the database and a sound application.
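A minimal sketch of the framing and labeling steps, assuming the frequency event quantizer emits a list of event timestamps in seconds; the function name, the fixed frame interval, and the count-threshold labeling rule are illustrative assumptions, and the list of records stands in for the database:

```python
import numpy as np

def frame_and_label(event_times, frame_interval=0.02, count_threshold=5):
    """Divide quantizer events into fixed-length frames, accumulate the
    event count of each frame, and label each frame by thresholding it."""
    event_times = np.asarray(event_times, dtype=float)
    if event_times.size == 0:
        return []
    n_frames = int(np.floor(event_times.max() / frame_interval)) + 1
    records = []  # stand-in for the database of (frame, count, label) rows
    for i in range(n_frames):
        start, end = i * frame_interval, (i + 1) * frame_interval
        count = int(np.sum((event_times >= start) & (event_times < end)))
        label = 1 if count >= count_threshold else 0  # e.g. "active" frame
        records.append({"frame": i, "event_count": count, "label": label})
    return records

print(frame_and_label([0.001, 0.003, 0.015, 0.021, 0.022, 0.09])[0])
# {'frame': 0, 'event_count': 3, 'label': 0}
```

The stored records would then serve as training data for the decision model.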
Abstract:
Provided are a voice activity detection method and apparatus, an electronic device, and a storage medium, relating to the technical field of voice processing, for example to artificial intelligence and deep learning. A specific implementation is as follows. A first audio signal is acquired, and a frequency domain feature of the first audio signal is extracted; the frequency domain feature is then input into a voice activity detection model, and a voice presence detection result output by the model is obtained, where the voice activity detection model is configured to detect whether voice is present in the first audio signal.
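A minimal sketch of the two-stage pipeline, assuming the frequency domain feature is a per-frame log magnitude spectrum; the thresholding function is a stand-in for the trained voice activity detection model, which the abstract does not specify:

```python
import numpy as np

def frequency_feature(signal, frame_len=512, hop=256):
    """Frame the signal and compute the log magnitude spectrum per frame."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.log1p(np.abs(np.fft.rfft(frames, axis=-1)))

def detect_voice(features, threshold=4.0):
    """Toy stand-in for the detection model: threshold the mean
    log-spectral energy of each frame; the real model is learned."""
    return features.mean(axis=-1) > threshold  # True where voice is judged present

signal = np.random.default_rng(0).standard_normal(16000)
print(detect_voice(frequency_feature(signal)).shape)  # one flag per frame
```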
Abstract:
Methods and systems for providing consistency in noise reduction during speech and non-speech periods are provided. First and second signals are received. The first signal includes at least a voice component. The second signal includes at least the voice component modified by human tissue of a user. First and second weights may be assigned per subband to the first and second signals, respectively. The first and second signals are processed to obtain respective first and second full-band power estimates. During periods when the user's speech is not present, the first weight and the second weight are adjusted based at least partially on the first full-band power estimate and the second full-band power estimate. The first and second signals are blended based on the adjusted weights to generate an enhanced voice signal. The second signal may be aligned with the first signal prior to the blending.
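A minimal sketch of the weight adjustment and blending, assuming both signals are already split into subbands (rows are subbands) and a speech-presence flag is available; the specific adjustment rule shown, pulling the second signal's weights toward full-band power parity during non-speech, is an illustrative reading rather than the patented rule:

```python
import numpy as np

def blend_signals(sig1, sig2, w1, w2, speech_present, alpha=0.9):
    """sig1, sig2: (n_subbands, n_samples) arrays; w1, w2: per-subband weights.
    Returns the blended enhanced signal plus the (possibly adjusted) weights."""
    p1 = np.mean(sig1 ** 2)  # full-band power estimate of the first signal
    p2 = np.mean(sig2 ** 2)  # full-band power estimate of the second signal
    if not speech_present and p2 > 0:
        # Nudge the second weights toward power parity with the first signal.
        w2 = w2 * (alpha + (1 - alpha) * np.sqrt(p1 / p2))
    enhanced = w1[:, None] * sig1 + w2[:, None] * sig2
    return enhanced, w1, w2
```

Alignment of the second signal to the first (for example, delay compensation) would happen before this call.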
Abstract:
The invention relates to audio signal processing. More specifically, the invention relates to enhancing multichannel audio, such as television audio, by applying a gain to the audio that has been smoothed between portions of the audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
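A minimal sketch of gain smoothing between portions of multichannel audio, using a one-pole smoother so block-to-block gain changes stay gradual; the block layout and smoothing constant are illustrative assumptions:

```python
import numpy as np

def apply_smoothed_gain(blocks, target_gains, smoothing=0.8):
    """blocks: list of (channels, samples) arrays, one per audio portion;
    target_gains: desired gain per block. The applied gain is smoothed so
    it changes gradually between adjacent portions."""
    g = target_gains[0]
    out = []
    for block, target in zip(blocks, target_gains):
        g = smoothing * g + (1 - smoothing) * target  # move toward the target
        out.append(g * block)  # same smoothed gain applied to every channel
    return out
```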
Abstract:
The various implementations described enable voice activity detection and/or pitch estimation for speech signal processing in, for example and without limitation, hearing aids, speech recognition and interpretation software, telephony, and various applications for smartphones and/or wearable devices. In particular, some implementations include systems, methods and/or devices operable to detect voice activity in an audible signal by determining a voice activity indicator value that is a normalized function of signal amplitudes at two or more sets of spectral locations associated with a candidate pitch. In some implementations, voice activity is considered detected when the voice activity indicator value breaches a threshold value. Additionally and/or alternatively, in some implementations, analysis of the audible signal provides a pitch estimate of detectable voice activity.
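A minimal sketch of such an indicator, assuming the two sets of spectral locations are the harmonic bins of the candidate pitch and the bins midway between harmonics, and that the normalization is the difference-over-sum of the two amplitude sums; both choices are illustrative:

```python
import numpy as np

def voice_activity_indicator(mag_spectrum, fs, frame_len, pitch_hz, n_harm=8):
    """Normalized indicator in [-1, 1]: near 1 when energy concentrates at
    harmonics of the candidate pitch, near 0 or below otherwise."""
    bin_hz = fs / frame_len
    harm = np.array([round(k * pitch_hz / bin_hz) for k in range(1, n_harm + 1)])
    mid = np.array([round((k + 0.5) * pitch_hz / bin_hz) for k in range(1, n_harm + 1)])
    harm, mid = harm[harm < len(mag_spectrum)], mid[mid < len(mag_spectrum)]
    a, b = mag_spectrum[harm].sum(), mag_spectrum[mid].sum()
    return (a - b) / (a + b + 1e-12)

def voice_detected(indicator, threshold=0.5):
    return indicator > threshold  # detected when the indicator breaches the threshold
```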
Abstract:
There is provided a sound processing apparatus including: a sound separation unit that separates an input sound into a plurality of sounds caused by a plurality of sound sources; a sound type estimation unit that estimates the sound types of the plurality of sounds separated by the sound separation unit; a mixing ratio calculation unit that calculates a mixing ratio for each sound in accordance with the sound type estimated by the sound type estimation unit; and a sound mixing unit that mixes the plurality of sounds separated by the sound separation unit at the mixing ratio calculated by the mixing ratio calculation unit.
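A minimal sketch of the four-unit pipeline with the separation and type-estimation stages injected as callables, since the abstract does not specify them; the type-to-ratio table is an illustrative assumption:

```python
import numpy as np

TYPE_RATIO = {"speech": 1.0, "music": 0.6, "noise": 0.1}  # assumed weights

def process_sound(input_sound, separate, estimate_type):
    sources = separate(input_sound)                     # sound separation unit
    types = [estimate_type(s) for s in sources]         # sound type estimation unit
    ratios = [TYPE_RATIO.get(t, 0.5) for t in types]    # mixing ratio calculation unit
    return sum(r * s for r, s in zip(ratios, sources))  # sound mixing unit

# Stub usage: a separator returning two copies and a classifier that always
# answers "speech"; real units would do source separation and classification.
out = process_sound(np.zeros(100),
                    separate=lambda x: [x, x],
                    estimate_type=lambda s: "speech")
```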
Abstract:
Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.
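A minimal sketch of the windowed energy-ratio decision, assuming the two virtual directional microphone signals have already been formed; the window size and threshold are illustrative, and the abstract notes the ratio can feed a variety of decision methods:

```python
import numpy as np

def avad(v1, v2, window=256, threshold=2.0, eps=1e-12):
    """v1: virtual mic with strong speech response; v2: virtual mic with a
    similar noise response. Flags speech where the energy ratio is large."""
    n = min(len(v1), len(v2)) // window
    vad = np.zeros(n, dtype=bool)
    for i in range(n):
        s = slice(i * window, (i + 1) * window)
        ratio = np.sum(v1[s] ** 2) / (np.sum(v2[s] ** 2) + eps)
        vad[i] = ratio > threshold
    return vad
```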
Abstract:
A harmonic-structure acoustic signal detection device that does not depend on level fluctuations of the input signal, including: an FFT unit that performs an FFT on the input signal and calculates a power spectrum component for each frame; a harmonic structure extraction unit that retains only the harmonic structure of the power spectrum component; a voiced feature evaluation unit that evaluates the correlation between the harmonic structures of adjacent frames extracted by the harmonic structure extraction unit, thereby judges whether or not a segment is a vowel segment, and extracts the voiced segment; and a speech segment determination unit that determines a speech segment according to the continuity and duration of the output of the voiced feature evaluation unit.
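A minimal sketch of the voiced-feature evaluation, approximating the harmonic structure by spectral peak-picking and declaring a frame voiced when its structure correlates strongly with the previous frame's; both the peak-picking and the correlation threshold are illustrative stand-ins:

```python
import numpy as np

def harmonic_structure(power_spec):
    """Keep only local peaks of the power spectrum (crude extraction unit)."""
    out = np.zeros_like(power_spec)
    peaks = (power_spec[1:-1] > power_spec[:-2]) & (power_spec[1:-1] > power_spec[2:])
    out[1:-1][peaks] = power_spec[1:-1][peaks]
    return out

def voiced_flags(frames, corr_threshold=0.5):
    """frames: (n_frames, n_bins) power spectra; flag frames whose harmonic
    structure correlates with the preceding frame's (vowel-like segments)."""
    structs = np.array([harmonic_structure(f) for f in frames])
    flags = np.zeros(len(frames), dtype=bool)
    for i in range(1, len(frames)):
        a, b = structs[i - 1], structs[i]
        denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
        flags[i] = (a @ b) / denom > corr_threshold
    return flags
```

The speech segment determination unit would then smooth these flags, requiring enough consecutive voiced frames before declaring a speech segment.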
Abstract:
An input audio signal is divided on a block-by-block basis, and a frequency domain conversion is performed on each of the blocks. If it is decided that there are one or more shift points in the voiced (V)/unvoiced (UV) decision data across all bands, the voiced bands of the frequency domain data for a block are searched for the voiced band B_VH with the highest center frequency. The number N_V of voiced bands having a center frequency lower than that of band B_VH is then found, so as to decide whether the proportion of voiced bands is equal to or higher than a predetermined threshold N_th, thereby deciding a single V/UV boundary point. The per-band V/UV decision data can thus be replaced by information on one demarcation across all bands, reducing data volume and bit rate.
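A minimal sketch of collapsing the per-band V/UV decisions into one boundary, loosely following the steps above; the proportion threshold is illustrative:

```python
import numpy as np

def single_vuv_boundary(vuv_flags, proportion_threshold=0.5):
    """vuv_flags: per-band V/UV decisions ordered by center frequency
    (True = voiced). Returns the boundary index below which all bands are
    declared voiced, or 0 if the voiced proportion is too small."""
    voiced = np.flatnonzero(vuv_flags)
    if voiced.size == 0:
        return 0                        # every band unvoiced
    b_vh = voiced[-1]                   # voiced band with highest center freq
    if voiced.size / (b_vh + 1) >= proportion_threshold:
        return b_vh + 1                 # bands [0, b_vh] treated as voiced
    return 0

print(single_vuv_boundary(np.array([True, True, False, True, False])))  # 4
```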