-
Publication No.: US20240331709A1
Publication date: 2024-10-03
Application No.: US18355928
Filing date: 2023-07-20
Applicant: GOOGLE LLC
Inventors: Sze Chie Lim, Shawn Singh, Anjali Wheeler, Jani Huoponen, Jan Skoglund
IPC classification: G10L19/02
CPC classification: G10L19/02
Abstract: A method including receiving first audio data, receiving second audio data, compressing the first audio data as first compressed audio data, compressing the second audio data as second compressed audio data, generating a codec dependent container including a parameter associated with compressing the first audio data, compressing the second audio data, a reference to the first compressed audio data, and a reference to the second compressed audio data, generating a codec agnostic container including at least one parameter representing time-varying data associated with playback of the first audio data and the second audio data, and generating an audio package including the codec dependent container and the codec agnostic container.
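The packaging structure in the abstract above can be pictured roughly as follows. This is only an illustrative sketch, not the claimed format: the class names, the `stream-1`/`stream-2` reference keys, the `gain_db` parameter, and the use of `zlib` as a stand-in for an audio codec are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class CodecDependentContainer:
    # Parameter(s) tied to how the audio was compressed (hypothetical fields).
    codec_params: dict
    # References (here: plain string keys) to the compressed audio data.
    stream_refs: list


@dataclass
class CodecAgnosticContainer:
    # Time-varying playback data as (time_seconds, parameter_name, value).
    timed_params: list = field(default_factory=list)


@dataclass
class AudioPackage:
    codec_dependent: CodecDependentContainer
    codec_agnostic: CodecAgnosticContainer


def build_package(first_audio: bytes, second_audio: bytes, level: int = 6):
    """Compress two audio buffers and wrap the metadata in one package."""
    # zlib is only a placeholder compressor; a real system would use an audio codec.
    store = {
        "stream-1": zlib.compress(first_audio, level),
        "stream-2": zlib.compress(second_audio, level),
    }
    dependent = CodecDependentContainer(
        codec_params={"codec": "zlib-placeholder", "level": level},
        stream_refs=list(store),
    )
    # Made-up playback automation that does not depend on the codec.
    agnostic = CodecAgnosticContainer(
        timed_params=[(0.0, "gain_db", 0.0), (1.5, "gain_db", -3.0)]
    )
    return AudioPackage(dependent, agnostic), store


package, store = build_package(b"\x00\x01" * 64, b"\x02\x03" * 64)
```

Keeping the playback parameters in a separate container means they can be read or edited without knowing which codec produced the referenced streams.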
-
Publication No.: USRE50144E1
Publication date: 2024-09-24
Application No.: US17845503
Filing date: 2022-06-21
Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
IPC classification: G10L19/00, G10L19/02, G10L19/022, G10L25/45, H03H17/02, G10L21/038
CPC classification: G10L19/0204, G10L19/022, G10L25/45, H03H17/0266, G10L21/038
Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels includes an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function including a sequence of window coefficients to obtain windowed samples. The analysis window function includes a first number of window coefficients derived from a larger window function including a sequence of a larger second number of window coefficients, wherein the window coefficients of the window function are derived by an interpolation of window coefficients of the larger window function. The apparatus further includes a calculator for calculating the audio subband values using the windowed samples.
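As a minimal illustration of deriving a shorter analysis window from a larger one by interpolation, the sketch below averages consecutive coefficient pairs, which is linear interpolation at the midpoint of each pair. The sine-shaped larger window, the 2:1 length ratio, and the function names are assumptions chosen for the example; the abstract does not specify them.

```python
import math


def larger_window(n):
    """Hypothetical larger window: a plain sine window with n coefficients."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]


def interpolate_window(big):
    """Derive a half-length window by averaging consecutive coefficient pairs."""
    if len(big) % 2:
        raise ValueError("expected an even number of coefficients")
    return [0.5 * (big[2 * k] + big[2 * k + 1]) for k in range(len(big) // 2)]


def apply_window(frame, window):
    """Multiply each time-domain sample by its window coefficient."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    return [x * w for x, w in zip(frame, window)]


big = larger_window(16)                 # larger second number of coefficients
small = interpolate_window(big)         # smaller first number of coefficients
windowed = apply_window([1.0] * len(small), small)
```

The windowed samples would then feed whatever calculator produces the subband values; that stage is omitted here.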
-
Publication No.: USRE50132E1
Publication date: 2024-09-17
Application No.: US17845465
Filing date: 2022-06-21
Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
IPC classification: G10L19/00, G10L19/02, G10L19/022, G10L25/45, H03H17/02, G10L21/038
CPC classification: G10L19/0204, G10L19/022, G10L25/45, H03H17/0266, G10L21/038
Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels includes an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function including a sequence of window coefficients to obtain windowed samples. The analysis window function includes a first number of window coefficients derived from a larger window function including a sequence of a larger second number of window coefficients, wherein the window coefficients of the window function are derived by an interpolation of window coefficients of the larger window function. The apparatus further includes a calculator for calculating the audio subband values using the windowed samples.
-
Publication No.: US12087314B2
Publication date: 2024-09-10
Application No.: US18103871
Filing date: 2023-01-31
IPC classification: G10L19/038, G10L19/02, G10L19/032, G10L19/06, G10L21/038, G10L19/00
CPC classification: G10L19/038, G10L19/0204, G10L19/032, G10L19/06, G10L21/038, G10L2019/001
Abstract: An encoder for encoding a parametric spectral representation ($f$) of auto-regressive coefficients that partially represent an audio signal. The encoder includes a low-frequency encoder configured to quantize elements of a part of the parametric spectral representation that correspond to a low-frequency part of the audio signal. It also includes a high-frequency encoder configured to encode a high-frequency part ($f_H$) of the parametric spectral representation ($f$) by weighted averaging based on the quantized elements ($\hat{f}_L$) flipped around a quantized mirroring frequency ($\hat{f}_m$), which separates the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook in a closed-loop search procedure. Described are also a corresponding decoder, corresponding encoding/decoding methods and UEs including such an encoder/decoder.
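The flip-and-average idea in the abstract above can be sketched as follows, under simplifying assumptions: the flip is a plain reflection around the mirroring frequency, the weights are fixed, and the closed-loop search picks the codebook grid with the smallest squared error against the target high-frequency part. The function names, weights, and toy codebook are hypothetical, and any rescaling steps the actual encoder may apply are left out.

```python
def flip_around(quantized_low, mirror_freq):
    """Reflect the quantized low-frequency elements around mirror_freq."""
    return [2.0 * mirror_freq - f for f in reversed(quantized_low)]


def weighted_average(flipped, grid, weights):
    """Blend flipped elements with a frequency grid, element by element."""
    if not (len(flipped) == len(grid) == len(weights)):
        raise ValueError("all inputs must have the same length")
    return [(1.0 - w) * a + w * g for a, g, w in zip(flipped, grid, weights)]


def closed_loop_search(target_high, flipped, codebook, weights):
    """Try every grid in the codebook and return the index of the best match."""
    best_index, best_error = -1, float("inf")
    for index, grid in enumerate(codebook):
        candidate = weighted_average(flipped, grid, weights)
        error = sum((t - c) ** 2 for t, c in zip(target_high, candidate))
        if error < best_error:
            best_index, best_error = index, error
    return best_index


quantized_low = [0.10, 0.20, 0.30]   # toy quantized low-frequency elements
mirror_freq = 0.40                   # toy quantized mirroring frequency
flipped = flip_around(quantized_low, mirror_freq)
codebook = [[0.55, 0.70, 0.85], [0.50, 0.60, 0.70]]  # toy grid codebook
weights = [0.5, 0.5, 0.5]
best = closed_loop_search([0.50, 0.60, 0.70], flipped, codebook, weights)
```

Only the chosen grid index would need to be transmitted for the high-frequency part in a scheme like this, since the decoder can repeat the flip from the quantized low-frequency elements.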
-
Publication No.: US20240284124A1
Publication date: 2024-08-22
Application No.: US18648299
Filing date: 2024-04-26
Applicant: GN Hearing A/S
Inventors: Changxue MA, Srdjan PETROVIC
CPC classification: H04R25/43, G10L17/02, G10L19/02, H04R25/407, H04R25/507, H04R2225/39, H04R2225/41, H04R2225/61
Abstract: A hearing device comprises a first microphone for provision of a first microphone input signal, and a second microphone for provision of a second microphone input signal; a voice detector module for processing the first and second microphone input signals, the voice detector module configured to detect own-voice of a user; a processor for provision of an electrical output signal; and a receiver for providing an audio output signal, wherein the voice detector module is configured to determine a direction parameter indicative of a direction of a sound source based on first and/or second microphone input signal; determine whether a direction criterion based on the direction parameter is satisfied; determine a first distance parameter indicative of a distance to the sound source; determine whether a distance criterion based on the first distance parameter is satisfied; and provide a voice detector output.
-
Publication No.: US20240282323A1
Publication date: 2024-08-22
Application No.: US18649738
Filing date: 2024-04-29
IPC classification: G10L19/02, G10L19/008, H04S7/00
CPC classification: G10L19/0212, G10L19/008, G10L19/0204, H04S7/308, H04R2460/03, H04S2400/01, H04S2420/01, H04S2420/03, H04S2420/07
Abstract: A method for representing a second presentation of audio channels or objects as a data stream, the method comprising the steps of: (a) providing a set of base signals, the base signals representing a first presentation of the audio channels or objects; (b) providing a set of transformation parameters, the transformation parameters intended to transform the first presentation into the second presentation; the transformation parameters further being specified for at least two frequency bands and including a set of multi-tap convolution matrix parameters for at least one of the frequency bands.
-
Publication No.: US20240274144A1
Publication date: 2024-08-15
Application No.: US18643717
Filing date: 2024-04-23
Inventors: Yupeng SHI, Wei XIAO, Meng WANG, Yuyong KANG, Qingbo HUANG
CPC classification: G10L19/06, G10L19/0204
Abstract: Embodiments of this application provide an audio coding method and apparatus, an audio decoding method and apparatus, an electronic device, and a storage medium, applied to an on-board scene. The audio decoding method includes obtaining a bitstream of an audio signal; performing label extraction processing on a predicted value of a feature vector of the audio signal associated with the bitstream to obtain a label information vector, a dimension of the label information vector being the same as a dimension of the predicted value of the feature vector; performing signal reconstruction based on the predicted value of the feature vector and the label information vector; and identifying a predicted value of the audio signal obtained by the signal reconstruction as a decoding result of the bitstream.
-
Publication No.: US12062379B2
Publication date: 2024-08-13
Application No.: US18072038
Filing date: 2022-11-30
Inventors: Bingyin Xia, Jiawei Li, Zhe Wang
IPC classification: G10L19/02, G10L21/038, G10L21/0388
CPC classification: G10L19/02
Abstract: An audio coding method includes obtaining a current frame that includes a high-frequency band signal and a low-frequency band signal; performing first coding on the high-frequency band signal and the low-frequency band signal to obtain a first coding parameter; determining a spectrum reservation flag of each frequency bin of the high-frequency band signal, where the spectrum reservation flag indicates whether a first spectrum corresponding to the frequency bin is reserved in a second spectrum corresponding to the frequency bin; and performing second coding on the high-frequency band signal based on the spectrum reservation flag of each frequency bin of the high-frequency band signal to obtain a second coding parameter, where the second coding parameter indicates information about a target tonal component of the high-frequency band signal.
-
Publication No.: US20240265928A1
Publication date: 2024-08-08
Application No.: US18640393
Filing date: 2024-04-19
Inventors: Meng WANG, Wei XIAO, Yuyong KANG, Qingbo HUANG, Yupeng SHI
IPC classification: G10L19/008, G10L19/02, G10L19/032, G10L25/30
CPC classification: G10L19/008, G10L19/0204, G10L19/032, G10L25/30
Abstract: An audio processing method/apparatus including performing multichannel signal decomposition on an audio signal to obtain N subband signals of the audio signal, where the frequency bands of the N subband signals increase sequentially and N is an integer greater than 2; performing signal compression on each subband signal of the N subband signals to obtain a subband signal feature of each subband signal; and performing quantization encoding on the subband signal feature of each subband signal to obtain a bitstream of each subband signal.
-
Publication No.: US12050643B2
Publication date: 2024-07-30
Application No.: US17613068
Filing date: 2020-04-08
Inventors: Kunio Kashino, Shota Ikawa
IPC classification: G06F16/632, G06F16/683, G10L19/02, G10L19/09
CPC classification: G06F16/634, G06F16/683, G10L19/0204, G10L19/09
Abstract: Database generation techniques are provided that can accurately and efficiently generate a database useable in text-based sound signal search. A sound signal database generation apparatus includes: a latent variable generation unit that generates, from a sound signal, a latent variable corresponding to the sound signal using a sound signal encoder; a data generation unit that generates a natural language representation corresponding to the sound signal from the latent variable and a condition concerning an index for a natural language representation using a natural language representation decoder; and a sound signal database generation unit that generates a record including the natural language representation corresponding to the sound signal and the sound signal from the natural language representation corresponding to the sound signal and the sound signal, and generates a sound signal database made up of the record.