-
Publication No.: US20220022000A1
Publication date: 2022-01-20
Application No.: US17292457
Filing date: 2019-11-12
Inventors: Stefan Bruhn, Juan Felix Torres, David S. McGrath, Brian Lee
Abstract: The disclosure herein generally relates to capturing, acoustic pre-processing, encoding, decoding, and rendering of directional audio of an audio scene. In particular, it relates to a device adapted to modify a directional property of a captured directional audio in response to spatial data of a microphone system capturing the directional audio. The disclosure further relates to a rendering device configured to modify a directional property of a received directional audio in response to received spatial data.
-
Publication No.: US12014745B2
Publication date: 2024-06-18
Application No.: US17882900
Filing date: 2022-08-08
IPC classification: G10L19/008, H04S3/00
CPC classification: G10L19/008, H04S3/008, H04S2400/01
Abstract: The disclosed embodiments enable converting audio signals captured in various formats by various capture devices into a limited number of formats that can be processed by an audio codec (e.g., an Immersive Voice and Audio Services (IVAS) codec). In an embodiment, a simplification unit of the audio device receives an audio signal captured by one or more audio capture devices coupled to the audio device. The simplification unit determines whether the audio signal is in a format that is supported by an encoding unit of the audio device. Based on that determination, the simplification unit converts the audio signal into a format that is supported by the encoding unit. In an embodiment, if the simplification unit determines that the audio signal is in a spatial format, the simplification unit can convert the audio signal into a spatial "mezzanine" format supported by the encoding unit.
-
Publication No.: US20240153512A1
Publication date: 2024-05-09
Application No.: US18548817
Filing date: 2022-03-08
Inventors: Panji Setiawan, Rishabh Tyagi, Stefan Bruhn
IPC classification: G10L19/008, G10L19/005
CPC classification: G10L19/008, G10L19/005, G10L19/002
Abstract: A method for performing gain control on audio signals is provided. In some implementations, the method involves determining downmixed signals associated with one or more downmix channels associated with a current frame of an audio signal to be encoded. In some implementations, the method involves determining whether an overload condition exists for an encoder. In some implementations, the method involves determining a gain parameter. In some implementations, the method involves determining at least one gain transition function based on the gain parameter and a gain parameter associated with a preceding frame of the audio signal. In some implementations, the method involves applying the at least one gain transition function to one or more of the downmixed signals. In some implementations, the method involves encoding the downmixed signals in connection with information indicative of gain control applied to the current frame.
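The frame-to-frame gain transition described in this abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the overload test (peak above `limit`), the power-of-two gain parameter, the raised-cosine shape of the transition, and the function names `gain_parameter`, `transition`, and `apply_gain_control`.

```python
import math

def gain_parameter(frame, limit=1.0):
    """Smallest integer e >= 0 with peak * 2**-e <= limit (hypothetical overload rule)."""
    peak = max(abs(s) for s in frame)
    e = 0
    while peak * 2.0 ** -e > limit:
        e += 1
    return e

def transition(prev_e, cur_e, n):
    """Per-sample gains moving from 2**-prev_e to 2**-cur_e over n samples.

    A raised-cosine fade is assumed; the final sample reaches the new gain.
    """
    g0, g1 = 2.0 ** -prev_e, 2.0 ** -cur_e
    return [g0 + (g1 - g0) * 0.5 * (1.0 - math.cos(math.pi * (i + 1) / n))
            for i in range(n)]

def apply_gain_control(frame, prev_e, limit=1.0):
    """Return the gain-controlled frame and the gain parameter to signal for it."""
    cur_e = gain_parameter(frame, limit)
    gains = transition(prev_e, cur_e, len(frame))
    return [s * g for s, g in zip(frame, gains)], cur_e

scaled, e = apply_gain_control([0.5, 3.0, -2.0, 1.0], prev_e=0)
```

In this sketch the returned `e` stands in for the "information indicative of gain control" that would be encoded alongside the frame. Samples early in the fade can still exceed `limit`; a practical design would need to handle that, which this sketch does not attempt.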
-
Publication No.: US12020718B2
Publication date: 2024-06-25
Application No.: US17251940
Filing date: 2019-07-02
Inventors: Stefan Bruhn, Juan Felix Torres
IPC classification: G10L19/16, G10L19/008, G10L19/18, H04S3/00
CPC classification: G10L19/167, G10L19/008, G10L19/18
Abstract: The present document describes a method (500) for generating a bitstream (101), wherein the bitstream (101) comprises a sequence of superframes (400) for a sequence of frames of an immersive audio signal (111). The method (500) comprises, repeatedly for the sequence of superframes (400), inserting (501) coded audio data (206) for one or more frames of one or more downmix channel signals (203) derived from the immersive audio signal (111), into data fields (411, 421, 412, 422) of a superframe (400); and inserting (502) metadata (202, 205) for reconstructing one or more frames of the immersive audio signal (111) from the coded audio data (206), into a metadata field (403) of the superframe (400).
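The superframe idea in this abstract, one metadata field plus data fields holding coded audio for several frames and channels, can be sketched as a simple byte layout. The layout below (length-prefixed fields, big-endian headers) and the names `pack_superframe` and `unpack_superframe` are hypothetical and chosen only for illustration; the patent's actual field structure is not given in the abstract.

```python
import struct

def pack_superframe(coded_frames, metadata):
    """Pack frames into one superframe.

    coded_frames: list of frames, each a list of per-channel bytes payloads.
    metadata: bytes needed to reconstruct the frames (opaque in this sketch).
    """
    out = bytearray()
    # Header: metadata length (2 bytes) and number of frames (1 byte).
    out += struct.pack(">HB", len(metadata), len(coded_frames))
    out += metadata  # metadata field
    for frame in coded_frames:
        out += struct.pack(">B", len(frame))  # channels in this frame
        for payload in frame:
            # One length-prefixed data field per (frame, channel).
            out += struct.pack(">H", len(payload)) + payload
    return bytes(out)

def unpack_superframe(buf):
    """Inverse of pack_superframe; returns (coded_frames, metadata)."""
    meta_len, n_frames = struct.unpack_from(">HB", buf, 0)
    pos = 3
    metadata = buf[pos:pos + meta_len]
    pos += meta_len
    frames = []
    for _ in range(n_frames):
        (n_channels,) = struct.unpack_from(">B", buf, pos)
        pos += 1
        channels = []
        for _ in range(n_channels):
            (size,) = struct.unpack_from(">H", buf, pos)
            pos += 2
            channels.append(buf[pos:pos + size])
            pos += size
        frames.append(channels)
    return frames, metadata

superframe = pack_superframe([[b"\x01\x02", b"\x03"], [b"\x04", b"\x05\x06"]], b"meta")
```

Grouping several frames under one metadata field, as here, is one way a container could amortize metadata overhead across frames; whether the patented format does so for that reason is not stated in the abstract.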
-
Publication No.: US11410666B2
Publication date: 2022-08-09
Application No.: US16973030
Filing date: 2019-10-07
IPC classification: G10L19/008, H04S3/00
Abstract: The disclosed embodiments enable converting audio signals captured in various formats by various capture devices into a limited number of formats that can be processed by an audio codec (e.g., an Immersive Voice and Audio Services (IVAS) codec). In an embodiment, a simplification unit of the audio device receives an audio signal captured by one or more audio capture devices coupled to the audio device. The simplification unit determines whether the audio signal is in a format that is supported by an encoding unit of the audio device. Based on that determination, the simplification unit converts the audio signal into a format that is supported by the encoding unit. In an embodiment, if the simplification unit determines that the audio signal is in a spatial format, the simplification unit can convert the audio signal into a spatial "mezzanine" format supported by the encoding unit.
-
Publication No.: US11765536B2
Publication date: 2023-09-19
Application No.: US17293463
Filing date: 2019-11-12
Inventors: Stefan Bruhn
IPC classification: H04S3/02
CPC classification: H04S3/02, H04S2400/03
Abstract: Encoding and decoding methods are provided for representing spatial audio that is a combination of directional sound and diffuse sound. An exemplary encoding method includes, inter alia, creating a single- or multi-channel downmix audio signal by downmixing input audio signals from a plurality of microphones in an audio capture unit capturing the spatial audio; determining first metadata parameters associated with the downmix audio signal, wherein the first metadata parameters are indicative of one or more of: a relative time delay value, a gain value, and a phase value associated with each input audio signal; and combining the created downmix audio signal and the first metadata parameters into a representation of the spatial audio.
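As a rough illustration of the encoding step in this abstract, the Python sketch below forms a mono downmix from several microphone signals and keeps a per-microphone delay and gain as metadata. It assumes the delays (in whole samples) and gains are already known, averages the aligned signals, and ignores phase; these choices and the name `encode_spatial` are assumptions for the example, not the patented method.

```python
def encode_spatial(mic_signals, delays, gains):
    """Downmix microphone signals and attach per-microphone metadata.

    mic_signals: equal-length lists of samples, one per microphone.
    delays: relative delay of each microphone in samples (assumed known).
    gains: relative gain of each microphone (assumed known, non-zero).
    """
    n = len(mic_signals[0])
    mix = [0.0] * n
    for signal, delay, gain in zip(mic_signals, delays, gains):
        for i in range(n):
            j = i + delay  # undo the relative delay
            if 0 <= j < n:
                mix[i] += signal[j] / gain  # undo the relative gain
    count = len(mic_signals)
    mix = [m / count for m in mix]
    metadata = [{"delay": d, "gain": g} for d, g in zip(delays, gains)]
    # The representation combines the downmix with the metadata parameters.
    return {"downmix": mix, "metadata": metadata}

rep = encode_spatial([[1.0, 2.0, 3.0, 0.0], [0.0, 2.0, 4.0, 6.0]],
                     delays=[0, 1], gains=[1.0, 2.0])
```

Here the second microphone carries the same source one sample later and at twice the level, so alignment and normalization make both contributions coincide in the downmix.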
-
Publication No.: US11699451B2
Publication date: 2023-07-11
Application No.: US17251913
Filing date: 2019-07-02
IPC classification: G10L19/16, G10L19/008, G10L19/18, H04S3/00
CPC classification: G10L19/167, G10L19/008, G10L19/18
Abstract: The present document describes a method (700) for encoding a multi-channel input signal (201). The method (700) comprises determining (701) a plurality of downmix channel signals (203) from the multi-channel input signal (201) and performing (702) energy compaction of the plurality of downmix channel signals (203) to provide a plurality of compacted channel signals (404). Furthermore, the method (700) comprises determining (703) joint coding metadata (205) based on the plurality of compacted channel signals (404) and based on the multi-channel input signal (201), wherein the joint coding metadata (205) is such that it allows upmixing of the plurality of compacted channel signals (404) to an approximation of the multi-channel input signal (201). In addition, the method (700) comprises encoding (704) the plurality of compacted channel signals (404) and the joint coding metadata (205).
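Energy compaction, as mentioned in this abstract, can be illustrated for two channels with a rotation that concentrates as much energy as possible in the first output channel. The sketch below uses a two-channel Karhunen-Loève (principal-axis) rotation as a stand-in technique; the abstract does not say which transform the patent uses, and the names `compact_pair` and `upmix_pair` are hypothetical. The rotation angle plays the role of side information that lets a decoder undo the rotation.

```python
import math

def compact_pair(x, y):
    """Rotate two channels so the first output carries maximal energy.

    Returns (primary, residual, angle); angle is the side information
    needed to invert the rotation.
    """
    cxx = sum(a * a for a in x)
    cyy = sum(b * b for b in y)
    cxy = sum(a * b for a, b in zip(x, y))
    # Principal-axis angle of the 2x2 energy/cross-energy matrix.
    angle = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(angle), math.sin(angle)
    primary = [c * a + s * b for a, b in zip(x, y)]
    residual = [-s * a + c * b for a, b in zip(x, y)]
    return primary, residual, angle

def upmix_pair(primary, residual, angle):
    """Invert compact_pair using the transmitted angle."""
    c, s = math.cos(angle), math.sin(angle)
    x = [c * p - s * r for p, r in zip(primary, residual)]
    y = [s * p + c * r for p, r in zip(primary, residual)]
    return x, y

primary, residual, angle = compact_pair([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

For two identical channels, as in the usage line, nearly all energy lands in `primary` and `residual` is close to zero, which is the situation in which compaction saves the most bits.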