CLUSTERING AUDIO OBJECTS
    41.
    Invention publication

    Publication number: US20240187807A1

    Publication date: 2024-06-06

    Application number: US18547006

    Filing date: 2022-02-15

    Inventor: Ziyu Yang, Lie Lu

    CPC classification number: H04S7/30 H04S2400/11

    Abstract: A method for clustering audio objects may involve identifying a plurality of audio objects, wherein each audio object of the plurality of audio objects is associated with respective metadata that indicates respective spatial position information and respective rendering metadata. The method may involve assigning audio objects of the plurality of audio objects to categories of rendering metadata of a plurality of categories of rendering metadata, wherein at least one category of rendering metadata comprises a plurality of types of rendering metadata to be preserved. The method may involve determining an allocation of a plurality of audio object clusters to each category of rendering metadata. The method may involve rendering audio objects of the plurality of audio objects to an allocated plurality of audio object clusters based on the metadata that indicates spatial position information and based on the assignments of the audio objects to the categories of rendering metadata.

    Adaptive loudness normalization for audio object clustering

    Publication number: US11930347B2

    Publication date: 2024-03-12

    Application number: US17427665

    Filing date: 2020-02-12

    Inventor: Lianwu Chen, Lie Lu

    CPC classification number: H04S7/30 H04S2400/13

    Abstract: A method of processing audio content including a plurality of audio elements comprises: clustering the plurality of audio elements into a plurality of clusters of audio elements; and for a cluster among the plurality of clusters: for each audio element in the cluster, determining a measure of energy that the audio element contributes to the cluster; for at least one audio element in the cluster, determining a compensation gain based at least in part on the measures of energy for the audio elements in the cluster; and applying the compensation gain to the at least one audio element in the cluster.
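
    The energy-based compensation described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the (gain, samples) input layout, and the specific rule (scale the cluster so its energy equals the sum of the per-element energies) are all assumptions made for the example.

```python
import math

def energy(samples):
    """Sum of squared samples."""
    return sum(s * s for s in samples)

def compensate_cluster(elements):
    """elements: list of (gain, samples) pairs rendered into one cluster.

    Returns (compensation_gain, compensated_cluster_signal).
    Assumed rule: match the cluster energy to the sum of element energies.
    """
    # Contribution of each audio element to the cluster signal.
    contributions = [[g * s for s in samples] for g, samples in elements]
    # Measure of energy that each element contributes.
    element_energies = [energy(c) for c in contributions]
    # Plain sum of the contributions; correlated elements inflate its energy.
    mix = [sum(frame) for frame in zip(*contributions)]
    mix_energy = energy(mix)
    if mix_energy == 0.0:
        return 1.0, mix  # nothing to compensate
    gain = math.sqrt(sum(element_energies) / mix_energy)
    # Apply the same compensation gain to every element, then sum.
    compensated = [sum(gain * s for s in frame) for frame in zip(*contributions)]
    return gain, compensated
```

    With two identical elements the plain sum doubles the amplitude, so the sketch returns a gain below 1; with uncorrelated elements the gain stays at 1.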

    METHOD AND APPARATUS FOR AUDIO PROCESSING USING A CONVOLUTIONAL NEURAL NETWORK ARCHITECTURE

    Publication number: US20230401429A1

    Publication date: 2023-12-14

    Application number: US18032322

    Filing date: 2021-10-19

    CPC classification number: G06N3/0464 G10L21/00

    Abstract: Systems, methods, and computer program products for audio processing based on convolutional neural network (CNN) are described. A first CNN architecture may comprise a contracting path of a U-net, a multi-scale CNN, and an expansive path of a U-net. The contracting path may comprise a first encoding layer and may be configured to generate an output representation of the contracting path. The multi-scale CNN may be configured to generate, based on the output representation of the contracting path, an intermediate representation. The multi-scale CNN may comprise at least two parallel convolution paths. The expansive path may comprise a first decoding layer and may be configured to generate a final representation based on the intermediate representation generated by the multi-scale CNN. Within a second CNN architecture, the first encoding layer may comprise a first multi-scale CNN with at least two parallel convolution paths, and the first decoding layer may comprise a second multi-scale CNN with at least two parallel convolution paths.

    Decomposing audio signals
    45.
    Granted invention patent

    Publication number: US10885923B2

    Publication date: 2021-01-05

    Application number: US16869477

    Filing date: 2020-05-07

    Inventor: Jun Wang, Lie Lu

    Abstract: Example embodiments disclosed herein relate to signal processing. A method for decomposing a plurality of audio signals from at least two different channels is disclosed. The method comprises obtaining a set of components that are weakly correlated, the set of components generated based on the plurality of audio signals. The method comprises extracting a feature from the set of components, and determining a set of gains associated with the set of components at least in part based on the extracted feature, each of the gains indicating a proportion of a diffuse part in the associated component. The method further comprises decomposing the plurality of audio signals by applying the set of gains to the set of components. Corresponding system and computer program product are also disclosed.
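
    The final step of this abstract, applying per-component gains to split diffuse from direct content, might look like the sketch below. It assumes the weakly correlated components and their gains are already available; the function name is hypothetical, and treating the direct part as the complement (1 - gain) of the diffuse part is an assumption rather than something the abstract states.

```python
def split_components(components, diffuse_gains):
    """Split each component into (diffuse, direct) parts.

    components: list of sample lists (assumed already weakly correlated).
    diffuse_gains: one gain in [0, 1] per component; how the gain is derived
    from the extracted feature is not modeled here.
    """
    if len(components) != len(diffuse_gains):
        raise ValueError("need exactly one gain per component")
    diffuse, direct = [], []
    for samples, g in zip(components, diffuse_gains):
        if not 0.0 <= g <= 1.0:
            raise ValueError("gain must lie in [0, 1]")
        diffuse.append([g * s for s in samples])
        # Assumption: the direct part is the complement of the diffuse part.
        direct.append([(1.0 - g) * s for s in samples])
    return diffuse, direct
```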

    Audio source separation
    46.
    Granted invention patent

    Publication number: US10818302B2

    Publication date: 2020-10-27

    Application number: US16561836

    Filing date: 2019-09-05

    Abstract: The present document describes a method for extracting J audio sources from I audio channels. The method includes updating a Wiener filter matrix based on a mixing matrix from a source matrix and based on a power matrix of the J audio sources. Furthermore, the method includes updating a cross-covariance matrix of the I audio channels and of the J audio sources and an auto-covariance matrix of the J audio sources, based on the updated Wiener filter matrix and based on an auto-covariance matrix of the I audio channels. In addition, the method includes updating the mixing matrix and the power matrix based on the updated cross-covariance matrix of the I audio channels and of the J audio sources, and/or based on the updated auto-covariance matrix of the J audio sources.
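
    One common way to write an iteration of this shape is given below. The symbols are assumptions chosen for illustration: A is the I x J mixing matrix, \Sigma_S the diagonal power matrix of the sources, \Omega the J x I Wiener filter matrix, R_{XX} the auto-covariance of the channels, and \Sigma_B a noise covariance term that the abstract does not mention. The exact update rules used in the patent may differ.

```latex
\begin{aligned}
\Omega &= \Sigma_S A^{H}\left(A\,\Sigma_S A^{H} + \Sigma_B\right)^{-1} \\
R_{XS} &= R_{XX}\,\Omega^{H}, \qquad R_{SS} = \Omega\,R_{XX}\,\Omega^{H} \\
A &= R_{XS}\,R_{SS}^{-1}, \qquad \Sigma_S = \operatorname{diag}\!\left(R_{SS}\right)
\end{aligned}
```

    The three lines mirror the three updates in the abstract: the filter from the mixing and power matrices, the covariances from the filter and R_{XX}, and the mixing and power matrices from the covariances.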

    Upmixing of audio signals
    47.
    Granted invention patent

    Publication number: US10362426B2

    Publication date: 2019-07-23

    Application number: US15538892

    Filing date: 2016-02-09

    Abstract: Example embodiments disclosed herein relate to upmixing of audio signals. A method of upmixing an audio signal is described. The method includes decomposing the audio signal into a diffuse signal and a direct signal; generating an audio bed at least in part based on the diffuse signal, the audio bed including a height channel; extracting an audio object from the direct signal; estimating metadata of the audio object, the metadata including height information of the audio object; and rendering the audio bed and the audio object as an upmixed audio signal, wherein the audio bed is rendered to a predefined position and the audio object is rendered according to the metadata. Corresponding system and computer program product are described as well.

    Processing object-based audio signals

    Publication number: US10277997B2

    Publication date: 2019-04-30

    Application number: US15749750

    Filing date: 2016-08-04

    Abstract: Example embodiments disclosed herein relate to audio signal processing. The audio signal has multiple audio objects. A method of processing an audio signal is disclosed. The method includes obtaining an object position for each of the audio objects; and determining cluster positions for grouping the audio objects into clusters based on the object positions, a plurality of object-to-cluster gains, and a set of metrics. The metrics indicate a quality of the cluster positions and a quality of the object-to-cluster gains, each of the cluster positions is a centroid of a respective one of the clusters, and one of the object-to-cluster gains defines a ratio of the respective audio object in one of the clusters. The method also includes determining the object-to-cluster gains based on the object positions, the cluster positions and the set of metrics; and generating a cluster signal based on the determined cluster positions and object-to-cluster gains. Corresponding system and computer program product are also disclosed.
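
    The alternation between object-to-cluster gains and cluster centroids can be illustrated with the sketch below. It is not the patented procedure: the quality metrics mentioned in the abstract are omitted, and the inverse-distance weighting, the normalization of each object's gains to sum to 1, and all function names are assumptions made for the example.

```python
import math

def object_to_cluster_gains(objects, clusters, eps=1e-9):
    """Return gains[i][j]: share of object i assigned to cluster j.

    Assumption: gains fall off with inverse distance and each row sums to 1.
    """
    gains = []
    for obj in objects:
        weights = [1.0 / (math.dist(obj, c) + eps) for c in clusters]
        total = sum(weights)
        gains.append([w / total for w in weights])
    return gains

def update_centroids(objects, gains, n_clusters):
    """Recompute each cluster position as the gain-weighted centroid."""
    dims = len(objects[0])
    centroids = []
    for j in range(n_clusters):
        total = sum(g[j] for g in gains)
        centroids.append(tuple(
            sum(g[j] * obj[d] for g, obj in zip(gains, objects)) / total
            for d in range(dims)
        ))
    return centroids

def cluster_signals(signals, gains, n_clusters):
    """Mix object signals into cluster signals using the gains."""
    n_samples = len(signals[0])
    return [
        [sum(gains[i][j] * signals[i][t] for i in range(len(signals)))
         for t in range(n_samples)]
        for j in range(n_clusters)
    ]
```

    Calling object_to_cluster_gains and update_centroids in turn for a few rounds gives a soft k-means-style loop; a fuller version would stop once the chosen metrics no longer improve.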

    Projection-based audio object extraction from audio content

    Publication number: US10275685B2

    Publication date: 2019-04-30

    Application number: US15538306

    Filing date: 2015-12-18

    Abstract: A method is disclosed for audio object extraction from an audio content which includes identifying a first set of projection spaces including a first subset for a first channel and a second subset for a second channel of the plurality of channels. The method may further include determining a first set of correlations between the first and second channels, each of the first set of correlations corresponding to one of the first subset of projection spaces and one of the second subset of projection spaces. Still further, the method may include extracting an audio object from an audio signal of the first channel at least in part based on a first correlation among the first set of correlations and the projection space from the first subset corresponding to the first correlation, the first correlation being greater than a first predefined threshold. Corresponding system and computer program products are also disclosed.

    Processing Object-Based Audio Signals
    50.
    Invention application

    Publication number: US20180227691A1

    Publication date: 2018-08-09

    Application number: US15749750

    Filing date: 2016-08-04

    CPC classification number: H04S3/008 G10L19/008 H04R3/12 H04S2400/11

    Abstract: Example embodiments disclosed herein relate to audio signal processing. The audio signal has multiple audio objects. A method of processing an audio signal is disclosed. The method includes obtaining an object position for each of the audio objects; and determining cluster positions for grouping the audio objects into clusters based on the object positions, a plurality of object-to-cluster gains, and a set of metrics. The metrics indicate a quality of the cluster positions and a quality of the object-to-cluster gains, each of the cluster positions is a centroid of a respective one of the clusters, and one of the object-to-cluster gains defines a ratio of the respective audio object in one of the clusters. The method also includes determining the object-to-cluster gains based on the object positions, the cluster positions and the set of metrics; and generating a cluster signal based on the determined cluster positions and object-to-cluster gains. Corresponding system and computer program product are also disclosed.
