NEURAL NETWORK LAYER FOR NON-LINEAR NORMALIZATION

    Publication No.: US20230368007A1

    Publication Date: 2023-11-16

    Application No.: US18295791

    Filing Date: 2023-04-04

    Applicant: Robert Bosch GmbH

    IPC Classification: G06N3/0499

    CPC Classification: G06N3/0499

    Abstract: A computer-implemented machine learning system is configured to provide an output signal based on an input signal by forwarding the input signal through a plurality of layers. At least one of the layers is configured to receive a layer input, which is based on the input signal, and to provide a layer output from which the output signal is determined. That layer determines the layer output by means of a non-linear normalization of the layer input.
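    The abstract does not say which non-linear normalization the layer applies. As a minimal sketch of one possible reading, the Python below maps each value of a layer input to the standard-normal quantile of its empirical rank. The function name `rank_gaussianize`, the mid-rank percentile formula, and the choice of a standard normal target are all illustrative assumptions, not details taken from the application.

```python
from statistics import NormalDist


def rank_gaussianize(values):
    """Hypothetical non-linear normalization: replace each value by the
    standard-normal quantile of its empirical rank (assumption, not the
    method claimed in the application)."""
    n = len(values)
    # Indices of the inputs in ascending order of value (ties keep input order).
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank percentile lies strictly inside (0, 1), so the quantile is finite.
        p = (rank + 0.5) / n
        out[idx] = std_normal.inv_cdf(p)
    return out


# An outlier (100.0) ends up no further from zero than the smallest value.
normalized = rank_gaussianize([1.0, 100.0, 2.0])
```

    Unlike an affine rescaling by mean and variance, this mapping depends only on the ordering of the values, so a single extreme input cannot stretch the output range.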

TIME-VARYING AND NONLINEAR AUDIO PROCESSING USING DEEP NEURAL NETWORKS

    Publication No.: US20230197043A1

    Publication Date: 2023-06-22

    Application No.: US17924701

    Filing Date: 2020-05-12

    Abstract: A computer-implemented method of processing audio data, the method comprising: receiving input audio data (x) comprising a time-series of amplitude values; transforming the input audio data (x) into an input frequency band decomposition (X1) of the input audio data (x); transforming the input frequency band decomposition (X1) into a first latent representation (Z); processing the first latent representation (Z) by a first deep neural network to obtain a second latent representation (Ẑ, Ẑ1); transforming the second latent representation (Ẑ, Ẑ1) to obtain a discrete approximation (X̂3); element-wise multiplying the discrete approximation (X̂3) and a residual feature map (R, X̂5) to obtain a modified feature map, wherein the residual feature map (R, X̂5) is derived from the input frequency band decomposition (X1); processing a pre-shaped frequency band decomposition by a waveshaping unit to obtain a waveshaped frequency band decomposition (X̂1, X̂1.2), wherein the pre-shaped frequency band decomposition is derived from the input frequency band decomposition (X1), and wherein the waveshaping unit comprises a second deep neural network; summing the waveshaped frequency band decomposition (X̂1, X̂1.2) and a modified frequency band decomposition (X̂2, X̂1.1) to obtain a summation output (X̂0), wherein the modified frequency band decomposition (X̂2, X̂1.1) is derived from the modified feature map; and transforming the summation output (X̂0) to obtain target audio data (ŷ).
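    The data flow recited in this abstract can be easier to follow as code. The Python below is a wiring sketch only: every stage is passed in as a callable, and the toy stand-ins at the bottom are hypothetical placeholders, not the networks or transforms of the application. Symbols carrying a circumflex in the abstract are written with a `_hat` suffix. Wherever the abstract says one quantity is "derived from" another without saying how, the sketch assumes the identity, and each such assumption is marked in a comment.

```python
import math


def elementwise(op, a, b):
    """Apply a binary op to two equally shaped band-by-sample nested lists."""
    return [[op(u, v) for u, v in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def process_audio(x, analysis, encoder, latent_dnn, decoder, waveshaper, synthesis):
    X1 = analysis(x)            # input frequency band decomposition
    Z = encoder(X1)             # first latent representation
    Z_hat = latent_dnn(Z)       # second latent representation (first DNN)
    X3_hat = decoder(Z_hat)     # discrete approximation
    R = X1                      # ASSUMPTION: residual feature map is X1 itself
    modified_map = elementwise(lambda u, v: u * v, X3_hat, R)
    X2_hat = modified_map       # ASSUMPTION: identity derivation
    X1_hat = waveshaper(X1)     # ASSUMPTION: pre-shaped decomposition is X1 itself
    X0_hat = elementwise(lambda u, v: u + v, X1_hat, X2_hat)  # summation output
    return synthesis(X0_hat)    # target audio data


# Toy stand-ins (hypothetical): a single band, identity networks, tanh waveshaper.
identity = lambda t: t
y_hat = process_audio(
    x=[1.0, 2.0],
    analysis=lambda x: [list(x)],
    encoder=identity,
    latent_dnn=identity,
    decoder=identity,
    waveshaper=lambda X: [[math.tanh(v) for v in band] for band in X],
    synthesis=lambda X: [sum(col) for col in zip(*X)],
)
```

    With these stand-ins each output sample equals the squared input sample plus its tanh, which makes the two parallel paths (the multiplicative path through the latent representation and the additive waveshaping path) visible in the result.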