Noise floor estimation and noise reduction

    Publication No.: US12033649B2

    Publication Date: 2024-07-09

    Application No.: US17793539

    Application Date: 2021-01-18

    IPC Class: G10L21/02

    CPC Class: G10L21/02

    Abstract: Embodiments are disclosed for noise floor estimation and noise reduction. In an embodiment, a method comprises: obtaining an audio signal; dividing the audio signal into a plurality of buffers; determining time-frequency samples for each buffer of the audio signal; for each buffer and for each frequency, determining a median (or mean) and a measure of an amount of variation of energy based on the samples in the buffer and samples in neighboring buffers that together span a specified time range of the audio signal; combining the median (or mean) and the measure of the amount of variation of energy into a cost function; for each frequency: determining a signal energy of a particular buffer of the audio signal that corresponds to a minimum value of the cost function; selecting the signal energy as the estimated noise floor of the audio signal; and reducing, using the estimated noise floor, noise in the audio signal.
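The estimation steps in the abstract can be sketched in code. This is a hypothetical illustration, not the patented implementation: the buffer length, the number of neighboring buffers, and the specific cost function (median plus a weighted standard deviation, with weight `alpha`) are all assumptions introduced here for clarity.

```python
import numpy as np

def estimate_noise_floor(audio, buffer_len=1024, neighbors=4, alpha=1.0):
    """Return a per-frequency noise-floor estimate for `audio`."""
    # Divide the signal into buffers and take time-frequency samples (FFT).
    n_buffers = len(audio) // buffer_len
    frames = audio[: n_buffers * buffer_len].reshape(n_buffers, buffer_len)
    energy = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (buffers, freqs)

    n_freqs = energy.shape[1]
    noise_floor = np.empty(n_freqs)
    for f in range(n_freqs):
        costs = np.empty(n_buffers)
        for b in range(n_buffers):
            # Samples in this buffer and its neighbors span the time range.
            lo, hi = max(0, b - neighbors), min(n_buffers, b + neighbors + 1)
            window = energy[lo:hi, f]
            # Combine the median and a measure of variation into a cost.
            costs[b] = np.median(window) + alpha * np.std(window)
        # The buffer minimizing the cost supplies the noise-floor energy.
        noise_floor[f] = energy[np.argmin(costs), f]
    return noise_floor
```

The resulting per-frequency floor could then drive a noise-reduction stage such as spectral subtraction or a Wiener-style gain, per the final step of the claim.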

    AUDIO QUALITY CONVERSION DEVICE AND CONTROL METHOD THEREFOR

    Publication No.: US20240212699A1

    Publication Date: 2024-06-27

    Application No.: US18568678

    Application Date: 2022-06-09

    Applicant: COCHL.INC.

    IPC Class: G10L21/02 G10L25/30

    CPC Class: G10L21/02 G10L25/30

    Abstract: An audio quality conversion device according to the present invention includes: a control unit having, mounted therein, an artificial neural network that learns from a plurality of pieces of audio data recorded, for a predetermined audio event, in differing recording environments, together with environmental data describing the recording environment of each piece of audio data; and an audio input unit that receives outside sounds to generate audio recording data, wherein the control unit converts the audio recording data generated by the audio input unit on the basis of a learning result of the artificial neural network.

    RELEVANCE BASED SOURCE SELECTION FOR FAR-FIELD VOICE SYSTEMS

    Publication No.: US20240194189A1

    Publication Date: 2024-06-13

    Application No.: US18077180

    Application Date: 2022-12-07

    IPC Class: G10L15/08 G10L15/02 G10L21/02

    CPC Class: G10L15/08 G10L15/02 G10L21/02

    Abstract: An electronic device includes a far-field voice (FFV) processor including a source selection module. The source selection module receives a set of audio signals and determines, for each audio stream, whether the audio stream is relevant to an application. The source selection module receives several separate probability computations, with each probability computation providing a probability of the presence of a particular characteristic. Additionally, the source selection module receives one or more applications as well as relevance information (e.g., one or more relevant characteristics) associated with the one or more applications. The source selection module can use the respective probabilities to determine whether one or more characteristics are present in an audio signal, and compare the characteristic(s) to the relevance information for the application. Using this information, the source selection module can determine, for each audio signal, to which respective application the audio stream is relevant.
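The selection logic described above can be sketched as follows. This is a hypothetical illustration of the idea, not the patented design: the dict-based data layout, the characteristic names, and the 0.5 presence threshold are assumptions made here for the example.

```python
def select_relevant_streams(stream_probs, app_relevance, threshold=0.5):
    """Map each application to the audio streams relevant to it.

    stream_probs:  {stream_id: {characteristic: probability}}
    app_relevance: {app_name: set of relevant characteristics}
    """
    selection = {app: [] for app in app_relevance}
    for stream_id, probs in stream_probs.items():
        # Use the per-characteristic probabilities to decide which
        # characteristics are present in this audio stream.
        present = {c for c, p in probs.items() if p >= threshold}
        for app, relevant in app_relevance.items():
            # A stream is routed to an application when a present
            # characteristic matches that app's relevance information.
            if present & relevant:
                selection[app].append(stream_id)
    return selection
```

For example, a stream with a high speech probability would be routed to a voice-assistant application whose relevance information lists "speech", while a music-dominated stream would not.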

    Trained generative model speech coding

    Publication No.: US11978464B2

    Publication Date: 2024-05-07

    Application No.: US17757122

    Application Date: 2021-01-22

    Applicant: GOOGLE LLC

    Abstract: A method includes receiving sampled audio data corresponding to utterances and training a machine learning (ML) model, using the sampled audio data, to generate a high-fidelity audio stream from a low-bitrate input bitstream. The training of the ML model includes de-emphasizing the influence of low-probability distortion events in the sampled audio data on the trained ML model, where the de-emphasizing of the distortion events is achieved by the inclusion of a term in an objective function of the ML model, which term encourages low-variance predictive distributions of a next sample in the sampled audio data, based on previous samples of the audio data.
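One way to read the objective described above is as a next-sample likelihood loss plus a variance-penalizing term. The sketch below is an assumption-laden illustration, not the patent's formula: the Gaussian predictive distribution, the specific penalty form, and the weight `beta` are all introduced here for the example.

```python
import numpy as np

def training_objective(next_samples, pred_means, pred_log_vars, beta=0.1):
    """Per-batch loss for next-sample prediction (hypothetical form)."""
    var = np.exp(pred_log_vars)
    # Gaussian negative log-likelihood of the true next sample under the
    # model's predictive distribution (conditioned on previous samples).
    nll = 0.5 * (np.log(2 * np.pi) + pred_log_vars
                 + (next_samples - pred_means) ** 2 / var)
    # Extra term encouraging low-variance predictive distributions, so
    # rare high-error (distortion) events cannot dominate the training.
    variance_penalty = beta * var
    return float(np.mean(nll + variance_penalty))
```

With such a term, the model gains more from tightening its predictions on typical samples than from inflating variance to cover rare distortion events, which de-emphasizes those events during training.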