GENERATING AUDIO REPRESENTATIONS USING MACHINE LEARNING MODEL

    Publication No.: US20250140242A1

    Publication Date: 2025-05-01

    Application No.: US18385749

    Application Date: 2023-10-31

    Applicant: Lemon Inc.

    Abstract: The present disclosure describes techniques for generating audio representations using a machine learning model. A machine learning model is pre-trained using unlabeled audio data. The pre-training enables the machine learning model to recognize audio patterns and generate initial audio representations. The machine learning model is refined by a task-specific fine-tuning process using labeled data. The task-specific fine-tuning process incorporates multi-task learning heads to optimize the machine learning model. The task-specific fine-tuning process enables the machine learning model to specialize in specific audio tasks and generate continuous audio representations. The continuous audio representations retain acoustic nuances and subtleties of audio signals. The machine learning model is configured and enabled to generate quantized audio representations by incorporating vector quantization into the task-specific fine-tuning process.
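
The abstract's final step, turning continuous audio representations into quantized ones, can be illustrated with a generic nearest-neighbor vector quantization lookup. This is a minimal sketch only, not the method claimed in the application: the function name `quantize`, the toy codebook, and the 2-dimensional frame embeddings are all assumptions made for illustration, and the abstract does not say how the codebook is learned or where in the fine-tuning process quantization is applied.

```python
import numpy as np


def quantize(frames, codebook):
    """Map each continuous frame embedding to its nearest codebook entry.

    frames:   array of shape (T, D), one embedding per audio frame (assumed layout)
    codebook: array of shape (K, D), the K code vectors (assumed to be given)
    Returns the chosen code indices (T,) and the quantized embeddings (T, D).
    """
    # Squared Euclidean distance between every frame and every code vector.
    diffs = frames[:, None, :] - codebook[None, :, :]   # shape (T, K, D)
    dists = np.sum(diffs ** 2, axis=-1)                 # shape (T, K)
    # Each frame is replaced by the closest code vector.
    indices = np.argmin(dists, axis=1)                  # shape (T,)
    return indices, codebook[indices]


# Hypothetical toy values: 3 code vectors and 4 frame embeddings in 2 dimensions.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
frames = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7], [0.6, 0.7]])

indices, quantized = quantize(frames, codebook)
print(indices.tolist())  # discrete token sequence for the four frames
```

The index sequence is the discrete (quantized) representation, while `frames` plays the role of the continuous representation that retains finer acoustic detail. In a trained system the codebook would be learned jointly with the model rather than fixed by hand as it is here.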
