Publication No.: US20250140242A1
Publication Date: 2025-05-01
Application No.: US18385749
Filing Date: 2023-10-31
Applicant: Lemon Inc.
Inventor: Zongyu Yin, Qingqing Huang, Janne Jayne Harm Renee Spijkervet
Abstract: The present disclosure describes techniques for generating audio representations using a machine learning model. A machine learning model is pre-trained using unlabeled audio data. The pre-training enables the machine learning model to recognize audio patterns and generate initial audio representations. The machine learning model is refined by a task-specific fine-tuning process using labeled data. The task-specific fine-tuning process incorporates multi-task learning heads to optimize the machine learning model. The task-specific fine-tuning process enables the machine learning model to be specialized in specific audio tasks and generate continuous audio representations. The continuous audio representations retain acoustic nuances and subtleties of audio signals. The machine learning model is configured and enabled to generate quantized audio representations by incorporating vector quantization into the task-specific fine-tuning process.
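
The abstract does not say how the vector quantization step is carried out. As an illustration only, the sketch below shows one common form of vector quantization: each continuous representation frame is replaced by its nearest entry in a codebook, and the entry's index serves as the discrete token. The function names (`squared_distance`, `quantize`), the two-dimensional frames, and the three-entry codebook are all hypothetical and are not taken from the disclosure. In a trained model the codebook would typically be learned during fine-tuning rather than fixed by hand.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    frames: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous frame to its nearest codebook entry.

    Returns the chosen codebook indices (the discrete tokens) and the
    corresponding codebook vectors (the quantized representations).
    """
    if not codebook:
        raise ValueError("codebook must not be empty")
    indices: List[int] = []
    quantized: List[List[float]] = []
    for frame in frames:
        # Nearest-neighbour search over the codebook (assumed lookup rule).
        best = min(
            range(len(codebook)),
            key=lambda i: squared_distance(frame, codebook[i]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical toy data: three 2-D "continuous representations" and a
# hand-picked codebook. Real audio embeddings would be much larger.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]]

indices, quantized = quantize(frames, codebook)
print(indices)
print(quantized)
```

The index list is the compact, discrete form of the audio representation, while the original frames keep the finer acoustic detail that the abstract attributes to the continuous representations.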