ENHANCING TRANSFER LEARNING FOR LARGE LANGUAGE MODELS

    Publication Number: US20250045575A1

    Publication Date: 2025-02-06

    Application Number: US18423802

    Application Date: 2024-01-26

    Applicant: Roku, Inc.

    Abstract: Pre-trained large language models are often trained on large data sets that do not necessarily align with specific tasks, business goals, and requirements. They can solve generic semantic-relationship or question-answering problems but may not be suited to retrieving or recommending content items that are semantically relevant to a query. A machine learning model can instead be built using transfer learning to learn from pre-trained large language models. Training data significantly impacts a model's performance, generalization, fairness, and adaptation to specific domains, especially for models developed using transfer learning. To address some of these concerns, a popularity bucketing strategy can be implemented to debias the training data. Optionally, an ensemble of models can be used to generate diverse training data.
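    Note: The abstract does not specify the bucketing algorithm. The Python sketch below shows one plausible interpretation of a popularity bucketing strategy, in which training examples are ranked by a popularity score, split into quantile buckets, and sampled uniformly per bucket so that popular items no longer dominate the training set. The function name, bucket count, and per-bucket sample size are illustrative assumptions, not details from the patent.

        import random
        from collections import defaultdict

        def popularity_bucketing(examples, num_buckets=5, samples_per_bucket=100, seed=42):
            """Debias training data by sampling uniformly across popularity buckets.

            `examples` is a list of (item, popularity_score) pairs. The quantile
            bucketing scheme and all parameter defaults are illustrative
            assumptions, not taken from the patent.
            """
            rng = random.Random(seed)

            # Rank items by popularity and split the ranking into equal-size
            # buckets, so head (popular) and tail (rare) items are separated.
            ranked = sorted(examples, key=lambda pair: pair[1])
            bucket_size = max(1, len(ranked) // num_buckets)
            buckets = defaultdict(list)
            for rank, pair in enumerate(ranked):
                buckets[min(rank // bucket_size, num_buckets - 1)].append(pair)

            # Draw the same number of examples from every bucket so popular
            # items no longer dominate the resulting training set.
            debiased = []
            for bucket in buckets.values():
                k = min(samples_per_bucket, len(bucket))
                debiased.extend(rng.sample(bucket, k))
            rng.shuffle(debiased)
            return debiased

        if __name__ == "__main__":
            # Toy corpus with a heavy-tailed popularity distribution.
            corpus = [(f"item_{i}", 1.0 / (i + 1)) for i in range(1000)]
            training_set = popularity_bucketing(corpus, num_buckets=5, samples_per_bucket=50)
            print(len(training_set))  # 250 examples, evenly spread across popularity tiers

    The sketch is agnostic to where the (item, popularity) pairs come from; per the abstract's optional ensemble step, the candidate pool could itself be generated by several models to increase diversity before bucketing.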
