PERSONALIZED HYPERPARAMETER TUNING WITH CONTEXTUAL MULTI-ARM BANDIT AND REINFORCEMENT LEARNING

    Publication No.: US20250053853A1

    Publication Date: 2025-02-13

    Application No.: US18232468

    Application Date: 2023-08-10

    Applicant: ROKU, INC.

    Abstract: Disclosed are system, method, and/or computer program product embodiments for improving the performance of a machine learning (ML) based algorithm used to provide a user experience to a user via a media device. An embodiment selects a first set of hyperparameter values, implements a first iteration of the algorithm based on those values, and uses that iteration to provide a first user experience to the user. It then determines the user's response to the first user experience. A hyperparameter tuning ML model, implemented as a contextual multi-arm bandit model or a reinforcement learning model, selects a second set of hyperparameter values based on at least that response. The embodiment then implements a second iteration of the algorithm based on the second set of hyperparameter values and uses it to provide a second user experience to the user.
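
    The select, serve, observe, re-select loop in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes an epsilon-greedy contextual bandit over a small discrete set of hyperparameter values, a context reduced to a time-of-day label, and a simulated binary user response. The hyperparameter names, the `ARMS` list, the context labels, and `simulate_user_response` are all hypothetical stand-ins introduced for illustration.

```python
import random
from collections import defaultdict

# Hypothetical candidate hyperparameter sets (the bandit "arms").
# Names and values are illustrative assumptions, not taken from the patent.
ARMS = [
    {"exploration_weight": 0.1, "recency_decay": 0.9},
    {"exploration_weight": 0.3, "recency_decay": 0.7},
    {"exploration_weight": 0.5, "recency_decay": 0.5},
]


class EpsilonGreedyContextualBandit:
    """Keeps a running mean reward per (context, arm) pair."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(lambda: [0] * n_arms)
        self.means = defaultdict(lambda: [0.0] * n_arms)

    def greedy_arm(self, context):
        # Arm with the highest estimated reward for this context.
        means = self.means[context]
        return max(range(self.n_arms), key=lambda a: means[a])

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        return self.greedy_arm(context)

    def update(self, context, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[context][arm] += 1
        n = self.counts[context][arm]
        self.means[context][arm] += (reward - self.means[context][arm]) / n


def simulate_user_response(context, arm, rng):
    # Assumption: a stand-in for a real user signal (e.g. a click).
    # Each context has one arm that users respond to more often.
    best = {"morning": 0, "evening": 2}[context]
    p_positive = 0.8 if arm == best else 0.2
    return 1.0 if rng.random() < p_positive else 0.0


def run(rounds=2000, seed=0):
    rng = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(len(ARMS), epsilon=0.1, seed=seed)
    for _ in range(rounds):
        context = rng.choice(["morning", "evening"])
        arm = bandit.select(context)
        hyperparams = ARMS[arm]  # would configure the next iteration of the algorithm
        reward = simulate_user_response(context, arm, rng)
        bandit.update(context, arm, reward)
    return bandit


if __name__ == "__main__":
    trained = run()
    for ctx in ("morning", "evening"):
        print(ctx, ARMS[trained.greedy_arm(ctx)])
```

    In this sketch each round corresponds to one iteration of the algorithm: the bandit picks a hyperparameter set for the current context, the simulated response plays the role of the user's reaction, and the update shifts later selections toward the values that earned better responses in that context. A reinforcement learning variant would additionally model how the current choice affects future contexts, which this sketch does not attempt.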
