MULTIPLE-PLAY MULTI-ARMED BANDITS METHOD AND APPARATUS FOR ENSURING EXPOSURE FAIRNESS OF ITEMS

    Publication No.: US20250068691A1

    Publication Date: 2025-02-27

    Application No.: US18799915

    Filing Date: 2024-08-09

    Abstract: The present disclosure relates to a multi-armed bandit (MAB) method and apparatus for selecting multiple items while ensuring fair exposure of those items and maximizing the average total reward. The MAB method includes: initializing the empirical mean reward and the number of arm selections for each of the M arms, and the time step; incrementing the time step; calculating the upper confidence bound (UCB) index of each of the M arms; selecting the K−1 arms with the K−1 highest calculated UCB indices; calculating unfairness indices for the M−(K−1) unchosen arms; checking whether any of the M−(K−1) unchosen arms has a positive unfairness index; selecting the one remaining arm depending on the result of that check; playing the K selected arms; and updating the empirical mean reward and the number of arm selections for the played arms.
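    The steps listed in the abstract can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the claimed method: the abstract does not give the formula for the UCB index, the definition of the unfairness index, or the rule for picking the last arm. The sketch assumes a UCB1-style index, a hypothetical unfairness index equal to the shortfall against a per-arm target exposure share (`min_share`, an invented parameter), and a hypothetical rule that plays the most under-exposed arm when any unfairness index is positive and the next-highest-UCB arm otherwise. Rewards are simulated as Bernoulli draws.

```python
import math
import random


def ucb_index(mean, count, t):
    """UCB1-style index (assumed form); unplayed arms get +inf so they are tried first."""
    if count == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / count)


def select_arms(means, counts, t, k, min_share):
    """Pick k of the M arms: k-1 by UCB index, the last one by an assumed fairness rule."""
    m = len(means)
    ucb = [ucb_index(means[i], counts[i], t) for i in range(m)]
    # Arms ordered from highest to lowest UCB index (ties keep arm order).
    by_ucb = sorted(range(m), key=lambda i: ucb[i], reverse=True)
    chosen = by_ucb[:k - 1]   # the K-1 arms with the highest UCB indices
    rest = by_ucb[k - 1:]     # the M-(K-1) unchosen arms

    # ASSUMPTION: unfairness index = target exposure so far minus actual plays.
    unfair = {i: min_share[i] * t - counts[i] for i in rest}
    lagging = [i for i in rest if unfair[i] > 0]
    if lagging:
        # ASSUMPTION: play the most under-exposed arm.
        last = max(lagging, key=lambda i: unfair[i])
    else:
        # ASSUMPTION: otherwise fall back to the next-highest UCB index.
        last = rest[0]
    return chosen + [last]


def run(true_means, k, min_share, horizon, seed=0):
    """Simulate the loop with Bernoulli rewards; returns (empirical means, play counts)."""
    rng = random.Random(seed)
    m = len(true_means)
    means = [0.0] * m   # empirical mean reward per arm
    counts = [0] * m    # number of selections per arm
    t = 0               # time step
    for _ in range(horizon):
        t += 1
        for i in select_arms(means, counts, t, k, min_share):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # incremental mean update
    return means, counts
```

    For example, `run([0.9, 0.5, 0.1], k=2, min_share=[0.0, 0.0, 0.2], horizon=200)` plays two arms per step. Under the assumed fairness rule the weakest arm still receives roughly its 20% target share of the steps, while the slot filled by UCB index goes mostly to the strongest arm.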
