Publication No.: US20250068691A1
Publication Date: 2025-02-27
Application No.: US18799915
Filing Date: 2024-08-09
Inventors: Youngmi JIN, Dong Deok KIM, Young Joo SUH
IPC: G06F17/11
Abstract: The present disclosure relates to a multi-armed bandit (MAB) method and apparatus for selecting multiple items while ensuring fairness of exposure among the items and maximizing the averaged total reward. The MAB method includes: initializing the empirical mean reward and the number of arm selections for each of the M arms, and the time step; incrementing the time step; calculating the upper confidence bound (UCB) index of each of the M arms; selecting the K−1 arms with the K−1 highest calculated UCB indices; calculating unfairness indices for the unchosen M−(K−1) arms; checking whether any of the unchosen M−(K−1) arms has a positive unfairness index; selecting the remaining single arm depending on the result of that check; playing the selected K arms; and updating the empirical mean reward and the number of arm selections for the played arms.
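The steps listed in the abstract can be sketched in Python as follows. This is only an illustrative sketch, not the claimed method: the abstract does not give the formula for the UCB index, the definition of the unfairness index, or the exact rule for choosing the K-th arm, so the code assumes a standard UCB1-style index, a hypothetical unfairness index `min_rate * t - counts[i]` (where `min_rate` is an assumed per-arm minimum exposure fraction), and a simple tie-breaking rule for the last arm. All function and parameter names are hypothetical.

```python
import math
import random


def ucb_index(mean, count, t):
    """UCB1-style index (assumed form); unplayed arms get +inf so they are tried first."""
    if count == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / count)


def select_arms(means, counts, t, k, min_rate):
    """Pick k of the M arms for time step t.

    min_rate is a hypothetical minimum exposure fraction per arm; the
    unfairness index min_rate * t - counts[i] is an assumed definition.
    """
    m = len(means)
    ucb = [ucb_index(means[i], counts[i], t) for i in range(m)]
    ranked = sorted(range(m), key=lambda i: ucb[i], reverse=True)
    chosen = ranked[:k - 1]   # the K-1 arms with the highest UCB indices
    rest = ranked[k - 1:]     # the unchosen M-(K-1) arms, still in UCB order
    unfair = {i: min_rate * t - counts[i] for i in rest}
    worst = max(rest, key=lambda i: unfair[i])
    if unfair[worst] > 0:
        last = worst          # assumed rule: play the most under-exposed arm
    else:
        last = rest[0]        # assumed rule: otherwise take the next-highest UCB
    return chosen + [last]


def run(true_means, k, horizon, min_rate, seed=0):
    """Simulate Bernoulli arms and return (empirical means, selection counts)."""
    rng = random.Random(seed)
    m = len(true_means)
    means = [0.0] * m         # empirical mean reward of each arm
    counts = [0] * m          # number of times each arm was selected
    t = 0                     # time step
    for _ in range(horizon):
        t += 1
        for i in select_arms(means, counts, t, k, min_rate):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # incremental mean update
    return means, counts
```

For example, `run([0.9, 0.8, 0.2, 0.1], k=2, horizon=500, min_rate=0.1)` plays two arms per step. Under the assumed unfairness index, the low-reward arms still receive roughly a 10% share of the time steps, while the remaining plays go to the arms with the highest UCB indices.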