Publication No.: US20230237352A1
Publication date: 2023-07-27
Application No.: US17581782
Filing date: 2022-01-21
Applicant: salesforce.com, inc.
Inventors: Tian Lan, Stephan Tao Zheng, Sunil Srinivasa
CPC classification numbers: G06N5/043, G06F9/545, G06N20/00, G06F9/5072
Abstract: Embodiments provide a fast multi-agent reinforcement learning (RL) pipeline that runs the full RL workflow end-to-end on a single GPU, using a single store of data for simulation roll-outs, inference, and training. Specifically, the simulations, and the agents within each simulation, are run in tandem, taking advantage of the parallel capabilities of the GPU. This significantly reduces costly GPU-CPU communication and copying, which in turn improves simulation sampling and learning rates. As a result, a large number of simulations can run concurrently on the GPU, greatly improving the efficiency of RL training.
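
The single-data-store idea in the abstract can be illustrated with a minimal CPU-only sketch. It is not the claimed implementation: the buffer names, sizes, toy environment, and the REINFORCE-style update are all assumptions made for illustration, and the Python loops only stand in for work that a GPU would spread over one thread per (environment, agent) pair. What it shows is that roll-out, inference, and training all read and write the same pre-allocated buffers, so no data is copied between stages.

```python
import math
import random
from array import array

NUM_ENVS, NUM_AGENTS = 4, 3            # hypothetical sizes
N = NUM_ENVS * NUM_AGENTS

# Single data store: flat buffers allocated once and updated in place.
# They stand in for arrays that would stay resident in GPU memory.
store = {
    "obs": array("d", [0.0] * N),
    "action": array("i", [0] * N),
    "reward": array("d", [0.0] * N),
    "grad_logp": array("d", [0.0] * N),
}

def slot(env_id, agent_id):
    """Flat index of one (environment, agent) pair.
    On a GPU this pair would map to something like (block, thread)."""
    return env_id * NUM_AGENTS + agent_id

def reset(rng):
    """Fill the observation buffer with random starting positions."""
    for i in range(N):
        store["obs"][i] = rng.uniform(-1.0, 1.0)

def infer(theta, rng):
    """Policy inference: read obs, write action and d(log pi)/d(theta) in place."""
    for i in range(N):
        x = store["obs"][i]
        p_right = 1.0 / (1.0 + math.exp(-theta * x))
        if rng.random() < p_right:
            store["action"][i] = 1
            store["grad_logp"][i] = (1.0 - p_right) * x
        else:
            store["action"][i] = -1
            store["grad_logp"][i] = -p_right * x

def step():
    """Simulation step: read actions, overwrite obs and reward in place."""
    for env_id in range(NUM_ENVS):          # parallel over environments on a GPU
        for agent_id in range(NUM_AGENTS):  # parallel over agents on a GPU
            i = slot(env_id, agent_id)
            store["obs"][i] += 0.1 * store["action"][i]
            store["reward"][i] = -abs(store["obs"][i])  # toy reward: stay near 0

def train(theta, lr=0.1):
    """One REINFORCE-style update computed directly from the shared store."""
    baseline = sum(store["reward"]) / N
    grad = sum((store["reward"][i] - baseline) * store["grad_logp"][i]
               for i in range(N)) / N
    return theta + lr * grad

def run(iterations=5, seed=0):
    """Roll-out, inference, and training over the same buffers."""
    rng = random.Random(seed)
    theta = 0.0
    reset(rng)
    for _ in range(iterations):
        infer(theta, rng)
        step()
        theta = train(theta)
    return theta
```

Calling `run()` returns the updated policy parameter while leaving the `store` buffers as the same objects throughout, which is the property the abstract credits with reducing GPU-CPU copying.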