Publication No.: US20190130312A1
Publication Date: 2019-05-02
Application No.: US15885727
Filing Date: 2018-01-31
Applicant: salesforce.com, inc.
Inventor: Caiming XIONG, Tianmin SHU, Richard SOCHER
Abstract: The disclosed technology reveals a hierarchical policy network, for use by a software agent, to accomplish an objective that requires execution of multiple tasks. A terminal policy, learned by training the agent on a terminal task set, serves as a base policy of the intermediate task set. An intermediate policy, learned by training the agent on an intermediate task set, serves as a base policy of the top policy. A top policy, learned by training the agent on a top task set, serves as a base policy of the top task set. The agent is configurable to accomplish the objective by traversal of the hierarchical policy network. A current task in a current task set is executed either by executing a previously learned task selected from a corresponding base task set governed by a corresponding base policy, or by performing a primitive action selected from a library of primitive actions.
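The traversal described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the learned policies are replaced by fixed lookup tables, and every name in it (Policy, plans, run_demo, the task names and the primitive action names) is a hypothetical chosen for illustration. It shows only the control flow, in which each step of a task is either delegated to the base policy or performed as a primitive action.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# A step is ("base", task_name) or ("primitive", action_name).
# Assumption: a real agent would choose steps with a learned policy;
# a fixed table stands in for that choice here.
Step = Tuple[str, str]


@dataclass
class Policy:
    name: str
    plans: Dict[str, List[Step]]      # task -> steps (hypothetical stand-in)
    base: Optional["Policy"] = None   # the policy one level below, if any

    def execute(self, task: str, trace: List[str]) -> None:
        """Run a task, appending each primitive action performed to trace."""
        if task not in self.plans:
            raise KeyError(f"{self.name} policy has no task {task!r}")
        for kind, target in self.plans[task]:
            if kind == "base":
                # Delegate to a previously learned task of the base policy.
                if self.base is None:
                    raise ValueError(f"{self.name} policy has no base policy")
                self.base.execute(target, trace)
            elif kind == "primitive":
                # Perform an action from the library of primitive actions.
                trace.append(target)
            else:
                raise ValueError(f"unknown step kind {kind!r}")


def run_demo() -> List[str]:
    """Build a three-level hierarchy and execute one top-level task."""
    terminal = Policy(
        name="terminal",
        plans={"find block": [("primitive", "turn left"),
                              ("primitive", "move forward")]},
    )
    intermediate = Policy(
        name="intermediate",
        plans={
            "find block": [("base", "find block")],
            "get block": [("base", "find block"),
                          ("primitive", "pick up")],
        },
        base=terminal,
    )
    top = Policy(
        name="top",
        plans={"stack block": [("base", "get block"),
                               ("base", "find block"),
                               ("primitive", "put down")]},
        base=intermediate,
    )
    trace: List[str] = []
    top.execute("stack block", trace)
    return trace


if __name__ == "__main__":
    print(run_demo())
```

In this sketch each policy holds a reference only to the policy directly below it, so executing a top-level task walks down the hierarchy until primitive actions are reached, and the terminal policy has no base policy and can only perform primitive actions.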