Patent search ap:("Google LLC") AND inv:"Marc Gendron-Bellemare" Page 1

1.

Invention application
CONTRASTIVE BEHAVIORAL SIMILARITY EMBEDDINGS FOR GENERALIZATION IN REINFORCEMENT LEARNING (In force)

Publication No.: US20230102544A1

Publication date: 2023-03-30

Application No.: US17487769

Filing date: 2021-09-28

Applicant: Google LLC

Inventor: Rishabh Agarwal, Marlos Cholodovskis Machado, Pablo Samuel Castro Rivadeneira, Marc Gendron-Bellemare

IPC: G06N3/08, G06N3/04

Abstract: Approaches are described for training an action selection neural network system for use in controlling an agent interacting with an environment to perform a task, using a contrastive loss function based on a policy similarity metric. In one aspect, a method includes: obtaining a first observation of a first training environment; obtaining a plurality of second observations of a second training environment; for each second observation, determining a respective policy similarity metric between the second observation and the first observation; processing the first observation and the second observations using the representation neural network to generate a first representation of the first training observation and a respective second representation of each second training observation; and training the representation neural network on a contrastive loss function computed using the policy similarity metrics and the first and second representations.
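The steps in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the form of the loss, so everything here is an assumption made for illustration: the helper names `encode` and `contrastive_loss`, the single-layer tanh encoder standing in for the representation neural network, the use of cosine similarity with a `temperature` parameter, and the choice to normalize the policy similarity metrics into soft targets for a cross-entropy over the second observations. It is not the formulation claimed in the application.

```python
import numpy as np


def encode(observations, weights):
    """Hypothetical stand-in for the representation neural network:
    a single linear layer followed by tanh."""
    return np.tanh(np.asarray(observations, dtype=float) @ weights)


def contrastive_loss(first_repr, second_reprs, similarities, temperature=0.5):
    """Similarity-weighted contrastive loss (illustrative assumption).

    first_repr:   shape (d,), representation of the first observation.
    second_reprs: shape (n, d), one representation per second observation.
    similarities: shape (n,), non-negative policy similarity metric between
                  each second observation and the first observation.
    """
    first_repr = np.asarray(first_repr, dtype=float)
    second_reprs = np.asarray(second_reprs, dtype=float)
    similarities = np.asarray(similarities, dtype=float)

    # Cosine similarity between the first representation and each second one.
    z1 = first_repr / np.linalg.norm(first_repr)
    z2 = second_reprs / np.linalg.norm(second_reprs, axis=1, keepdims=True)
    logits = (z2 @ z1) / temperature

    # Numerically stable log-softmax over the second observations.
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())

    # Assumption: the similarity metrics, normalized to sum to 1, act as
    # soft targets, so behaviorally similar pairs are pulled together.
    targets = similarities / similarities.sum()
    return float(-(targets * log_probs).sum())


# Toy usage: two second observations, the first one embedded close to
# the first observation's representation.
first = np.array([1.0, 0.0])
seconds = np.array([[1.0, 0.0], [0.0, 1.0]])

loss_aligned = contrastive_loss(first, seconds, [0.9, 0.1])
loss_misaligned = contrastive_loss(first, seconds, [0.1, 0.9])
```

In this toy case `loss_aligned` is smaller than `loss_misaligned`, because the second observation with the higher similarity score is also the one embedded closest to the first observation. Training the representation network would mean minimizing such a loss with respect to the encoder parameters, typically with an automatic differentiation library.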
