CONTRASTIVE BEHAVIORAL SIMILARITY EMBEDDINGS FOR GENERALIZATION IN REINFORCEMENT LEARNING

    Publication No.: US20230102544A1

    Publication Date: 2023-03-30

    Application No.: US17487769

    Filing Date: 2021-09-28

    Applicant: Google LLC

    Abstract: Approaches are described for training an action selection neural network system for use in controlling an agent interacting with an environment to perform a task, using a contrastive loss function based on a policy similarity metric. In one aspect, a method includes: obtaining a first observation of a first training environment; obtaining a plurality of second observations of a second training environment; for each second observation, determining a respective policy similarity metric between the second observation and the first observation; processing the first observation and the second observations using a representation neural network to generate a first representation of the first observation and a respective second representation of each second observation; and training the representation neural network on a contrastive loss function computed using the policy similarity metrics and the first and second representations.
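    Illustrative sketch (not taken from the application): the abstract does not give the form of the contrastive loss, so the following Python is only one plausible reading. It assumes that a smaller policy similarity metric means more similar behavior, that the metric is mapped to a soft weight with an exponential kernel, that the second observation with the smallest metric is treated as the positive pair, and that representations are compared by cosine similarity. The names `contrastive_loss`, `temperature`, and `beta` are hypothetical, and the representation neural network itself is omitted; the function takes its outputs as plain lists.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(z_first, z_seconds, psm, temperature=0.5, beta=0.1):
    """Loss for one first observation against several second observations.

    z_first   : representation of the first observation
    z_seconds : representations of the second observations
    psm       : policy similarity metric per second observation
                (assumed: smaller value = more similar behavior)
    """
    # Assumption: map each metric value to a soft similarity weight in (0, 1].
    gamma = [math.exp(-d / beta) for d in psm]
    # Assumption: the behaviorally closest second observation is the positive.
    pos = min(range(len(psm)), key=lambda i: psm[i])
    scores = [math.exp(cosine(z_first, z) / temperature) for z in z_seconds]
    pos_term = gamma[pos] * scores[pos]
    # Remaining observations act as negatives, down-weighted when similar.
    neg_term = sum(
        (1.0 - gamma[i]) * scores[i] for i in range(len(psm)) if i != pos
    )
    return -math.log(pos_term / (pos_term + neg_term))


# The positive pair is aligned in representation space: low loss.
aligned = contrastive_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 1.0])
# The positive pair is orthogonal while a negative is aligned: high loss.
misaligned = contrastive_loss([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]], [0.0, 1.0])
```

    Under these assumptions the loss falls as the representation of the behaviorally closest second observation moves toward the first representation, which is the training signal the abstract describes for the representation neural network.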
