LEARNING POLICIES USING SPARSE AND UNDERSPECIFIED REWARDS

    Publication No.: US20210256313A1

    Publication Date: 2021-08-19

    Application No.: US17180682

    Application Date: 2021-02-19

    Applicant: Google LLC

    Abstract: Methods and systems for learning policies using sparse and underspecified rewards. One of the methods includes training the policy jointly with an auxiliary reward function having a plurality of auxiliary reward parameters, the auxiliary reward function being configured to map, in accordance with the auxiliary reward parameters, trajectory features of at least one trajectory to an auxiliary reward value that indicates how well the trajectory performed a task in response to a context input.
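The abstract above can be illustrated with a minimal sketch. All names and details here are assumptions, not from the patent: the auxiliary reward is modeled as a linear function of trajectory features, and its parameters are regressed toward the sparse (0/1 success) task reward so the policy can be reinforced with the denser auxiliary signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: trajectory features (e.g., overlap with the context
# input, trajectory length) are mapped to a scalar auxiliary reward by a
# learned linear function with parameters w.
n_features = 4
w = rng.normal(size=n_features)  # auxiliary reward parameters

def auxiliary_reward(trajectory_features, w):
    """Map trajectory features to a scalar auxiliary reward value."""
    return float(np.dot(w, trajectory_features))

def joint_update(trajectory_features, sparse_reward, w, lr=0.1):
    """One joint-training step (sketch): regress the auxiliary reward
    toward the sparse task reward; the returned auxiliary value would
    then reinforce the policy in place of the sparse signal."""
    aux = auxiliary_reward(trajectory_features, w)
    w = w + lr * (sparse_reward - aux) * trajectory_features
    return w, aux

features = rng.normal(size=n_features)
w, aux = joint_update(features, sparse_reward=1.0, w=w)
```

The design intuition is that a learned dense reward shaped by trajectory features gives the policy a usable gradient signal even when the underlying task reward is sparse and underspecified.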

    CONTRASTIVE BEHAVIORAL SIMILARITY EMBEDDINGS FOR GENERALIZATION IN REINFORCEMENT LEARNING

    Publication No.: US20230102544A1

    Publication Date: 2023-03-30

    Application No.: US17487769

    Application Date: 2021-09-28

    Applicant: Google LLC

    Abstract: Approaches are described for training an action selection neural network system for use in controlling an agent interacting with an environment to perform a task, using a contrastive loss function based on a policy similarity metric. In one aspect, a method includes: obtaining a first observation of a first training environment; obtaining a plurality of second observations of a second training environment; for each second observation, determining a respective policy similarity metric between the second observation and the first observation; processing the first observation and the second observations using a representation neural network of the system to generate a first representation of the first observation and a respective second representation of each second observation; and training the representation neural network on a contrastive loss function computed using the policy similarity metrics and the first and second representations.
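A minimal sketch of the contrastive training step described above. Everything concrete here is an assumption: the representation network is stood in for by a single normalized linear layer, the policy similarity metric is a placeholder array, and the contrastive loss is a softmax (InfoNCE-style) loss that treats the second observation with the highest policy similarity as the positive pair.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(obs, W):
    """Stand-in representation network: one linear layer, L2-normalized
    so dot products between embeddings are cosine similarities."""
    z = W @ obs
    return z / np.linalg.norm(z)

# Toy data: one first observation and 5 second observations.
obs_dim, emb_dim = 8, 4
W = rng.normal(size=(emb_dim, obs_dim))
first_obs = rng.normal(size=obs_dim)
second_obs = rng.normal(size=(5, obs_dim))

# Placeholder policy similarity metrics in [0, 1]; in the described
# approach these would compare the agent's behavior starting from the
# first observation versus each second observation.
policy_sim = rng.uniform(size=5)

z1 = embed(first_obs, W)
z2 = np.stack([embed(o, W) for o in second_obs])

# Contrastive loss: the behaviorally most similar second observation is
# the positive; the remaining second observations act as negatives.
logits = z2 @ z1                                  # cosine similarities
positive = int(np.argmax(policy_sim))
log_probs = logits - np.log(np.sum(np.exp(logits)))
loss = -log_probs[positive]                       # cross-entropy on positive
```

Minimizing this loss pulls representations of behaviorally similar observations together across environments, which is the stated route to generalization.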

    Efficient Knowledge Distillation Framework for Training Machine-Learned Models

    Publication No.: US20250124256A1

    Publication Date: 2025-04-17

    Application No.: US18486792

    Application Date: 2023-10-13

    Applicant: Google LLC

    Abstract: An example method is provided for training a student machine-learned sequence processing model, the method comprising: obtaining a respective input; obtaining, from the student machine-learned sequence processing model, a respective output corresponding to the respective input; generating a multiscale refinement objective configured to jointly distill knowledge from a teacher machine-learned sequence processing model and reinforce preferred behavior of the student machine-learned sequence processing model, wherein the multiscale refinement objective comprises: a first component based on a divergence metric characterizing, for the respective input, a comparison of a plurality of predictions of the student machine-learned sequence processing model to a plurality of predictions of the teacher machine-learned sequence processing model; and a second component based on a reinforcement learning signal associated with the respective output; and updating the student machine-learned sequence processing model based on the multiscale refinement objective.
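The two-component objective above can be sketched as follows. The specifics are assumptions, not from the patent: the divergence metric is taken to be a token-level KL divergence between teacher and student distributions, and the reinforcement component is a REINFORCE-style term with a placeholder scalar reward.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical logits over a small vocabulary for one output sequence:
# 3 token positions (rows) x 5 vocabulary entries (columns).
rng = np.random.default_rng(0)
student_logits = rng.normal(size=(3, 5))
teacher_logits = rng.normal(size=(3, 5))

p_student = softmax(student_logits)
p_teacher = softmax(teacher_logits)

# First component: token-level KL(teacher || student), averaged over
# positions -- distills the teacher's predictions into the student.
kl = np.mean(np.sum(
    p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1))

# Second component: a sequence-level reinforcement signal. Here the
# reward is a placeholder scalar for the student's sampled output, and
# the log-probability of that output is approximated greedily.
reward = 0.7
log_prob_output = np.sum(np.log(p_student.max(axis=-1)))
rl_term = -reward * log_prob_output  # REINFORCE-style surrogate loss

alpha = 0.5  # assumed trade-off between the two components
objective = alpha * kl + (1 - alpha) * rl_term
```

Combining a fine-grained (per-token) divergence with a coarse (per-sequence) reward signal is one plausible reading of "multiscale" here: the student is pulled toward the teacher locally while being reinforced globally.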
