Patent search ap:("Google LLC") AND inv:"Ofir Nachum"

1.

Invention publication
Mixture-Of-Expert Approach to Reinforcement Learning-Based Dialogue Management [Under examination, published]

Publication (announcement) number: US20230376697A1

Publication (announcement) date: 2023-11-23

Application number: US18173495

Application date: 2023-02-23

Applicant: Google LLC

Inventor: Yinlam Chow, Ofir Nachum, Azamat Tulepbergenov

IPC: G06F40/35, G06N3/092, G06F40/126

CPC classification number: G06F40/35, G06N3/092, G06F40/126

Abstract: Systems and methods for dialogue response prediction can leverage a plurality of machine-learned language models to generate a plurality of candidate outputs, which can be processed by a dialogue management model to determine a predicted dialogue response. The plurality of machine-learned language models can include a plurality of experts trained on different intents, emotions, and/or tasks. The particular candidate output selected may be selected by the dialogue management model based on semantics determined based on a language representation. The language representation can be a representation generated by processing the conversation history of a conversation to determine conversation semantics.

2.

Invention grant
Training policy neural networks using path consistency learning [In force]

Publication (announcement) number: US11429844B2

Publication (announcement) date: 2022-08-30

Application number: US16904785

Application date: 2020-06-18

Applicant: Google LLC

Inventor: Ofir Nachum, Mohammad Norouzi, Dale Eric Schuurmans, Kelvin Xu

IPC: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network used to select actions to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes obtaining path data defining a path through the environment traversed by the agent. A consistency error is determined for the path from a combined reward, first and last soft-max state values, and a path likelihood. A value update for the current values of the policy neural network parameters is determined from at least the consistency error. The value update is used to adjust the current values of the policy neural network parameters.
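As a rough illustration of the consistency error named in this abstract, the Python sketch below combines the four listed quantities for a single path. The abstract names the inputs but not how they are combined, so the formula, the discount `gamma`, the entropy weight `tau`, and the function name are assumptions made for illustration only.

```python
import math


def path_consistency_error(v_first, v_last, rewards, log_probs,
                           gamma=0.99, tau=0.1):
    """Consistency error for one path of d steps (illustrative sketch).

    v_first, v_last: soft-max state value estimates at the first and last
        states of the path.
    rewards: rewards r_0 .. r_{d-1} collected along the path.
    log_probs: log-probabilities of the actions taken at the same steps.
    gamma, tau: assumed hyperparameters (discount and entropy weight).
    """
    if len(rewards) != len(log_probs):
        raise ValueError("rewards and log_probs must have the same length")
    d = len(rewards)
    # Combined (discounted) reward over the path.
    combined_reward = sum(gamma ** t * r for t, r in enumerate(rewards))
    # Discounted log-likelihood of the path under the policy.
    path_log_likelihood = sum(gamma ** t * lp for t, lp in enumerate(log_probs))
    return (-v_first + gamma ** d * v_last
            + combined_reward - tau * path_log_likelihood)


# One-step path with a deterministic action; the error cancels exactly.
print(path_consistency_error(1.0, 0.0, [1.0], [0.0], gamma=1.0, tau=0.5))
```

In a training loop, the squared error would typically serve as the loss whose gradient adjusts the policy parameters; that step is omitted here.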

3.

Invention application
LEARNING NEURAL NETWORK STRUCTURE [In force]

Publication (announcement) number: US20220215263A1

Publication (announcement) date: 2022-07-07

Application number: US17701778

Application date: 2022-03-23

Applicant: Google LLC

Inventor: Ofir Nachum, Ariel Gordon, Elad Eban, Bo Chen

IPC: G06N3/08, G06N3/04, G06N20/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training neural networks. In one aspect, a system includes a neural network shrinking engine that is configured to receive a neural network being trained and generate a reduced neural network by a shrinking process. The shrinking process includes training the neural network based on a shrinking engine loss function that includes terms penalizing active neurons of the neural network and removing inactive neurons from the neural network. The system includes a neural network expansion engine that is configured to receive the neural network being trained and generate an expanded neural network by an expansion process including adding new neurons to the neural network and training the neural network based on an expanding engine loss function. The system includes a training subsystem that generates reduced neural networks and expanded neural networks.
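A minimal sketch of the shrinking step described in this abstract, assuming each neuron carries a scalar gate whose magnitude signals how active it is. The abstract does not say how activity is measured, so the gate, the names `shrink_penalty` and `remove_inactive`, and the `strength` and `threshold` parameters are all hypothetical.

```python
def shrink_penalty(gates, strength=0.01):
    """Loss term that penalizes active neurons (assumed: L1 on the gates)."""
    return strength * sum(abs(g) for g in gates)


def remove_inactive(weights, gates, threshold=1e-3):
    """Drop neurons whose gate magnitude fell below the threshold.

    weights: one weight row per neuron; gates: one scalar per neuron.
    Returns the reduced (weights, gates) pair.
    """
    keep = [i for i, g in enumerate(gates) if abs(g) >= threshold]
    return [weights[i] for i in keep], [gates[i] for i in keep]


weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
gates = [0.5, 0.0, -0.2]
reduced_weights, reduced_gates = remove_inactive(weights, gates)
print(reduced_weights, reduced_gates)
```

The expansion step would then add new neurons to the reduced network and continue training under a separate loss; it is left out to keep the sketch short.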

4.

Invention grant
Learning neural network structure [In force]

Publication (announcement) number: US11315019B2

Publication (announcement) date: 2022-04-26

Application number: US15813961

Application date: 2017-11-15

Applicant: Google LLC

Inventor: Ofir Nachum, Ariel Gordon, Elad Eban, Bo Chen

IPC: G06N3/08, G06N3/04, G06N20/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training neural networks. In one aspect, a system includes a neural network shrinking engine that is configured to receive a neural network being trained and generate a reduced neural network by a shrinking process. The shrinking process includes training the neural network based on a shrinking engine loss function that includes terms penalizing active neurons of the neural network and removing inactive neurons from the neural network. The system includes a neural network expansion engine that is configured to receive the neural network being trained and generate an expanded neural network by an expansion process including adding new neurons to the neural network and training the neural network based on an expanding engine loss function. The system includes a training subsystem that generates reduced neural networks and expanded neural networks.

5.

Invention grant
Training policy neural networks using path consistency learning [In force]

Publication (announcement) number: US10733502B2

Publication (announcement) date: 2020-08-04

Application number: US16504934

Application date: 2019-07-08

Applicant: Google LLC

Inventor: Ofir Nachum, Mohammad Norouzi, Dale Eric Schuurmans, Kelvin Xu

IPC: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network used to select actions to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes obtaining path data defining a path through the environment traversed by the agent. A consistency error is determined for the path from a combined reward, first and last soft-max state values, and a path likelihood. A value update for the current values of the policy neural network parameters is determined from at least the consistency error. The value update is used to adjust the current values of the policy neural network parameters.

6.

Invention application
TRAINING POLICY NEURAL NETWORKS USING PATH CONSISTENCY LEARNING [Under examination, published]

Publication (announcement) number: US20190332922A1

Publication (announcement) date: 2019-10-31

Application number: US16504934

Application date: 2019-07-08

Applicant: Google LLC

Inventor: Ofir Nachum, Mohammad Norouzi, Dale Eric Schuurmans, Kelvin Xu

IPC: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network used to select actions to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes obtaining path data defining a path through the environment traversed by the agent. A consistency error is determined for the path from a combined reward, first and last soft-max state values, and a path likelihood. A value update for the current values of the policy neural network parameters is determined from at least the consistency error. The value update is used to adjust the current values of the policy neural network parameters.

7.

Invention grant
Learning neural network structure [In force]

Publication (announcement) number: US11875262B2

Publication (announcement) date: 2024-01-16

Application number: US17701778

Application date: 2022-03-23

Applicant: Google LLC

Inventor: Ofir Nachum, Ariel Gordon, Elad Eban, Bo Chen

IPC: G06N3/082, G06N3/084, G06N3/045, G06N3/047, G06N20/00

CPC classification number: G06N3/082, G06N3/045, G06N3/047, G06N3/084, G06N20/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training neural networks. In one aspect, a system includes a neural network shrinking engine that is configured to receive a neural network being trained and generate a reduced neural network by a shrinking process. The shrinking process includes training the neural network based on a shrinking engine loss function that includes terms penalizing active neurons of the neural network and removing inactive neurons from the neural network. The system includes a neural network expansion engine that is configured to receive the neural network being trained and generate an expanded neural network by an expansion process including adding new neurons to the neural network and training the neural network based on an expanding engine loss function. The system includes a training subsystem that generates reduced neural networks and expanded neural networks.

8.

Invention publication
Offline Primitive Discovery For Accelerating Data-Driven Reinforcement Learning [Under examination, published]

Publication (announcement) number: US20230367996A1

Publication (announcement) date: 2023-11-16

Application number: US18044852

Application date: 2021-09-23

Applicant: Google LLC

Inventor: Anurag Ajay, Ofir Nachum, Aviral Kumar, Sergey Levine

IPC: G06N3/0455, G06N3/092

CPC classification number: G06N3/0455, G06N3/092

Abstract: A method includes determining a first state associated with a particular task, and determining, by a task policy model, a latent space representation of the first state. The task policy model may have been trained to define, for each respective state of a plurality of possible states associated with the particular task, a corresponding latent space representation of the respective state. The method also includes determining, by a primitive policy model and based on the first state and the latent space representation of the first state, an action to take as part of the particular task. The primitive policy model may have been trained to define a space of primitive policies for the plurality of possible states associated with the particular task and a plurality of possible latent space representations. The method further includes executing the action to reach a second state associated with the particular task.

9.

Invention application
Identifying and Correcting Label Bias in Machine Learning [In force]

Publication (announcement) number: US20220036203A1

Publication (announcement) date: 2022-02-03

Application number: US17298766

Application date: 2019-10-16

Applicant: Google LLC

Inventor: Ofir Nachum, Hanxi Heinrich Jiang

IPC: G06N5/02

Abstract: The present disclosure is directed to systems and methods for identifying and correcting label bias in machine learning via intelligent re-weighting of training examples. In particular, aspects of the present disclosure leverage a problem formulation which assumes the existence of underlying, unknown, and unbiased labels which are overwritten by an agent who intends to provide accurate labels but may have biases towards certain groups. Despite the fact that a biased training dataset provides only observations of the biased labels, the systems and methods described herein can nevertheless correct the bias by re-weighting the data points without changing the labels.
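To make the re-weighting idea in this abstract concrete, the Python sketch below computes a weighted log loss in which each training example has its own weight and the observed labels are never modified. How the weights are derived is not covered by the abstract, so the weights here are arbitrary placeholders and `weighted_log_loss` is a hypothetical helper.

```python
import math


def weighted_log_loss(probs, labels, weights, eps=1e-12):
    """Weighted binary cross-entropy; labels are used as observed.

    probs: predicted probability of the positive class per example.
    labels: observed (possibly biased) 0/1 labels, left unchanged.
    weights: non-negative per-example weights (placeholder values here).
    """
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += w * -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


# The second example counts three times as much as the first.
loss = weighted_log_loss([0.5, 0.5], [1, 0], [1.0, 3.0])
print(loss)
```

Changing the weights shifts which examples dominate training while the dataset's labels stay exactly as recorded.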

10.

Invention application
TRAINING POLICY NEURAL NETWORKS USING PATH CONSISTENCY LEARNING [Under examination, published]

Publication (announcement) number: US20200320372A1

Publication (announcement) date: 2020-10-08

Application number: US16904785

Application date: 2020-06-18

Applicant: Google LLC

Inventor: Ofir Nachum, Mohammad Norouzi, Dale Eric Schuurmans, Kelvin Xu

IPC: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network used to select actions to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes obtaining path data defining a path through the environment traversed by the agent. A consistency error is determined for the path from a combined reward, first and last soft-max state values, and a path likelihood. A value update for the current values of the policy neural network parameters is determined from at least the consistency error. The value update is used to adjust the current values of the policy neural network parameters.
