-
Publication number: US20240046128A1
Publication date: 2024-02-08
Application number: US18471564
Application date: 2023-09-21
Applicant: NEC Laboratories America, Inc.
Inventor: Wenchao Yu, Wei Cheng, Haifeng Chen, Yuncong Chen, Xuchao Zhang, Tianxiang Zhao
IPC: G16H50/20
CPC classification number: G16H50/20
Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.
-
Publication number: US20240054373A1
Publication date: 2024-02-15
Application number: US18471570
Application date: 2023-09-21
Applicant: NEC Laboratories America, Inc.
Inventor: Wenchao Yu, Wei Cheng, Haifeng Chen, Yuncong Chen, Xuchao Zhang, Tianxiang Zhao
Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.
-
Publication number: US20230080424A1
Publication date: 2023-03-16
Application number: US17877081
Application date: 2022-07-29
Applicant: NEC Laboratories America, Inc.
Inventor: Wenchao Yu, Wei Cheng, Haifeng Chen, Yuncong Chen, Xuchao Zhang, Tianxiang Zhao
IPC: G06N7/00
Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.
-
Publication number: US20240062070A1
Publication date: 2024-02-22
Application number: US18450799
Application date: 2023-08-16
Applicant: NEC Laboratories America, Inc.
Inventor: Wenchao Yu, Haifeng Chen, Tianxiang Zhao
Abstract: Methods and systems for training a model include performing skill discovery, using a set of demonstrations that includes known-good demonstrations and noisy demonstrations, to generate a set of skills. A unidirectional skill embedding model is trained in a first training while parameters of a skill matching model and low-level policies that relate skills to actions are held constant. The unidirectional skill embedding model, the skill matching model, and the low-level policies are trained together in an end-to-end fashion in a second training.
-
Publication number: US20240046127A1
Publication date: 2024-02-08
Application number: US18471558
Application date: 2023-09-21
Applicant: NEC Laboratories America, Inc.
Inventor: Wenchao Yu, Wei Cheng, Haifeng Chen, Yuncong Chen, Xuchao Zhang, Tianxiang Zhao
IPC: G06N7/01
Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.
-