DYNAMIC CAUSAL DISCOVERY IN IMITATION LEARNING

    Publication number: US20240046128A1

    Publication date: 2024-02-08

    Application number: US18471564

    Filing date: 2023-09-21

    CPC classification number: G16H50/20

    Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.

    DYNAMIC CAUSAL DISCOVERY IN IMITATION LEARNING

    Publication number: US20240054373A1

    Publication date: 2024-02-15

    Application number: US18471570

    Filing date: 2023-09-21

    CPC classification number: G06N7/01 G06N20/00

    Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.

    DYNAMIC CAUSAL DISCOVERY IN IMITATION LEARNING

    Publication number: US20230080424A1

    Publication date: 2023-03-16

    Application number: US17877081

    Filing date: 2022-07-29

    Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.

    SKILL DISCOVERY FOR IMITATION LEARNING
    Document type: Invention publication

    Publication number: US20240062070A1

    Publication date: 2024-02-22

    Application number: US18450799

    Filing date: 2023-08-16

    CPC classification number: G06N3/092 G06N3/045

    Abstract: Methods and systems for training a model include performing skill discovery, using a set of demonstrations that includes known-good demonstrations and noisy demonstrations, to generate a set of skills. A unidirectional skill embedding model is trained in a first training while parameters of a skill matching model and low-level policies that relate skills to actions are held constant. The unidirectional skill embedding model, the skill matching model, and the low-level policies are trained together in an end-to-end fashion in a second training.

    DYNAMIC CAUSAL DISCOVERY IN IMITATION LEARNING

    Publication number: US20240046127A1

    Publication date: 2024-02-08

    Application number: US18471558

    Filing date: 2023-09-21

    CPC classification number: G06N7/01 G06N20/00

    Abstract: A method for learning a self-explainable imitator by discovering causal relationships between states and actions is presented. The method includes obtaining, via an acquisition component, demonstrations of a target task from experts for training a model to generate a learned policy, training the model, via a learning component, the learning component computing actions to be taken with respect to states, generating, via a dynamic causal discovery component, dynamic causal graphs for each environment state, encoding, via a causal encoding component, discovered causal relationships by updating state variable embeddings, and outputting, via an output component, the learned policy including trajectories similar to the demonstrations from the experts.
