Method of selection of an action for an object using a neural network

Granted invention patent

US10935982B2 Method of selection of an action for an object using a neural network (in force)

Patent title: Method of selection of an action for an object using a neural network
Application number: US15724939

Filing date: 2017-10-04
Publication (announcement) number: US10935982B2

Publication (announcement) date: 2021-03-02
Inventors: Hengshuai Yao, Hao Chen, Seyed Masoud Nosrati, Peyman Yadmellat, Yunfei Zhang
Applicants: Hengshuai Yao, Hao Chen, Seyed Masoud Nosrati, Peyman Yadmellat, Yunfei Zhang
Applicant addresses: CA Markham; CA Ottawa; CA Markham; CA North York; CA Aurora
Patent owners: Hengshuai Yao, Hao Chen, Seyed Masoud Nosrati, Peyman Yadmellat, Yunfei Zhang
Current patent owners: Hengshuai Yao, Hao Chen, Seyed Masoud Nosrati, Peyman Yadmellat, Yunfei Zhang
Current patent owner addresses: CA Markham; CA Ottawa; CA Markham; CA North York; CA Aurora
Main classification: G05D1/02
IPC classifications: G05D1/02; G06N3/04; G06N3/00; B60W40/12; G06N3/08; G06N3/02

Abstract:

A method, device and system for predicting a state of an object in an environment using a pre-trained action model defined by an action model neural network. A control system for an object comprises a plurality of sensors for sensing a current state and an environment in which the object is located, and a first neural network. Predicted subsequent states of the object in the environment are obtained using the action model and a current state of the object in the environment. The action model maps a plurality of state-action pairs (s, a), each state-action pair encoding a state (s) of the object in the environment and an action (a) performed by the object, to a predicted subsequent state (s′) of the object in the environment. An action that maximizes a value of a target, based at least on a reward for each of the predicted subsequent states, is determined. The determined action is caused to be performed.
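
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: `toy_action_model` is a hypothetical hand-written function standing in for the pre-trained action model neural network, `toy_reward` is an assumed reward, and the target being maximized is assumed to be that reward alone. The state and action encodings (2-D position and displacement) are also assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, float]   # assumed encoding: 2-D position (x, y)
Action = Tuple[float, float]  # assumed encoding: displacement (dx, dy)


def toy_action_model(state: State, action: Action) -> State:
    """Hypothetical stand-in for the action model: maps (s, a) to a predicted s'."""
    return (state[0] + action[0], state[1] + action[1])


def toy_reward(state: State, goal: State = (3.0, 0.0)) -> float:
    """Assumed reward: negative squared distance from the state to a goal position."""
    return -((state[0] - goal[0]) ** 2 + (state[1] - goal[1]) ** 2)


def select_action(
    state: State,
    actions: Sequence[Action],
    action_model: Callable[[State, Action], State],
    reward: Callable[[State], float],
) -> Action:
    """Predict s' for every candidate action and return the action whose s' scores highest."""
    predicted = [(a, action_model(state, a)) for a in actions]
    best_action, _ = max(predicted, key=lambda pair: reward(pair[1]))
    return best_action


# Candidate actions: one unit step along each axis direction.
ACTIONS = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
chosen = select_action((0.0, 0.0), ACTIONS, toy_action_model, toy_reward)
print(chosen)
```

In a real control system the chosen action would then be passed to the actuators, and the loop would repeat from the newly sensed state.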

Publication/grant documents

US20190101917A1 METHOD OF SELECTION OF AN ACTION FOR AN OBJECT USING A NEURAL NETWORK Publication/grant date: 2019-04-04

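IPC classification:

G	Physics
G05	Controlling; regulating
G05D	Systems for controlling or regulating non-electric variables (continuous casting of metals, see B22D11/16; valves per se, see F16K; sensing non-electric variables, see the relevant subclasses of G01; regulating electric or magnetic variables, see G05F)
G05D1/00	Control of position, course, altitude, or attitude of land, water, air, or space vehicles, e.g. automatic pilot (radio navigation systems or analogous systems using other waves, see G01S)
G05D1/02	. Control of position or course in two dimensions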