Patent Application
- Patent title: VALUE FUNCTION REPRESENTATION METHOD OF REINFORCEMENT LEARNING AND APPARATUS USING THIS
- Patent title (Chinese): Value function representation method for reinforcement learning and apparatus
- Application number: US12065558
- Filing date: 2006-08-18
- Publication number: US20090234783A1
- Publication date: 2009-09-17
- Inventors: Tomoki Hamagami, Takeshi Shibuya
- Applicants: Tomoki Hamagami, Takeshi Shibuya
- Applicant address: JP Yokohama-shi, KANAGAWA
- Assignee: National University Corporation Yokohama National University
- Current assignee: National University Corporation Yokohama National University
- Current assignee address: JP Yokohama-shi, KANAGAWA
- Priority: JP2005-254763, 2005-09-02
- International application: PCT/JP2006/316659 (WO), 2006-08-18
- Main classification: G06F15/18
- IPC classification: G06F15/18
Abstract:
Reinforcement learning is one of the intelligent operations applied to autonomously moving robots and similar systems. It has notable strengths, such as enabling operation in unknown environments. However, it suffers from a basic problem called the “incomplete perception problem”. A variety of solutions have been proposed, but none has been decisive, and the resulting systems tend to become complex. A simple and effective solution has been desired.

A complex value function, which defines a state-action value as a complex number, is introduced. Time series information is incorporated into the phase part of the complex value. As a result, time series information enters the value function without a complicated algorithm, so the incomplete perception problem is effectively addressed with a simple implementation.
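The idea in the abstract can be illustrated with a minimal sketch. It is not the claimed procedure: the class name `ComplexQTable`, the reference phasor passed to each method, the fixed phase step `omega`, and the TD-style update rule are all assumptions made for illustration, since the abstract states only that state-action values are complex and that time series information is carried in the phase.

```python
import cmath
from collections import defaultdict


class ComplexQTable:
    """Toy complex-valued state-action value table (illustrative assumptions only)."""

    def __init__(self, actions, alpha=0.25, gamma=0.9, omega=cmath.pi / 6):
        self.actions = list(actions)
        self.alpha = alpha                      # learning rate (assumed)
        self.gamma = gamma                      # discount factor (assumed)
        self.rotation = cmath.exp(1j * omega)   # phase step per time step (assumed)
        self.q = defaultdict(complex)           # (observation, action) -> complex value

    def score(self, obs, action, reference):
        # Real part of value times conjugated reference: large when the
        # stored phase agrees with the phase of the current context.
        return (self.q[(obs, action)] * reference.conjugate()).real

    def greedy_action(self, obs, reference):
        return max(self.actions, key=lambda a: self.score(obs, a, reference))

    def update(self, obs, action, reward, next_obs, reference):
        # Hypothetical TD-style update: the bootstrapped target is rotated
        # by a fixed phase so that values pick up time-dependent phase.
        next_action = self.greedy_action(next_obs, reference)
        target = (reward + self.gamma * self.q[(next_obs, next_action)]) * self.rotation
        key = (obs, action)
        self.q[key] = (1 - self.alpha) * self.q[key] + self.alpha * target


# Two actions share one aliased observation; only their phases differ.
table = ComplexQTable(["left", "right"])
table.q[("corridor", "left")] = cmath.rect(1.0, 0.0)
table.q[("corridor", "right")] = cmath.rect(1.0, cmath.pi / 2)

early = cmath.rect(1.0, 0.0)           # reference phasor for one context
late = cmath.rect(1.0, cmath.pi / 2)   # reference phasor for another context
choice_early = table.greedy_action("corridor", early)
choice_late = table.greedy_action("corridor", late)
```

In this sketch the same observation yields different greedy actions depending on the reference phase, which is one way phase information could separate situations that look identical to the agent.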