Granted invention patent
- Patent title: Value function representation method of reinforcement learning and apparatus using this
- Patent title (Chinese): 强化学习的价值函数表示方法及其使用的装置
- Application number: US12065558; Filing date: 2006-08-18
- Publication number: US08175982B2; Publication date: 2012-05-08
- Inventors: Tomoki Hamagami , Takesi Shibuya
- Applicants: Tomoki Hamagami , Takesi Shibuya
- Applicant address: JP Yokohama
- Assignee: Nat'l University Corp. Yokohama Nat'l University
- Current assignee: Nat'l University Corp. Yokohama Nat'l University
- Current assignee address: JP Yokohama
- Agent: Westerman, Hattori, Daniels & Adrian, LLP
- Priority: JP2005-254763 20050902
- International application: PCT/JP2006/316659 WO 20060818
- International publication: WO2007/029516 WO 20070315
- Main classification: G06F15/18
- IPC classification: G06F15/18 ; G06E1/00
Abstract:
Reinforcement learning is an intelligent control technique applied to autonomous mobile robots and similar systems. It has notable strengths, such as enabling operation in unknown environments. However, it suffers from a fundamental problem known as the “incomplete perception problem”. A variety of solutions have been proposed, but none has been decisive, and the resulting systems tend to become complex. A simple and effective solution has been desired.

A complex value function, which defines a state-action value as a complex number, is introduced. Time-series information is carried in the phase part of the complex value. As a result, time-series information enters the value function without the use of a complicated algorithm, so the incomplete perception problem is effectively solved by a simple implementation of the method.
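
The idea of a complex-valued state-action value can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything below is an assumption made for illustration only: the function names, the fixed phase step `omega`, the use of a complex "reference" value to score actions, and the Q-learning-style update are hypothetical and should not be read as the patented method.

```python
import cmath

def effective_value(q, reference):
    """Real-valued score for a complex Q value (assumed scoring rule).

    The score is large when the phase of q is aligned with the phase of
    the reference value, so the same magnitude can rank differently
    depending on where the agent is in its recent time series.
    """
    return (q * reference.conjugate()).real

def select_action(q_row, reference):
    """Pick the action whose complex value best matches the reference phase."""
    return max(q_row, key=lambda a: effective_value(q_row[a], reference))

def update(q, reward, next_q, alpha=0.5, gamma=0.9, omega=cmath.pi / 6):
    """Q-learning-style update with a phase rotation (assumed form).

    Rotating the target by a fixed angle omega at every step means that
    the phase of a stored value encodes how many steps back it was set.
    """
    rotation = cmath.exp(1j * omega)
    target = (reward + gamma * next_q) * rotation
    return (1 - alpha) * q + alpha * target

# Two actions with equal magnitude but different phases.
q_row = {
    "left": cmath.rect(1.0, 0.0),
    "right": cmath.rect(1.0, cmath.pi / 2),
}
choice_a = select_action(q_row, cmath.rect(1.0, 0.0))
choice_b = select_action(q_row, cmath.rect(1.0, cmath.pi / 2))
```

In this sketch, an observation that looks identical at two points in an episode can still lead to different actions, because the reference phase differs between the two visits. That is one plausible reading of how phase information could address incomplete perception without an explicit memory structure.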