Abstract:
It is possible to perform robot motor learning quickly and stably using a reinforcement learning apparatus including: a first-type environment parameter obtaining unit that obtains a value of one or more first-type environment parameters; a control parameter value calculation unit that calculates a value of one or more control parameters maximizing a reward by using the value of the one or more first-type environment parameters; a control parameter value output unit that outputs the value of the one or more control parameters to a control object; a second-type environment parameter obtaining unit that obtains a value of one or more second-type environment parameters; a virtual external force calculation unit that calculates a virtual external force by using the value of the one or more second-type environment parameters; and a virtual external force output unit that outputs the virtual external force to the control object.
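As a rough sketch of how these units might fit together (the class, method names, and reward model below are hypothetical illustrations, not taken from the patent):

```python
import numpy as np

class ReinforcementLearningApparatus:
    """Structural sketch of the claimed units (all names hypothetical)."""

    def __init__(self, policy, force_model):
        self.policy = policy            # maps first-type params to control params
        self.force_model = force_model  # maps second-type params to a virtual force

    def step(self, control_object):
        # First-type environment parameter obtaining unit.
        first_params = control_object.sense_first_type()
        # Control parameter value calculation unit: choose the candidate
        # control parameter value with the highest expected reward.
        candidates = self.policy.candidate_values(first_params)
        rewards = [self.policy.expected_reward(first_params, c) for c in candidates]
        best = candidates[int(np.argmax(rewards))]
        # Control parameter value output unit.
        control_object.apply_control(best)
        # Second-type environment parameter obtaining unit.
        second_params = control_object.sense_second_type()
        # Virtual external force calculation and output units.
        control_object.apply_virtual_force(self.force_model(second_params))
```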
Abstract:
A robot and a behavior control system for the same are capable of ensuring continued stability while the robot carries out a specified task through motion of its body. Time-series changing patterns of first state variables indicating the motional state of an arm are generated according to a stochastic transition model such that at least one of the first state variables follows a first specified motion trajectory for causing the robot to carry out the specified task. Similarly, time-series changing patterns of second state variables indicating the motional state of the body are generated according to the stochastic transition model such that the second state variables satisfy a continuously stable dynamic condition.
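The abstract does not specify the stochastic transition model; a minimal sketch, assuming a simple linear-Gaussian attractor toward the specified trajectory (an assumption, not the patent's model), could look like this:

```python
import numpy as np

def generate_pattern(x0, desired_traj, gain=0.5, noise_std=0.01, rng=None):
    """Time-series changing pattern of state variables that stochastically
    follows a specified motion trajectory, using the assumed transition
    x[t+1] = x[t] + gain * (desired[t] - x[t]) + Gaussian noise."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    pattern = [x.copy()]
    for target in desired_traj:
        x = x + gain * (np.asarray(target) - x) + rng.normal(0.0, noise_std, x.shape)
        pattern.append(x.copy())
    return np.array(pattern)

# First state variables (arm) track the task trajectory; second state
# variables (body) would additionally be checked against a stability
# condition (e.g., ZMP inside the support polygon) before being adopted.
arm_pattern = generate_pattern(x0=[0.0, 0.0],
                               desired_traj=[[0.1 * t, 0.05 * t] for t in range(20)])
```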
Abstract:
The present invention provides a motion control system that controls the motion of a second moving body (the robot) by considering both the environment that a human contacts, together with the motion mode appropriate to that environment, and the environment that the robot actually contacts. The motion mode is learned on the premise that it is sufficient to learn only the characteristic features of the human's motion mode, without needing to learn the rest. Likewise, on the premise that it is sufficient to reproduce only those characteristic features, the robot's motion mode is controlled using the model obtained from the learning result. The robot's motion mode is thereby controlled using the human's motion mode as a prototype, without restricting the robot's motion more than necessary.
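Purely as an illustration of the feature-part idea (the variance-based feature selection below is a hypothetical stand-in for whatever criterion the patent actually uses):

```python
import numpy as np

def learn_feature_part(demos, threshold=0.05):
    """From human demonstrations (trials x time x dims), keep only the
    'feature' dimensions, hypothetically those whose mean trajectory
    varies strongly over time, and model them by that mean trajectory."""
    demos = np.asarray(demos, dtype=float)
    mean_traj = demos.mean(axis=0)              # (time, dims)
    temporal_variation = mean_traj.std(axis=0)  # per-dimension variation
    feature_dims = np.where(temporal_variation > threshold)[0]
    return feature_dims, mean_traj[:, feature_dims]

def control_robot(default_traj, feature_dims, feature_model):
    """Reproduce only the learned feature part; the remaining dimensions
    keep the robot's own default motion, leaving it unrestricted there."""
    traj = np.array(default_traj, dtype=float)
    traj[:, feature_dims] = feature_model
    return traj
```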
Abstract:
In a mobile robot control system, the robot generates time-series data sequentially at a predetermined time interval and transmits it to an external terminal, and the external terminal receives the transmitted time-series data and attaches it to the motion command, so that the motion of the robot is determined based on both the newly generated time-series data and the time-series data attached to the motion command. This prevents the robot from suddenly starting to move when communication between the robot and the external terminal, which is the transmitting source of the motion command, recovers from a disconnection, thereby avoiding motion that would feel unnatural to the operator.
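A minimal sketch of this handshake, assuming a hypothetical message format in which the terminal echoes the last received time-series sample back with each command, and an assumed blending rule for the post-reconnection ramp:

```python
import collections

class Robot:
    """Sketch of the robot side (message format and blending rule assumed)."""

    def __init__(self, period_s=0.1):
        self.period_s = period_s
        self.history = collections.deque(maxlen=50)  # generated time-series data

    def tick(self, t, state):
        # Generate time-series data sequentially at a predetermined time
        # interval; the returned sample is transmitted to the terminal.
        sample = {"t": t, "state": state}
        self.history.append(sample)
        return sample

    def execute(self, command):
        # Motion is determined from both the newly generated data and the
        # data attached to the command. If the attached data is stale
        # (communication was disconnected), ramp in gently instead of
        # jumping straight to the commanded motion.
        latest = self.history[-1]
        attached = command["attached_data"]
        if attached is None:
            return latest["state"]  # no handshake yet: hold current motion
        age = latest["t"] - attached["t"]
        ramp = min(1.0, self.period_s / max(age, self.period_s))
        return ramp * command["target"] + (1.0 - ramp) * latest["state"]

class Terminal:
    """Sketch of the external terminal side."""

    def __init__(self):
        self.last_received = None

    def receive(self, sample):
        self.last_received = sample

    def make_command(self, target):
        # Add the received time-series data to the motion command.
        return {"target": target, "attached_data": self.last_received}
```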
Abstract:
A reinforcement learning system (1) of the present invention utilizes the value of a first value gradient function (dV1/dt) in the learning performed by a second learning device (122), namely in evaluating a second reward (r2(t)). The first value gradient function (dV1/dt) is the temporal differential of a first value function (V1), which is defined according to a first reward (r1(t)) obtained from the environment and serves as the learning result of a first learning device (121). The action policy that the robot (R) should take to execute a task is determined based on the second reward (r2(t)).
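The abstract leaves the functional form open; one simple reading, sketched below with a finite-difference estimate of dV1/dt and an additive combination (both assumptions), is:

```python
def second_reward(v1_prev, v1_curr, dt, base_r2):
    """Evaluate the second reward r2(t) using the first value gradient
    dV1/dt, the temporal differential of the first value function V1.

    dV1/dt is estimated by finite differences, and the additive
    combination with a base reward is an illustrative assumption; the
    abstract states only that dV1/dt is utilized in evaluating r2(t)."""
    dv1_dt = (v1_curr - v1_prev) / dt
    return base_r2 + dv1_dt

# A second learning device would then treat second_reward() as its reward
# signal and learn the action policy the robot should take for the task.
```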
Abstract:
A provisional desired motion trajectory of an object is determined based on a moving plan of the object. It is then determined whether a robot leg motion that satisfies a restrictive condition at each step, up to a predetermined number of future steps, can also satisfy a necessary requirement on the position/posture relationship between the object and the robot at a predetermined future step. If the requirement can be satisfied, a desired gait is generated on the basis of the provisional desired motion trajectory; otherwise, a desired gait is generated on the basis of a desired object motion trajectory according to a corrected moving plan.
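The decision flow can be sketched as follows; the one-dimensional geometry, step limit, and reach threshold are toy stand-ins for the actual restrictive condition and position/posture requirement:

```python
def requirement_satisfied(object_traj, robot_pos, max_step, reach, check_step):
    """Can the robot, limited to `max_step` of travel per step (the
    restrictive condition on leg motion), be within `reach` of the object
    at the predetermined future step `check_step`?"""
    pos = robot_pos
    for k in range(1, check_step + 1):
        target = object_traj[k]
        # Greedy leg motion obeying the restrictive condition at each step.
        pos += max(-max_step, min(max_step, target - pos))
    return abs(object_traj[check_step] - pos) <= reach

def generate_desired_gait(moving_plan, robot_pos, max_step=0.4, reach=0.8,
                          check_step=3):
    provisional = moving_plan  # provisional desired object trajectory
    if requirement_satisfied(provisional, robot_pos, max_step, reach, check_step):
        return provisional     # gait generated from the provisional trajectory
    # Otherwise correct the moving plan (here, simply slowing the object
    # down) and generate the gait from the corrected trajectory.
    return [p * 0.5 for p in provisional]

plan = [0.0, 0.5, 1.0, 1.5]    # object position at each step
basis = generate_desired_gait(plan, robot_pos=0.0)
```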
Abstract:
In a legged mobile robot, in particular a biped robot having arms, control is conducted such that dynamic balance is preserved and a stable posture is kept even when the robot is subjected to an unexpected reaction force from an object. The difference, or error (i.e., the moment about the central point of the total floor reaction force), between the desired object reaction force and its actual value is determined and distributed to the desired body position/posture and the desired feet position/posture, and the robot's link joints are driven accordingly.
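A minimal sketch of the error distribution, assuming a simple proportional split (the gain values and the split rule are illustrative assumptions):

```python
import numpy as np

def distribute_reaction_error(desired_force, actual_force, lever_arm,
                              body_gain=0.6, feet_gain=0.4):
    """Moment error about the central point of the total floor reaction
    force, split between desired body and feet corrections.

    The proportional split and gain values are illustrative assumptions;
    the abstract states only that the error is distributed to the desired
    body position/posture and the desired feet position/posture."""
    force_error = np.asarray(desired_force) - np.asarray(actual_force)
    moment_error = np.cross(lever_arm, force_error)  # moment about the point
    return body_gain * moment_error, feet_gain * moment_error
```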
Abstract:
In a biped walking robot having a body and two articulated legs, each connected to the body through a hip joint and having a knee joint and an ankle joint connected by a shank link, a knee pad is mounted on the shank link, adjacent to the knee joint, as a landing/shock-absorbing means. When the robot comes into knee-first contact with the floor, the pad contacts the floor such that the knee joint is positioned forward of the robot's center of gravity in the direction of advance, while absorbing the impact of the contact. As a result, the robot can easily stand up from a posture with its knee-joint regions in contact with the floor, and when landing knee-first it absorbs the impact of the contact, protecting both the knee-joint regions and the floor from damage.
Abstract:
A trajectory generation method for a member such as a foot of a legged mobile robot. First, basic trajectories defining some typical motions of the foot, including a constraint condition, are established on a virtual plane or surface fixed on a coordinate system. The virtual plane is kept fixed on the ground until the time the foot is to be lifted. Then, during the period free from the constraint condition, the coordinate system is displaced such that the virtual surface coincides with the point on the ground at which the foot is to land. A trajectory from footrise to footfall is thus generated by combining the basic trajectories in the coordinate system with the displacement of the coordinate system. The boundary conditions thereby become extremely simple, and trajectory generation is greatly simplified. Real-time trajectory correction can be conducted if desired.
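A minimal sketch of the composition, assuming a sinusoidal clearance bump as the basic trajectory and a linear displacement of the coordinate frame (both assumptions for illustration):

```python
import numpy as np

def foot_trajectory(liftoff, landing, n=20, clearance=0.05):
    """Footrise-to-footfall trajectory composed of a basic trajectory in a
    local coordinate frame plus a displacement of that frame from the
    liftoff point to the landing point."""
    s = np.linspace(0.0, 1.0, n)
    # Basic trajectory on the virtual plane: no horizontal motion, a simple
    # clearance bump with boundary conditions z(0) = z(1) = 0.
    basic_z = clearance * np.sin(np.pi * s)
    # Displacement of the coordinate system so that the virtual surface
    # ends up coinciding with the landing point.
    shift = np.outer(s, np.asarray(landing, dtype=float)
                        - np.asarray(liftoff, dtype=float))
    traj = np.asarray(liftoff, dtype=float) + shift
    traj[:, 2] += basic_z
    return traj

swing = foot_trajectory(liftoff=[0.0, 0.1, 0.0], landing=[0.3, 0.1, 0.0])
```

Because the basic trajectory already satisfies its boundary conditions in the local frame, only the frame displacement depends on where the foot must land, which is what makes the boundary conditions simple and real-time correction straightforward.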