Monday, February 21, 2011
Ernest 9.0
Ernest 9.0 is a honey-bee. She likes roaming toward flashy colors in her environment, and she uses colored spots as landmarks for navigation.
She alternates between two states: hungry and thirsty. When she is hungry, her thorax is violet; when she is thirsty, her thorax is blue. She does not know a priori which landmarks can be eaten or drunk, so she tries them all.
In this experiment, we can see a first learning phase (steps 0-150) where she learns sensorimotor contingencies by following her intrinsic motivation to visit landmarks (as before and discussed here). She has an additional taste sense that informs her whether the place where she is standing can be eaten or drunk; while roaming, she thus learns that only blue landmarks can be drunk. After step 150, she roams more efficiently from landmark to landmark until she ends up on a violet landmark and discovers that it can be eaten (steps 150-380).
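To make this concrete, here is a minimal sketch in Java of how the alternating needs and the taste sense could be represented. The names and structure are hypothetical illustrations, not the actual Ernest code.

```java
// Minimal sketch (hypothetical names, not the actual Ernest code): two alternating
// need states and a taste sense that only reports on the square the bee stands on.
class TasteSenseSketch {
    enum Need  { HUNGRY, THIRSTY }
    enum Taste { NONE, EDIBLE, DRINKABLE }

    static Need need = Need.THIRSTY;

    /** True if the taste of the current square satisfies the current need. */
    static boolean satisfied(Taste tasteHere) {
        return (need == Need.HUNGRY  && tasteHere == Taste.EDIBLE)
            || (need == Need.THIRSTY && tasteHere == Taste.DRINKABLE);
    }

    public static void main(String[] args) {
        if (satisfied(Taste.DRINKABLE)) {
            need = Need.HUNGRY; // drinking done: the other need takes over
        }
        System.out.println(need); // prints HUNGRY
    }
}
```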
She associates landmarks with the fulfillment of specific needs. For example, when she is thirsty, she memorizes roughly how far she travels from each specific landmark to a place where she drinks. When she gets thirsty again, she preferentially navigates toward the landmarks she remembers as closest to the drinking area.
This prompts her to go back to the drinking area even though the blue square is hidden behind the wall (steps 380-430). Similarly, she goes back to the eating area after drinking. Distance values are updated over time, so navigation improves with experience and exploits environmental regularities (if we introduce blue and violet squares without respecting a drinking area and an eating area, she only finds them through random browsing).
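Here is a minimal Java sketch of the kind of landmark-distance memory described above. The class, names, and learning rate are hypothetical, not the actual implementation: a per-need estimate of how far each landmark is from the place where the need was satisfied, updated each time the need is satisfied and queried to pick the closest-remembered landmark.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (hypothetical names): for each need, remember roughly how far
// each landmark is from the place where that need was last satisfied, and prefer
// the landmark remembered as closest when the need arises again.
class LandmarkMemory {
    // distances.get(need).get(landmark) = estimated steps from landmark to satisfaction
    private final Map<String, Map<String, Double>> distances = new HashMap<>();
    private static final double RATE = 0.5; // assumption: simple exponential moving average

    /** Called when a need is satisfied, for each landmark passed on the way there. */
    void update(String need, String landmark, int stepsSinceLandmark) {
        Map<String, Double> byLandmark = distances.computeIfAbsent(need, k -> new HashMap<>());
        double old = byLandmark.getOrDefault(landmark, (double) stepsSinceLandmark);
        byLandmark.put(landmark, old + RATE * (stepsSinceLandmark - old));
    }

    /** Among the landmarks currently in view, the one remembered as closest
     *  to a place where the given need was satisfied (null = no memory yet). */
    String preferred(String need, Iterable<String> visibleLandmarks) {
        Map<String, Double> byLandmark = distances.getOrDefault(need, new HashMap<>());
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (String landmark : visibleLandmarks) {
            double d = byLandmark.getOrDefault(landmark, Double.POSITIVE_INFINITY);
            if (d < bestDistance) { bestDistance = d; best = landmark; }
        }
        return best; // null means fall back to random roaming
    }
}
```

If the drinking and eating areas are not stable, such estimates never settle, which matches the random-browsing behavior we observe when the regularity is removed.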
To obtain these behaviors, we improved both the visual system and the motivational system in a tightly intertwined way, as we will report next.
Thursday, February 10, 2011
Ernest 8.4
Ernest 8.4 is the same as Ernest 8.3 except that Ernest 8.4's eyes can distinguish between various colors. In particular, Ernest 8.4 can distinguish between two kinds of targets: blue targets and violet targets.
Ernest 8.4's eyes can also distinguish between singularities in the constitution of walls: orange bricks and yellow bricks. So far, however, Ernest's motivational system does not exploit these distinctions. Bricks leave Ernest 8.4 totally indifferent.
The next step will consist of making Ernest's motivations depend on internal states. For example, Ernest would pursue blue targets when he is "thirsty" and violet targets when he is "hungry". Moreover, we would like Ernest to learn to use environmental singularities as landmarks to navigate toward the desired target. Our idea is to make Ernest autonomously acquire new motivations to navigate toward landmarks that he has associated with desired targets.
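A minimal sketch of such state-dependent motivation could look as follows. The names and satisfaction values are hypothetical illustrations, not a commitment to a specific implementation.

```java
// Minimal sketch (hypothetical names and values): the value of a visual target
// depends on the current internal state, so blue targets are attractive when
// "thirsty" and violet targets when "hungry"; bricks remain neutral.
class StateDependentMotivation {
    enum State { HUNGRY, THIRSTY }

    /** Satisfaction value of moving toward a target of the given color. */
    static int targetValue(State state, String targetColor) {
        switch (targetColor) {
            case "blue":   return state == State.THIRSTY ? 10 : 0;
            case "violet": return state == State.HUNGRY  ? 10 : 0;
            default:       return 0; // orange and yellow bricks leave Ernest indifferent
        }
    }

    public static void main(String[] args) {
        System.out.println(targetValue(State.THIRSTY, "blue"));   // 10: worth pursuing
        System.out.println(targetValue(State.THIRSTY, "violet")); // 0: ignored for now
    }
}
```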
Wednesday, February 9, 2011
Ernest 8.3 The Ernestor
Ernest 8.3 is the same as Ernest 8.2; only the interface with his environment has changed.
Ernest 8.3's turning actions make him turn by PI/4 rather than PI/2 as before. Accordingly, Ernest 8.3 can now move forward diagonally. Also, Ernest 8.3's eyes have a narrower angular span of PI/8 each (rather than PI/2 for Ernest 8.2).
These settings require a longer learning phase than before, because of the topological differences between diagonals and straight lines, and because the reduced visual field requires more complex behaviors to find the targets. The fact that the same Ernest algorithm can learn to deal with these different settings again demonstrates the algorithm's robustness.
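For concreteness, here is a small sketch of the new interface geometry. The constants and names are assumptions for illustration only; in particular, PI/8 is treated here as the full span of each eye.

```java
// Small geometry sketch (assumed constants, not the actual interface code):
// PI/4 turns give eight possible headings, hence diagonal moves.
class Ernest83Geometry {
    static final double TURN_ANGLE = Math.PI / 4; // was PI/2 in Ernest 8.2
    static final double EYE_SPAN   = Math.PI / 8; // was PI/2 in Ernest 8.2

    /** New heading after turning left (+1) or right (-1), normalized to [0, 2*PI). */
    static double turn(double heading, int direction) {
        double h = (heading + direction * TURN_ANGLE) % (2 * Math.PI);
        return h < 0 ? h + 2 * Math.PI : h;
    }

    /** Whether a target at the given bearing relative to the eye's axis is visible. */
    static boolean inEyeField(double relativeBearing) {
        return Math.abs(relativeBearing) <= EYE_SPAN / 2;
    }

    public static void main(String[] args) {
        double heading = turn(0.0, 1);                 // from "east" to "north-east": a diagonal heading
        System.out.println(heading);                   // 0.7853... (PI/4)
        System.out.println(inEyeField(Math.PI / 4));   // false: outside the narrow field of view
        System.out.println(inEyeField(Math.PI / 32));  // true: almost straight ahead of the eye
    }
}
```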
This example video shows that Ernest is now a pretty serious predator in the grid world. We call him Ernestor-Rex, or e-Rex. As opposed to poor Ernest 8.2, the Ernestor-Rex does not get trapped in infinite loops between prey (see step 330 and onward). This is because his narrow visual field makes him attend to a single prey at a time.
Friday, February 4, 2011
Poor Ernest 8.2
In this example, Ernest found another strategy that consisted of moving in a straight line while systematically checking to his side to see whether he had become aligned with the blue square.
On step 120, we created a situation where the blue square gets hidden by a wall when Ernest enacts this strategy. When he saw the blue square, Ernest started moving toward it, but he then reached a point where the blue square became hidden behind the wall. This situation again illustrates that Ernest is not driven by a final goal but by rudimentary intrinsic motivations: when the attraction of the blue square disappears, Ernest simply stops and spins in place (motivated to look for a new blue square).
On step 230, we inserted two blue squares. In this particular instance, Ernest got locked in an infinite loop between the two blue squares. Again, Ernest's behavior fits the subjective explanation that he just enjoys moving toward blue squares, which he can keep doing continuously in this specific loop.
Yet, we would like Ernest to be smarter and show a bit more determination in finding blue squares. The next step might be to learn to recognize specific locations in space. Learning the persistence of spatial locations might be an interesting prerequisite to learning the persistence of objects. To recognize specific locations, Ernest will need a better visual system.