Artificial Developmental Learning: 2011

Tuesday, November 29, 2011

Ernest 10.5

Ernest 10.5 has social drives. He loves cuddling with other Ernests.

See the 3D demo here (better with Chrome).

Ernest 10.5 is similar to Ernest 10.4 except that he constructs bundles to represent other Ernests. These bundles become attractive when he discovers that he can cuddle with other Ernests (same principle as with food).

Thursday, November 10, 2011

Ernest 10.4

Ernest 10.4 has an improved local-space memory and an improved bundle construction mechanism (see our discussion on bundles here).

On step 188, there are no more fish within the range of Ernest's vision or within his tactile perception, but Ernest is still aware of the presence of a fish behind him. His local-space awareness makes him turn back towards that fish.

Note that Ernest's spatial awareness is still local: when he is in the lower part of the board, he is unaware of the presence of fish in the upper part of the board.

(Video also available on YouTube)

Friday, November 4, 2011

Ernest in NetLogo

Please click here to try the demonstration of Ernest 8 in NetLogo. This demonstration was developed by Ilias Sakellariou of the University of Macedonia.

We kept Ernest's algorithm in Java. We developed a NetLogo extension to access Ernest's Java algorithm from within NetLogo. We called this extension IMOS (Intrinsic MOtivation System). Like the Ernest algorithm, IMOS is open source. We are happy to share it and provide all necessary support.

IMOS allows you to easily include environment-agnostic intrinsically motivated agents in your NetLogo models.

Wednesday, September 14, 2011

Ernest 10.3 continuous

Ernest 10.3 also works in a continuous space. This again demonstrates the robustness of the algorithm.

In this experiment, however, the time is still "discrete": Ernest 10.3 cannot turn while he is still moving forward or vice versa. We now need to investigate what happens if the next action can be triggered before the previous action was completed.

Also, we are still investigating how Ernest can construct a representation of his surrounding space. This should be easier in a continuous space than in a grid because all directions appear equivalent.

This continuous version of the environment was developed by Simon Gay.

Thursday, September 8, 2011

Ernest 10.3 - 3D

You can now interact with Ernest 10.3 in 3D (better with Chrome but also works with Firefox; requires a recent graphics processing unit).

The 3D environment is called SECA, for Simulated Environment for Constructivist Agents. SECA was developed by Olivier Voisin as part of his master's thesis.

Thursday, September 1, 2011

Abstract-Lite

We are pleased to announce that our trace visualization tool is now freely available online.

This visualization tool is called Abstract-Lite. Full documentation is also provided online.

Abstract-Lite offers an interactive interface to visualize your activity traces. It relies on XPATH instructions and XSL stylesheets to help you customize your visualizations. Abstract-Lite can display your traces in real time. For an example, see Ernest 10.3's trace.

Abstract-Lite was developed by Pierre-Yves Ronot as part of the ABSTRACT project and the IDEAL project.

Friday, July 22, 2011

An Intrinsically-Motivated Schema Mechanism to Model and Simulate Emergent Cognition

An Intrinsically-Motivated Schema Mechanism to Model and Simulate Emergent Cognition. Olivier Georgeon, Frank Ritter (2012). Cognitive Systems Research, 15-16. pp. 73-92. doi:10.1016/j.cogsys.2011.07.003.

This is the main paper that presents Ernest's algorithm. Specifically, the paper provides a detailed presentation of Ernest 7.2 and discusses the implications for the study of emergent cognition.

Tuesday, June 14, 2011

Early-Stage vision of composite scenes

Early-Stage Vision of Composite Scenes for Spatial Learning and Navigation. In the proceedings of the First Joint IEEE Conference on Development and Learning and on Epigenetic Robotics (ICDL-EPIROB 2011). Olivier L. Georgeon, James B. Marshall, Pierre-Yves R. Ronot. Frankfurt (24-27 August 2011). pp. 224-229.

This 6-page paper reports Ernest 9.1's experiment and discusses how Ernest learns early-stage sensorimotor and spatial regularities. The paper also introduces Ernest 9's cognitive architecture.

Vision as a Developmental Sensorimotor Process

A Model and Simulation of Early-Stage Vision as a Developmental Sensorimotor Process. In the proceedings of Artificial Intelligence, Applications and Innovations (AIAI 2011). Olivier Georgeon, Mark Cohen, Amélie Cordier. Corfu, Greece (15-18 September 2011). pp. 11-16.

This 6-page paper presents Ernest 8.2 and discusses the connection with the sensorimotor theory of vision (O'Regan & Noë, 2001). This study suggests implementing a visual system that detects changes in the visual field rather than static features.

Monday, May 30, 2011

Train your Ernest

Thanks to Olivier Voisin, you can now train your own Ernest online in 3D: click here to access Ernest 8.3-3D (Preferably with Chrome, but works also with Firefox).

Saturday, May 28, 2011

Poor Ernest 10.3

We hesitated between calling this experiment poor Ernest 10.3 and happy Ernest 10.3. Maybe this is the first implementation of artificial dizziness :-).

After the initial learning phase, we introduce fish that move (step 158). We were happily surprised to see that Ernest 10.3 was not that bad at catching moving fish. We find it fairly honorable for a purely feed-forward adaptive system.

Monday, May 23, 2011

Ernest 10.3's traces

This video shows the trace generated by Ernest 10.3 during the experiment reported in the previous post. Playing the two videos synchronously helps understand Ernest's activity.

As in Ernest 10.1's traces, the tape at the bottom represents the visual field. The twelve visual pixels are represented vertically as little rectangles when Ernest is moving forward, and as little trapezoids when Ernest is turning.

The second tape above the visual field represents the peripersonal map (known as the local map in Ernest 10.1). The grid cell in front of Ernest is represented in the center of the tape (in red when Ernest is bumping). The three cells on the left side of Ernest are represented in the upper part of the tape. The three cells on the right side of Ernest are represented in the lower part of the tape. The cell where Ernest is standing and the cell behind Ernest are not represented. The peripersonal map integrates stimulations from different sensory modalities: tactile (light gray, intermediary gray, and black), visual (colored), and kinematic (red). These different stimulations may be bundled together (see the discussion on bundles in Ernest 10.1).

The central part of the video represents Ernest's primitive interaction patterns (primitive enacted acts) that relate to Ernest's decisions (i.e., enacted acts both result from Ernest's previous decision and impact Ernest's next decision). The central line takes the color of the sensory salience that attracts Ernest's current attention (gray in the case of a tactile salience). Triangles that point outwards from the central line indicate that the salience is moving outwards—to the left when the triangle is above the line, and to the right when the triangle is below. Triangles that point inwards to the line indicate that the salience is moving inwards—from the left when the triangle is above the line, and from the right when the triangle is below. Little squares on the central line indicate that the salience of current attention is enlarging in the central area of the visual field. Large squares on the central line represent eating a fish (from gustatory stimulation).

Above the decisional tape is the motivational tape. The motivational tape represents Ernest's current satisfaction value as a little histogram. Positive satisfactions are displayed in green and negative satisfactions in red. See how Ernest enjoys eating fish :-).

Above the motivational tape, orange stripes and circles represent composite schemas that Ernest tries to enact as a whole sequence (from step 128 on). These sequences correspond to regularities of interaction that Ernest has discovered and learned, and Ernest is beginning to exploit them. We can see that these sequences may contain steps with negative satisfaction values (steps 185, 195, 218...). This demonstrates that Ernest learns to knowingly enact unsatisfying interactions in order to gain subsequent satisfying interactions.
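To make the trade-off concrete, here is a minimal sketch of how a sequence with a negative step can still be worth enacting. It assumes that the value of a sequence is simply the sum of the satisfaction values of its steps; the function names and the numeric values are hypothetical and are not taken from Ernest's Java code.

```python
def sequence_satisfaction(step_values: list[int]) -> int:
    """Total satisfaction of a sequence, assuming step values simply add up."""
    return sum(step_values)


def worth_enacting(step_values: list[int]) -> bool:
    """A sequence is kept as a candidate when its total satisfaction is positive."""
    return sequence_satisfaction(step_values) > 0


# Hypothetical values: an unpleasant turn (-5) followed by eating a fish (+20).
turn_then_eat = [-5, 20]
# Hypothetical values: moving forward (+1) and then bumping into a wall (-10).
forward_then_bump = [1, -10]

print(worth_enacting(turn_then_eat))      # True
print(worth_enacting(forward_then_bump))  # False
```

Under this assumption, the negative first step is accepted because the sequence as a whole pays off.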

Friday, May 20, 2011

Ernest 10.3

In his management of space, Ernest 10.3 distinguishes between two concentric zones: the peripersonal space and the extrapersonal space.

The peripersonal space is the realm of proximal perception: touch, kinematic perception (bumping), and taste (eating). The peripersonal space covers the 3x3 matrix of grid cells surrounding Ernest.

In contrast, the extrapersonal space is the realm of distal perception: sight (that also works as a kind of olfaction for Ernest). The extrapersonal space covers the rest of the surrounding world beyond the peripersonal space.
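The split between the two zones can be sketched as a simple distance test. This is only an illustration: it assumes grid offsets relative to Ernest's cell, and the function and dictionary names are hypothetical.

```python
def perception_zone(d_row: int, d_col: int) -> str:
    """Classify a cell by its offset from Ernest's own cell.

    The 3x3 block centered on Ernest (offsets of at most one cell in each
    direction) is peripersonal; everything farther away is extrapersonal.
    """
    if max(abs(d_row), abs(d_col)) <= 1:
        return "peripersonal"
    return "extrapersonal"


# Sensory modalities listed in the post for each zone.
MODALITIES = {
    "peripersonal": ["touch", "kinematic", "taste"],
    "extrapersonal": ["sight"],
}

print(perception_zone(0, 1))   # peripersonal
print(perception_zone(-3, 2))  # extrapersonal
```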

Researchers in neuropsychology have argued that these two spaces are handled by partially distinct parts of the brain in vertebrates. Moreover, behaviors would be driven by different motivational systems. Behaviors in the peripersonal space would be related to consummatory motivation associated with the noradrenergic system. Behaviors in the extrapersonal space would be related to incentive motivation associated with the dopaminergic system (Previc, 1998, p. 124).

Following these views, we have endowed Ernest with a (partially) distinct motivational system associated with each space. This video shows that this new dual-space motivational system makes Ernest much better at avoiding bumping and at catching fish in shoals. See steps 93-98, where Ernest hesitates between moving toward another shoal and continuing to eat from the same shoal.

Similar to Ernest 10.1, Ernest 10.3 constructs a local map of his peripersonal space (displayed in the upper-right corner). This local map is a place of multimodal integration that facilitates behavioral consistency across the two spaces, peripersonal and extrapersonal.

After the initial learning phase (roughly up to step 80 in this video), Ernest 10.3 exhibits a more coherent spatial behavior than poor Ernest 9.3. In particular, Ernest now avoids bumping as well as useless turning toward walls.

Also worth noting is that Ernest adopts a head-bobbing behavior (like birds) consisting of turning left then right after eating (from step 128 on). We explain this strategy by the fact that Ernest's sensory system is mostly sensitive to movement (see the discussion on Ernest 8.1 strategy learning).

Reference

Previc, F. H. (1998) The neuropsychology of 3-D space. Psychological Bulletin. 124 (2), pp 124-164.

Friday, May 6, 2011

Ernest 10.2 simulates spatial movements

Ernest 10.2 anticipates the consequences of his actions in his local map. When the local map predicts that Ernest would bump into a wall, the action of moving forward is inhibited (not enacted) to prevent the bumping.

The local map is now displayed with the shape of a shark in square k6. Unlike in the previous experiment, this display shows the anticipation made during the previous step. For example, on step 16, the local map shows that Ernest was unable to anticipate (from step 15) the appearance of the yellow square in front of him. On step 17, however, he was able to anticipate (from step 16) that the yellow square would shift to the right when he turns to the left (the local map is most often incomplete and sometimes wrong).

When the local map predicts that Ernest would bump into a wall, the local map shows a red circle on Ernest's nose. This first occurs on steps 10, 32, and 58. On step 58, the fact that the bumping was prevented is shown by the fact that the wall in front of Ernest does not flash red.

As before, Ernest creates bundles that associate different sensory stimulations together. For example, on step 9, Ernest predicts that moving forward would make him stand on a wall. This false prediction is due to the fact that Ernest does not yet know that walls cause bumping. When Ernest actually experiences the bumping on step 9, he bundles together the tactile stimulation of walls with the bumping stimulation (the dark green color is not associated with this bundle because Ernest's visual attention was distracted by the yellow alga, which caused him to not see the wall). In contrast, on step 90, Ernest did see the turquoise wall, which caused him to learn the "turquoise wall bundle".
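The interplay between bundles and the inhibition of moving forward can be sketched roughly as follows. This is a simplified illustration, not the hard-coded routines mentioned in this post: the class, the function, and the stimulation labels ("hard", "bump") are all hypothetical.

```python
class BundleMemory:
    """Stores which stimulations have been experienced together."""

    def __init__(self) -> None:
        self.bundles: dict[str, set[str]] = {}

    def associate(self, tactile: str, *stimulations: str) -> None:
        """Bundle a tactile stimulation with co-occurring stimulations."""
        self.bundles.setdefault(tactile, set()).update(stimulations)

    def predicts(self, tactile: str, stimulation: str) -> bool:
        """True if this tactile stimulation is already bundled with the other one."""
        return stimulation in self.bundles.get(tactile, set())


def allowed_actions(candidates: list[str], front_tactile: str,
                    memory: BundleMemory) -> list[str]:
    """Drop 'move forward' when the cell in front is predicted to cause a bump."""
    if memory.predicts(front_tactile, "bump"):
        return [action for action in candidates if action != "move forward"]
    return list(candidates)


memory = BundleMemory()
actions = ["turn left", "move forward", "turn right"]

# Before any bump was experienced, nothing inhibits moving toward a hard cell.
print(allowed_actions(actions, "hard", memory))

# After bumping, the hard tactile stimulation is bundled with the bump.
memory.associate("hard", "bump")
print(allowed_actions(actions, "hard", memory))  # ['turn left', 'turn right']
```

The first call mirrors the false prediction on step 9: until the bump has been experienced, the wall looks like any other cell that Ernest could move onto.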

We expect that Ernest's new coupling between his intrinsically-motivated sequential learning system and his spatial representation system can lead to valuable new developments. Yet, many issues remain. In particular, the current implementations of the local map and the bundle mechanism are based on some ad-hoc routines that we have hard-coded. For Ernest to adapt to other environments, these mechanisms need to be learned rather than hard-coded. We believe that such learning could be implemented with statistical methods, such as those used in robotics studies (e.g., Kuipers 2000).

References

Kuipers, B. (2000) The Spatial Semantic Hierarchy. Artificial Intelligence. 119 (1-2), pp 191-233.

Tuesday, April 26, 2011

Ernest 10.1's activity traces

Thanks to Pierre-Yves Ronot, we can now see Ernest's activity traces in real time.

In this video, the lower part of the trace shows Ernest's visual field. Visual pixels are represented as little rectangles when Ernest is moving forward, and as little trapezoids when Ernest is turning. Vertical red lines across the visual field indicate that Ernest is bumping into a wall. Blue circles indicate that he is eating a fish. The trace shows the different items in the environment going back and forth through the visual field as Ernest is moving.

We can see the initial phase of sensorimotor learning (steps 0-90) during which Ernest is often bumping into walls. Notably, his visual apparatus does not inform him about his distance to walls.

The line at the upper part of the trace takes the color of the item that receives Ernest's current attention. This item of visual attention is also indicated by the "pie" in the local map as reported in the previous post.

For example, before eating his first fish, Ernest is distracted by other items (see steps 90 to 99). After having tasted a fish, however, Ernest focuses on fish whenever he sees them (steps 148 to 166).

Traces help us understand how Ernest sees his world. Ernest receives no other information from the world than that displayed in the trace. Traces also represent Ernest's experience that he encodes in his episodic memory. The encoding of experience in the form of activity episodes constitutes the base of Ernest's learning.

Monday, April 25, 2011

Ernest 10.1 constructs a local map of his environment

Ernest 10.1 associates co-occurring sensory stimulations together into bundles. A bundle is a set of sensory stimulations that denotes an object in the world. By constructing bundles, Ernest starts to perform multi-modal sensory integration.

We assume that a bundle can represent a unique kind of object in the world. We drew this assumption from David Hume's bundle theory of objects. This theory postulates that objects consist only of the set (bundle) of their observable properties.

Additionally, Ernest 10.1 constructs a local map of the bundles surrounding him. In this video, the local map is displayed in cell k6 (next to the cycle counter). The local map represents Ernest's awareness of his local surrounding. The top of the map corresponds to the front of Ernest. Ernest's current visual stimulation is represented as a "pie" over the map.

The local map is based on Ernest's somatotopic map. A bundle is learned by associating the tactile stimulation in front of Ernest (the center-top cell in the somatotopic map) with the simultaneous visual stimulation (the "pie"). As he moves, Ernest is then able to follow the newly-created (or recognized) bundle in the somatotopic map.

For example, Ernest creates a bundle for the yellow square on step 27. Then, on step 28, this bundle moves to the center of the map; on step 30, to the center-rear of the map; and, on step 31, to the left-rear corner of the map (the local map is displayed with a delay of one cycle).

When Ernest eats, he associates his gustatory stimulation with the bundle that represents where he is standing, that is, the "fish bundle" that was previously constructed from visual and tactile stimulations. This delicious taste associated with fish then makes Ernest prefer pursuing fish rather than alga in subsequent activity.

Apart from his still-unexploited local map, and from the fact that his visual field was reduced to a single row of 12 pixels, Ernest 10.1 is the same as poor Ernest 9.3. We now need to investigate how Ernest would use his local map.

Friday, April 8, 2011

Ernest 10.0 has a somatotopic map

Ernest 10.0 is a shark. Sharks are archaic vertebrates whose brains haven't evolved much over the last 450 million years or so. Yet, a shark's brain contains the same set of basic anatomical components as modern vertebrates' brains (Brain, Wikipedia).

In particular, Ernest 10.0 has a somatotopic map—a brain area that represents Ernest's body in an isomorphic way. In humans, this area would correspond to the postcentral gyrus, also known as the primary somatosensory cortex.

In this video, Ernest's somatotopic map is represented as a grayscale grid over Ernest's body. This grid has 9 cells that represent what Ernest touches in the 8 surrounding squares plus the square where he is standing.

Each cell in the somatotopic map can reflect three different kinds of tactile feelings:
- light gray: only water.
- medium gray: something soft, an alga or a fish that Ernest can swim over.
- black: something hard, a wall or the aquarium's side (the central cell is never black because Ernest cannot stand on a wall).
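As an illustration of the list above, the 9-cell map can be sketched as a lookup over a small grid world. The character encoding of the world (" " for water, "a" for alga, "f" for fish, "w" for wall) and the function name are assumptions made for this example; cells outside the grid are treated as the aquarium's side, and Ernest's orientation is ignored.

```python
# Hypothetical encoding of the world's cells and the tactile feeling they produce.
TACTILE = {
    " ": "light gray",   # only water
    "a": "medium gray",  # alga: something soft
    "f": "medium gray",  # fish: something soft
    "w": "black",        # wall: something hard
}


def somatotopic_map(world: list[str], row: int, col: int) -> list[list[str]]:
    """Return the 3x3 tactile map centered on the cell (row, col)."""
    grid = []
    for d_row in (-1, 0, 1):
        line = []
        for d_col in (-1, 0, 1):
            r, c = row + d_row, col + d_col
            inside = 0 <= r < len(world) and 0 <= c < len(world[r])
            # Cells beyond the grid stand for the aquarium's side (hard).
            line.append(TACTILE[world[r][c]] if inside else "black")
        grid.append(line)
    return grid


world = [
    "w  ",
    " f ",
    "a  ",
]
for line in somatotopic_map(world, 1, 1):
    print(line)
```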

We hypothesize that the somatotopic map will be useful for acquiring a sense of space, although we don't yet know exactly how. We drew this hypothesis from the idea that our sense of space comes from the mere fact that our body occupies space. We could not, however, find many arguments in the literature to support this hypothesis.

Apart from his still-unexploited somatotopic map, and from the fact that he is always hungry for fish, Ernest 10.0 is the same as poor Ernest 9.3. This video shows how miserable he is. He has no sense of space and he is even unable to "simplify" a sequence consisting of turning six times 45° clockwise into a sequence consisting of turning twice 45° counterclockwise (see steps 267 or 279).
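The simplification that Ernest misses is plain modular arithmetic on rotations: six turns of 45° clockwise amount to 270° clockwise, which leaves him facing the same way as 90° counterclockwise. A minimal sketch (the function name is hypothetical):

```python
def simplify_turns(clockwise_turns: int, step_deg: int = 45) -> tuple[str, int]:
    """Return the shortest equivalent rotation as (direction, number of turns)."""
    per_circle = 360 // step_deg          # 8 turns of 45 degrees make a full circle
    net = clockwise_turns % per_circle    # net clockwise turns after full circles
    if net == 0:
        return ("none", 0)
    if net <= per_circle // 2:
        return ("clockwise", net)
    return ("counterclockwise", per_circle - net)


print(simplify_turns(6))  # ('counterclockwise', 2)
```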

Tuesday, April 5, 2011

Poor Ernest 9.3

On step 406, we removed the yellow-green landmark that Ernest was using to find his way towards the southeast field. This leaves poor Ernest spinning in place, miserably goalless.

To explore space in the absence of landmarks, Ernest will need some skills to construct a spatial representation of the environment. Such skills are exhibited by most vertebrates and are believed to involve some basic components of the vertebrate's brain. We now need to explore the role of these components.

Friday, April 1, 2011

Ernest 9.3

The colliculus activation algorithm has been improved. This gives Ernest 9.3 a smoother behavior than Ernest 9.2 (e.g., see the elegant curve taken on the way back to the hive on steps 356-388).

In addition, Ernest's thorax now takes the color of the most motivating landmark in the visual field—the landmark that raises the most activation in the colliculus. This landmark is the object of Ernest's current attention and drives Ernest's homing tendency.

At the beginning, Ernest is seeking to gather pollen but he does not know how to distinguish flowers from other landmarks. The highest activation is only given by the largest landmark in the visual field (i.e., the closest landmark). This causes Ernest to visit all landmarks randomly. When a landmark is visited, this landmark's activation is lowered for some time, meaning that Ernest temporarily loses interest in this landmark. This causes Ernest to move on to another landmark.

On step 38, Ernest finds a landmark from which he can gather the pollen (the blue flower). This switches his motivation to make him seek the hive. In the hive-seeking motivational mode, landmarks raise an additional activation that is proportional to their proximity to the hive (or to Ernest's place of birth) as far as Ernest remembers from his way out. This activation mechanism causes Ernest to go back to the yellow square by traveling from known landmark to known landmark. After reaching the yellow square (step 83), he finds no more motivating landmarks and starts a random exploration again until he finds the hive for the first time on step 94.

When on the hive, Ernest drops the pollen and switches back to the pollen-seeking motivational mode. Now, the highest activation is generated by landmarks that Ernest remembers as being the closest to the pollen. This leads Ernest back to the northeast flower field. On the second way back, Ernest is now able to recognize the hive, which causes him to turn directly towards the hive on step 159 (by-passing the yellow square).
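The activation rules described in this post can be sketched as follows. This is a rough illustration only: the field names, the damping factor, the refractory period, and the inverse-distance form of the "proximity" term are all assumptions, not values from the actual colliculus algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Landmark:
    color: str
    apparent_size: float                            # larger means closer in the visual field
    steps_since_visit: Optional[int] = None         # None if never visited
    remembered_steps_to_goal: Optional[int] = None  # None if no memory of the current goal


def activation(lm: Landmark, refractory_steps: int = 20,
               goal_weight: float = 10.0) -> float:
    """Activation raised by a landmark under the assumptions stated above."""
    value = lm.apparent_size
    # A recently visited landmark temporarily loses its appeal (assumed damping).
    if lm.steps_since_visit is not None and lm.steps_since_visit < refractory_steps:
        value *= 0.1
    # Landmarks remembered as close to the current goal raise extra activation.
    if lm.remembered_steps_to_goal is not None:
        value += goal_weight / (1 + lm.remembered_steps_to_goal)
    return value


def most_motivating(landmarks: list[Landmark]) -> Landmark:
    """The landmark that would receive Ernest's attention."""
    return max(landmarks, key=activation)


big_unknown = Landmark("yellow", apparent_size=3.0)
small_near_hive = Landmark("green", apparent_size=1.0, remembered_steps_to_goal=1)
print(most_motivating([big_unknown, small_near_hive]).color)  # green
```

In the example, the smaller landmark wins because it is remembered as being one step away from the goal, which is the behavior described for the hive-seeking mode.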

When the northeast flower field is empty, Ernest fumbles again until he sees new landmarks to explore (green landmarks in the southeast field on step 311). From then on, he starts exploiting the southeast field the same way he exploited the northeast field.

Monday, March 28, 2011

Ernest 9.2 has a superior colliculus

Ernest 9.2 has a visual resolution of 2 rows of 12 pixels. As before, Ernest can see over colored landmarks. The closest landmarks are seen in the first row, and possible landmarks behind these are seen above in the second row. Ernest 9.2's visual angular span equals 180°. As before, Ernest 9.2 cannot see through walls.

In this video, Ernest's half-circular head represents what Ernest sees. The first row is represented inside the half circle. The second row is represented on the half circle's crown. For example, on step 34, the inside of the half circle takes a light-green color because Ernest sees the square where it is standing, and the crown reflects the flower and the other squares that Ernest sees over the light-green square.

The environment is the same as with Ernest 9.1. The hive and the flowers are now represented as icons but Ernest distinguishes them only by their color and taste as before.

The most significant improvement is that Ernest 9.2 has a superior colliculus, a brain region known as the optic tectum in non-mammalian vertebrates. The superior colliculus maintains an internal retinotopic representation of the animal's surrounding environment. The superior colliculus is used to orient motivation and behavior towards a specific direction in the animal's egocentric frame of reference (in some rudimentary vertebrates, like the hagfish, the superior colliculus constitutes the biggest brain region).

The effects of Ernest's colliculus are first seen on step 110. At this point, Ernest was heading toward the light-green square that was already known as leading to the flower field. On step 110, a flower appeared in the right side of the visual field. Because the flower was more motivating than the light-green square, the flower generated more activation in the colliculus's right side than that generated by the light-green square in the colliculus's center. This activation triggered a signal sent to the sequential system that caused Ernest to turn to the right towards the flower.
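The orienting effect described above can be sketched as picking the most activated column of a retinotopic row. This is a simplified illustration: it assumes one activation value per visual column with index 0 on the far left, and the function name and the tolerance parameter are hypothetical.

```python
def orient(activations: list[float], tolerance: float = 1.0) -> str:
    """Suggest an action from a retinotopic row of activation values."""
    best = max(range(len(activations)), key=lambda i: activations[i])
    center = (len(activations) - 1) / 2
    # The most activated column is close enough to the center: keep heading forward.
    if abs(best - center) <= tolerance:
        return "move forward"
    return "turn left" if best < center else "turn right"


# Hypothetical values: a known square in the center, a flower on the right side.
row = [0.0] * 12
row[5] = 2.0   # light-green square, straight ahead
row[10] = 5.0  # flower, more motivating
print(orient(row))  # turn right
```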

Ernest 9.2's initial phase of sensorimotor contingency learning lasts longer than before because of the increased complexity of the visual system. In this video, this initial learning phase roughly goes up to step 100. Once sensorimotor contingencies are learned, the pollen gathering is faster and steadier than with Ernest 9.1. Ernest 9.2 finishes gathering the five flowers on step 414, whereas Ernest 9.1 took 598 steps in the previous example run.

Tuesday, March 1, 2011

Ernest 9.1 gathers pollen into the hive

Ernest 9.1 is similar to Ernest 9.0 but the bee has some new ways to interact with singularities in the environment. As before, she must visit a singularity to know what possibilities of interaction this singularity offers to her.

The violet square now represents the nest where she would gather the pollen. Colored squares are low landmarks that she can fly over, except the turquoise square that is the wall corner into which she would bump. As before, blue squares are flowers from which she can collect pollen.

Steps 0-60: initial phase of sensorimotor contingency learning (as discussed before). She finds the nest on step 17, then continues exploring.
Step 98: she finds the first pollen.
Steps 99-163: she fumbles back to the nest.
Step 163: she drops the pollen into the nest.
Steps 165-275: a second gathering cycle where she is still fumbling on her way back to the nest.
Steps 278-363: the third gathering cycle. This time, she finds the direct way back to the nest.
Steps 364-600: she explores the second flower field and adapts her way back to the nest.

Monday, February 21, 2011

Ernest 9.0 can roam in flower fields

Ernest 9.0 is a honey-bee. She likes roaming toward flashy colors in her environment and she uses colored spots as landmarks for navigation.

She is alternately in two different states: hungry or thirsty. When she is hungry, she has a violet thorax; when she is thirsty, she has a blue thorax. She does not know a priori what landmark can be eaten or drunk so she tries them all.

In this experiment, we can see a first learning phase (steps 0-150) where she learns sensorimotor contingencies by following her intrinsic motivation to visit landmarks (as before and discussed here). While doing so, she also learns that only blue landmarks can be drunk. She has an additional taste sense that informs her if the place where she is standing can be eaten or drunk. After step 150, she begins roaming more efficiently from landmark to landmark until she ends up on the violet landmarks and discovers that this can be eaten (steps 150-380).

She associates landmarks with the fulfillment of specific needs. For example, when she is thirsty, she memorizes roughly how long she travels from each specific landmark to a place where she drinks. When she gets thirsty again, she navigates preferably toward the landmarks that are the closest to the drinking area as she remembers.

This prompts her to go back to the drinking area even though the blue square is hidden behind the wall (steps 380-430). Similarly, she goes back to the eating area after drinking. Distance values are updated over time so the navigation improves based on experience and on environmental regularities (if we introduce blue and violet squares without respecting a drinking and an eating area, she only finds them through random browsing).
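The landmark memory described in this post can be sketched as a table of estimated travel times that is refined with each new trip. The class name, the learning rate, and the running-average update rule are assumptions made for this illustration; the post only states that distance values are updated over time.

```python
from typing import Optional


class LandmarkDistances:
    """Remembers roughly how many steps separate each landmark from a satisfied need."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.rate = learning_rate
        self.steps_to: dict[tuple[str, str], float] = {}

    def update(self, landmark: str, need: str, observed_steps: int) -> None:
        """Blend a newly observed travel time into the remembered estimate."""
        key = (landmark, need)
        old = self.steps_to.get(key)
        if old is None:
            self.steps_to[key] = float(observed_steps)
        else:
            self.steps_to[key] = old + self.rate * (observed_steps - old)

    def preferred(self, visible: list[str], need: str) -> Optional[str]:
        """Among visible landmarks, pick the one remembered as closest to the need."""
        known = [lm for lm in visible if (lm, need) in self.steps_to]
        if not known:
            return None
        return min(known, key=lambda lm: self.steps_to[(lm, need)])


memory = LandmarkDistances()
memory.update("green", "thirsty", 10)
memory.update("green", "thirsty", 6)   # estimate becomes 8.0
memory.update("yellow", "thirsty", 3)
print(memory.preferred(["green", "yellow", "red"], "thirsty"))  # yellow
```

With such a memory, a landmark can guide her toward the drinking area even when the blue square itself is out of sight.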

To obtain these behaviors, we improved both the visual system and the motivational system in a tightly intertwined way, as we will report next.

Thursday, February 10, 2011

Ernest 8.4

Ernest 8.4 is the same as Ernest 8.3 except that Ernest 8.4's eyes can distinguish between various colors. Particularly, Ernest 8.4 can distinguish between two kinds of targets: blue targets and violet targets.

Ernest 8.4's eyes can also distinguish between singularities in the constitution of walls: orange bricks and yellow bricks. So far, however, Ernest's motivational system does not exploit these distinctions. Bricks leave Ernest 8.4 totally indifferent.

The next step will consist of making Ernest's motivations depend on internal states. For example, Ernest would pursue blue targets when he is "thirsty" and violet targets when he is "hungry". Moreover, we would like Ernest to learn to use environmental singularities as landmarks to navigate toward the desired target. Our idea is to make Ernest autonomously acquire new motivations to navigate toward landmarks that he has associated with desired targets.

Wednesday, February 9, 2011

Ernest 8.3 The Ernestor

Ernest 8.3 is the same as Ernest 8.2; only the interface with his environment has changed.

Ernest 8.3's turning actions make him turn by PI/4 rather than PI/2 as before. Accordingly, Ernest 8.3 can now move forward diagonally. Also, Ernest 8.3's eyes have a narrower angular span of PI/8 each (rather than PI/2 with Ernest 8.2).
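With turns of PI/4, there are eight possible headings on the grid, four of which are diagonal. A minimal sketch of the resulting forward displacement, assuming headings are counted in multiples of PI/4 from the positive x axis (the function name and this convention are hypothetical):

```python
import math


def forward_offset(heading_index: int) -> tuple[int, int]:
    """Grid displacement (dx, dy) of one step forward for a heading of heading_index * PI/4."""
    angle = heading_index * math.pi / 4
    # Rounding maps the unit vector onto the nearest of the 8 neighboring cells.
    return (round(math.cos(angle)), round(math.sin(angle)))


for index in range(8):
    print(index, forward_offset(index))
```

Odd heading indices give diagonal moves such as (1, 1), which were not reachable with turns of PI/2.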

These settings require a longer learning phase than before because of the topological differences between diagonals and straight lines, and because of the reduced visual field that implies more complex behaviors to find the targets. The fact that the same Ernest algorithm can learn to deal with these different settings demonstrates again the algorithm's robustness.

This example video shows that Ernest is now a pretty serious predator in the grid world. We call him Ernestor-Rex or e-Rex. As opposed to poor Ernest 8.2, the Ernestor-Rex does not get trapped in infinite loops between prey (see step 330 and further). This is because his narrow visual field makes him take care of a single prey at a time.

Friday, February 4, 2011

Poor Ernest 8.2

In this example, Ernest found another strategy that consisted of moving on a straight line while systematically checking on his side to see if he became aligned with the blue square.

On step 120, we created a situation where the blue square would get hidden by a wall when Ernest would enact this strategy. When he saw the blue square, Ernest started moving toward it, but then he arrived at a point where the blue square became hidden behind the wall. This situation illustrates again that Ernest is not driven by a final goal but by rudimentary intrinsic motivations. When the blue square attraction disappears, Ernest just stops and spins in place (motivated to look for a new blue square).

On step 230, we inserted two blue squares. In this particular instance, Ernest got locked in an infinite loop between the two blue squares. Again, Ernest's behavior fits the subjective explanation that he just enjoys moving toward blue squares, which he can keep doing continuously in this specific loop.

Yet, we would like Ernest to be smarter and use a bit more determination to find blue squares. The next step might be learning to recognize specific locations in space. Learning persistence of spatial locations might be an interesting prerequisite before learning persistence of objects. To manage to recognize specific locations, Ernest will need a better visual system.

Friday, January 7, 2011

The tangential strategy learning process

To get a better view on how Ernest learned the tangential strategy, let us examine his activity trace:

1 2(> |+) 3(> |+) 4(> |+) 5(> |+) 6(> |+) 7(> |o) 8(v |*) 9(v*|o) 10(>+| ) 11(^ |*) 12(^o| ) 13(^ |o) 14(^*| ) 15(>o| ) 16(>) 17(>) 18(^*| ) 19(vo| ) 20(v) 21(^) 22(>) 23(^*| ) 24(^o|*) 25(^ |o) 26(^) 27(v) 28(v |*) 29(^ |o) 30(^) 31(>) 32(^*| ) 33(^o|*) 34(v*|o) 35(>+| ) 36(^o|*) 37(v*|o) 38(>+| ) 39(^ |*) 40(v |o) 41(>o| ) 42(>) 43(^*| ) 44(^o|*) 45(> |+) 46(> |+) 47(> |o) 48(v |*) 49(v*|o) 50(>+| ) 51(^ |*) 52(>+|+) 53(v |o) 54(vo| ) 55(v |*) 56(v*| ) 57(v |o) 58(^ |*) 59(>+|+) 60(>+|+) 61(>+|+) 62(>x|x) 63(>o|o) 64(v) 65(v) 66(v) 67[v] 68(^) 69[v] 70(v) 71(v) 72(v) 73[v] 74[>] 75(^*| ) 76(^o|*) 77(> |+) 78(> |+) 79(> |+) 80(> |+) 81(> |+) 82(> |o) 83(v |*) 84(v*|o) 85(>+| ) 86(^ |*) 87(>+|+) 88(>x|x) 89(^o|o)

In this trace, the numbers indicate the cycle counter also displayed in the bottom-right corner of the video. The symbols that represent Ernest’s primitive actions read as follows: ^ turn left, > try to move forward, v turn right. These are within parentheses when they succeed and within square brackets when they fail. For example, Ernest turned toward an adjacent wall on step 73 and bumped into a wall on step 74. The turns on steps 67 and 69 are also marked as failed; in all other steps in this trace, primitive schemas succeeded.

The symbols that represent the eye signals read as follows: * appear, + closer, x arrived, o disappear. These symbols are represented on each side of a | character, the left eye signal being on the left and the right eye signal on the right. For example, on step 9, Ernest turned right, the blue square appeared in the left eye’s field and disappeared from the right eye’s field. On step 10, the blue square got closer in the left field and nothing changed in the right field. On step 11, the blue square appeared in the right field and nothing changed in the left field, meaning that the blue square was then present in both eyes’ fields.
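The notation above is regular enough to be read by a program. The following is a minimal sketch of a parser for such traces; the function name `parse_trace` and the dictionary layout are our own illustrative choices and are not part of Ernest's code.

```python
import re

# Symbols as described in the text above.
ACTIONS = {"^": "turn left", ">": "move forward", "v": "turn right"}
SIGNALS = {"*": "appear", "+": "closer", "x": "arrived", "o": "disappear", " ": None}

# A token is: step number, "(" (success) or "[" (failure), an action symbol,
# an optional "left|right" pair of eye signals, and a closing bracket.
TOKEN = re.compile(r"(\d+)([(\[])([\^>v])(?:([*+xo ])\|([*+xo ]))?[)\]]")

def parse_trace(trace):
    """Turn a trace string into a list of step dictionaries (hypothetical layout)."""
    steps = []
    for match in TOKEN.finditer(trace):
        number, bracket, action, left, right = match.groups()
        steps.append({
            "step": int(number),
            "action": ACTIONS[action],
            "succeeded": bracket == "(",
            # Steps written without a "|" carry no eye signal at all.
            "left_eye": SIGNALS[left or " "],
            "right_eye": SIGNALS[right or " "],
        })
    return steps

if __name__ == "__main__":
    for step in parse_trace("9(v*|o) 10(>+| ) 11(^ |*) 73[v] 74[>]"):
        print(step)
```

Applied to steps 9 through 11, the parser recovers the reading given above: a right turn with appear on the left and disappear on the right, then a forward move with closer on the left, then a left turn with appear on the right.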

The first interesting (safe and satisfying) sequence was found right at the beginning, when Ernest moved forward and got closer in the context where he had just moved forward and gotten closer. This experience made him repeat this sequence from step 2 until step 7, when he received a disappear signal from the right eye.

From step 7 to step 11, Ernest learned the returning sequence. Step 7: move forward, disappear on right. Step 8: turn right, appear on right. Step 9: turn right, appear on left, disappear on right. Step 10: move forward, closer on left. Step 11: turn left, appear on right. After step 11, Ernest was facing the blue square but did not yet know to move forward in this category of context, and he randomly picked a turn action.

On steps 47 through 51, Ernest again enacted the returning sequence because it had proven to work and to be satisfying in the category of context where he found himself again. On step 52, he chose to move forward (other options had already proven uninteresting in the current category of context), obtaining a closer signal from both eyes. On step 53, however, he did not yet know to continue moving forward in the current category of context and he randomly picked turn right.

On step 59, he again got closer in both eyes when moving forward (although following a different preceding sequence). In this context, he again picked move forward on step 60, which proved satisfying and encouraged him to continue on step 61 until he stepped on the blue square on step 62.

When the second blue square is introduced on step 75, he has thus already learned to enact the different subsequences needed for the tangential strategy, as well as to categorize contexts accordingly. In effect, he uses these different subsequences in the right way until he reaches the second square on step 88.

This quick learning was somewhat lucky, but we chose to report it because it led to a clean example of the tangential strategy. In other runs, Ernest may learn mixed strategies that are less prototypical. This run was, however, not so extraordinarily lucky, because behaviors are not picked randomly but rather always exploit what has been learned thus far. Chance is only used to break ties between conflicting impulses when previous knowledge cannot settle them.
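The role of chance described above can be illustrated with a short sketch. It assumes that each candidate action comes with a numeric weight summarizing what has been learned so far; the function `pick_action` and the weights are hypothetical and only show the tie-breaking idea, not Ernest's actual selection mechanism.

```python
import random

def pick_action(proposals, rng=random):
    """Pick the best-supported action; use chance only to break exact ties.

    `proposals` maps each candidate action to a weight (an assumed stand-in
    for the impulses derived from previous experience).
    """
    best = max(proposals.values())
    candidates = [action for action, weight in proposals.items() if weight == best]
    # With a single best candidate, rng.choice has nothing to decide.
    return rng.choice(candidates)

if __name__ == "__main__":
    # Learned knowledge settles the choice: no randomness involved.
    print(pick_action({"move forward": 20, "turn left": 0, "turn right": 0}))
    # Nothing learned yet for this context: the pick is random.
    print(pick_action({"move forward": 0, "turn left": 0, "turn right": 0}))
```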

Experience shows that Ernest always learns a strategy within the first hundred steps, and that the most frequently found strategy is the diagonal strategy.

The tangential strategy

In this example video, Ernest 8.2 found a strategy that we named the tangential strategy.

The tangential strategy consists of approaching the blue square in a straight line as opposed to a diagonal line (the diagonal strategy in the previous example). The trick with the tangential strategy is that Ernest cannot know when he should turn toward the blue square until he has passed it. The tangential strategy thus consists of moving in a straight line until the blue square disappears from the visual field, then going back one step, and then turning toward the blue square.
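To make the description concrete, here is an observer's sketch of the tangential strategy written as a fixed rule. Ernest does not contain anything like this: he learns the behavior, and no strategy is encoded in him. The class name `TangentialStrategy` is hypothetical, the returning sequence is the one visible in the activity trace when the square disappears on the right, and the mirrored sequence for the left side is an assumption.

```python
from collections import deque

# Returning sequence seen in the trace after "disappear" on the right eye.
RETURN_RIGHT = ["turn right", "turn right", "move forward", "turn left"]
# Assumption: the left-hand case is the mirror image.
MIRROR = {"turn right": "turn left", "turn left": "turn right",
          "move forward": "move forward"}

class TangentialStrategy:
    """Hand-written description of the behavior, for illustration only."""

    def __init__(self):
        self.pending = deque()  # remaining actions of a returning sequence

    def next_action(self, left, right):
        """Choose the next action from the eye signals of the previous step."""
        if not self.pending:
            if right == "disappear" and left is None:
                self.pending.extend(RETURN_RIGHT)
            elif left == "disappear" and right is None:
                self.pending.extend(MIRROR[a] for a in RETURN_RIGHT)
        if self.pending:
            return self.pending.popleft()
        return "move forward"  # default: keep going straight

if __name__ == "__main__":
    # Eye signals (left, right) of steps 77 to 87 of the trace.
    signals = [(None, "closer")] * 5 + [
        (None, "disappear"), (None, "appear"), ("appear", "disappear"),
        ("closer", None), (None, "appear"), ("closer", "closer"),
    ]
    strategy = TangentialStrategy()
    print([strategy.next_action(left, right) for left, right in signals])
```

Fed with the eye signals of steps 77 to 87, this rule reproduces the actions of steps 78 to 88 in the trace.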

The emergence of a specific strategy occurs during Ernest's youth while he is babbling relatively randomly, in parallel to the emergence of goals. See the details in the next post. When Ernest has organized behavioral patterns that proved both satisfying and robust, he adopts them and sticks to them as long as they work.

These results demonstrate that:
a) Ernest does not encode strategies nor task procedures defined by the programmer, as opposed to traditional cognitive modeling.
b) Ernest instances are capable of "individuating" themselves through their experience, i.e., acquiring their own cognitive individuality that was not encoded in their "genes". This accounts for the role of individual experience in cognitive development.
c) Ernest's goals emerge from his low-level drives. Eating blue squares appears to the observer as becoming the goal of Ernest's life while no representation of such a goal was encoded into Ernest. Indeed, Ernest was given a high incentive to step on blue squares but this incentive was not different in nature from other primitive drives. Ernest's goals were not pre-encoded as they are, for example, in the goal buffer of the ACT-R architecture.

Wednesday, January 5, 2011

Ernest 8.2 can find his food

Ernest 8.2 is a horseshoe crab. Horseshoe crabs are archaic arthropods whose visual system has been extensively studied. From these studies, we pulled several principles that guided the development of Ernest's distal sensory system:

- Small matrix resolution: the horseshoe crab's most elaborate eyes (two compound eyes among the 10 eyes that horseshoe crabs possess) have a resolution of roughly 40*25 pixels.
- Fixed eyes: eyes are fixed to the animal's body. The animal has to rotate its full body to move its visual field.
- Sensitivity to movement: the signal sent to the brain does not reflect static shape recognition but rather reflects changes in the visual field.
- Visio-spatial behavioral proclivity: male horseshoe crabs move toward females when they see them with their compound eyes whereas females move away from other females.

As noted earlier, each of Ernest's "eyes" has only one pixel, sensitive to the distance to the blue square in a 90° visual field (assuming there is only one blue square).

Each eye produces a signal that represents the change in the corresponding visual field during the last interaction cycle:
- Appear: a blue square appeared in the visual field.
- Closer: more blue in the visual field, meaning the blue square is approaching.
- Arrived: the blue square occupies the entire visual field, meaning Ernest is stepping on the blue square and can eat it.
- Disappear: the blue square disappeared from the visual field.
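The four signals above can be sketched as a function of what the eye measured before and after an interaction cycle. This is only an illustration: it assumes the one-pixel eye reports a distance to the blue square (or nothing when the square is out of the field) and that a distance of 0 means the square fills the field. The name `eye_signal` is hypothetical.

```python
def eye_signal(previous, current):
    """Return the change signal for one eye over one interaction cycle.

    `previous` and `current` are assumed distances to the blue square in this
    eye's field, or None when the square is not visible.
    """
    if current is None:
        # The square is not visible now: it either just left the field or
        # was never there.
        return "disappear" if previous is not None else None
    if current == 0:
        return "arrived"  # the square occupies the entire field
    if previous is None:
        return "appear"   # the square just entered the field
    if current < previous:
        return "closer"   # more blue in the field than before
    return None           # no signal is defined for the remaining cases

if __name__ == "__main__":
    for before, after in [(None, 3), (3, 2), (1, 0), (2, None), (2, 2)]:
        print(before, after, eye_signal(before, after))
```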

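As opposed to previous versions, Ernest 8.2 has no antenna and has only three possible primitive behaviors (move forward, turn left, turn right). Each can succeed or fail, with the following satisfaction values: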
- [move forward, succeed, 0] Ernest is indifferent to moving forward.
- [move forward, fail, -8] Ernest hates bumping walls.
- [turn left or right, succeed, 0] Ernest is indifferent to turning toward an adjacent empty square.
- [turn left or right, fail, -5] Ernest dislikes turning toward an adjacent wall.

To generate a visio-spatial behavioral proclivity, Ernest's sequential learning mechanism receives an additional inborn intrinsic satisfaction when an eye returns a signal:
- [Appear, 15] Ernest loves blue squares appearing in an eye's visual field.
- [Closer, 10] Ernest enjoys blue squares getting closer.
- [Arrived, 30] Ernest is crazy about stepping on a blue square (and eating it in the process).
- [Disappear, -15] Ernest hates blue squares disappearing from an eye's visual field.
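The values listed above can be combined into the satisfaction of a single interaction. The sketch below assumes that satisfactions simply add up and that each eye contributes its own signal separately; both points are our reading of "additional" and "when an eye returns a signal", not a statement about Ernest's code. The function name `satisfaction` is hypothetical.

```python
# Values copied from the two lists above.
ACTION_SATISFACTION = {
    ("move forward", True): 0,
    ("move forward", False): -8,
    ("turn", True): 0,
    ("turn", False): -5,
}
SIGNAL_SATISFACTION = {"appear": 15, "closer": 10, "arrived": 30, "disappear": -15}

def satisfaction(action, succeeded, left=None, right=None):
    """Total satisfaction of one interaction, assuming values are additive."""
    kind = "turn" if action in ("turn left", "turn right") else action
    total = ACTION_SATISFACTION[(kind, succeeded)]
    for signal in (left, right):
        # An eye that returns no signal (None) contributes nothing.
        total += SIGNAL_SATISFACTION.get(signal, 0)
    return total

if __name__ == "__main__":
    print(satisfaction("move forward", True, right="closer"))
    print(satisfaction("turn right", True, left="appear", right="disappear"))
    print(satisfaction("move forward", False))
```

Under these assumptions, moving forward and getting closer in one eye is worth 10, a turn that trades a disappear in one eye for an appear in the other is neutral, and bumping into a wall costs 8.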

At the beginning, the video shows Ernest learning to coordinate his actions with his sensory input. As before, he needs to learn to generate expectations associated with actions (e.g., turning schemas may shift the blue square from one eye to the other). He also needs to learn sequences of behavior (or "strategies") to reach the blue square. In this example run, he learned a strategy consisting of following a diagonal, and subsequently a straight line. Other strategies are possible, which we will report next.