Olivier Georgeon's research blog—also known as the story of little Ernest, the developmental agent.

Keywords: situated cognition, constructivist learning, intrinsic motivation, bottom-up self-programming, individuation, theory of enaction, developmental learning, artificial sense-making, biologically inspired cognitive architectures, agnostic agents (without ontological assumptions about the environment).

Monday, November 24, 2008

Ernest 4.1's abstraction

This figure illustrates how Ernest learns knowledge from his activity and, in parallel, how this knowledge helps him better control his activity. It shows the same trace as in the Ernest 4.1 video.

The raw activity is represented at the bottom. It is an alternation of A or B done by Ernest and X or Y returned by the environment.

The ascending blue arrows represent the construction of more abstract items.

The first abstraction level is called "Acts". An act corresponds to one Ernest-Environment cycle. A new type of act is constructed when a new Ernest-Environment combination is encountered.
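
As a rough illustration of this first level, the sketch below registers a new act type whenever an unseen (action, response) pair shows up. The class name `ActRegistry`, the method `observe`, and the numbering of acts are assumptions made for this example, not Ernest's actual code.

```python
from typing import Dict, Tuple


class ActRegistry:
    """Hypothetical registry: one act type per distinct (action, response) pair."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], int] = {}

    def observe(self, action: str, response: str) -> int:
        """Return the act id for this cycle, creating a new id if the pair is new."""
        key = (action, response)
        if key not in self._ids:
            # A new Ernest-Environment combination: construct a new act type.
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]


registry = ActRegistry()
# An arbitrary raw trace: Ernest's A or B, followed by the environment's X or Y.
trace = [("A", "X"), ("B", "Y"), ("A", "X"), ("A", "Y")]
act_ids = [registry.observe(action, response) for action, response in trace]
```

The third cycle repeats the first combination, so it reuses the existing act type instead of creating a new one.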

The second abstraction level is "primary schemas". A primary schema is made up of two acts: the context and the action. The action act can be seen as a raw action associated with a raw expectation. The context act triggers the schema, and the schema tries to control the action act, but sometimes it fails. It is only at the end of the action act that the actually enacted primary schema is completely known. Hence, there is a tight coupling between these two levels, which is represented by dashed gray double arrows: Trigger/Control.

The third abstraction level is "secondary schemas". Secondary schemas are made up of three primary schemas: context, action and expectation. The context primary schema triggers the secondary schema, and the secondary schema tries to control the action primary schema. In this environment, secondary schemas always succeed, so Ernest becomes "in control" of his activity when secondary schemas start to be enacted.
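
The three levels can be pictured as nested records. This is only a sketch of the structure described above: the class names, the field names and the helper `enacted_primary` are assumptions chosen for illustration, and the helper shows why the enacted primary schema is only known once the environment has answered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    action: str    # "A" or "B", done by Ernest
    response: str  # "X" or "Y", returned by the environment


@dataclass(frozen=True)
class PrimarySchema:
    context: Act  # the act that triggers the schema
    action: Act   # a raw action together with a raw expectation


@dataclass(frozen=True)
class SecondarySchema:
    context: PrimarySchema      # triggers the secondary schema
    action: PrimarySchema       # the primary schema it tries to control
    expectation: PrimarySchema  # the primary schema expected afterwards


def enacted_primary(intended: PrimarySchema, actual_response: str) -> PrimarySchema:
    """Build the primary schema that was actually enacted, given the real response."""
    return PrimarySchema(intended.context, Act(intended.action.action, actual_response))


intended = PrimarySchema(context=Act("A", "X"), action=Act("B", "Y"))
succeeded = enacted_primary(intended, "Y")  # the expectation was met
failed = enacted_primary(intended, "X")     # control failed: a different schema was enacted
```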

Sunday, November 23, 2008

Ernest4.1's viewpoint

The cyclic representation gives a good idea of Ernest's implementation, but a poor idea of how Ernest learns and controls his activity.

Moreover, the cyclic representation gives the impression that Ernest's architecture falls into the classical "perception -> cognition -> action -> perception" cycle, which is not really the case. I don't think that the "Assess environment's response" phase can be understood as a perception phase, nor that the "Execute selected schema step" phase can be seen as an action phase. The phases between them do not correspond to classical cognitive problem solving either.

Obviously, at this point, Ernest has not yet constructed the idea that an external world exists outside himself. From Ernest's viewpoint, perception and action do not yet make any sense. Thus, the only psychological attribute we can grant him is what philosophers would call a "phenomenological experience", which is a flow of phenomena that he experiences.

I think an unfolded timeline represents this phenomenological experience better than the cyclic representation, because it better shows the interweaving of abstraction, learning and control.

Ernest4.1's cycle

Construct context: consists of structuring the current context to prepare the schema construction and the schema selection. The context is made up of the three previously enacted schemas, stored in short-term memory. These schemas can be of any level, and can refer to subschemas. This phase indexes these different levels.

Construct new schemas: This phase consists of creating new schemas that match the current context. If they do not yet exist, these new schemas are added to long-term memory. They constitute hypotheses about how to deal with a new context, but they still need to be tested.
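
A minimal sketch of this phase follows. It assumes that the candidate schemas for a context are the four combinations of an action (A or B) and an expected response (X or Y); this is a guess suggested by the four initial primary schemas mentioned in the Ernest 4.1 post, not something stated explicitly. The function name, the tuple layout and the initial weight of 0 are also assumptions.

```python
from itertools import product
from typing import Dict, Tuple

# Hypothetical schema layout: (context label, action, expected response)
Schema = Tuple[str, str, str]


def construct_schemas(context: str, long_term: Dict[Schema, int]) -> int:
    """Add the candidate schemas missing for this context; return how many were added."""
    added = 0
    for action, expectation in product("AB", "XY"):
        schema = (context, action, expectation)
        if schema not in long_term:
            long_term[schema] = 0  # assumed starting weight for an untested hypothesis
            added += 1
    return added


memory: Dict[Schema, int] = {}
first = construct_schemas("nil", memory)   # new context: candidates are created
second = construct_schemas("nil", memory)  # same context again: nothing to add
```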

Select a schema / First step: consists of selecting a schema to be executed in this context. High-level schemas add weight to their subschemas. Weights are positive if they lead to Y and negative if they lead to X. Schemas of any level compete, and the one with the highest weight is selected. If several schemas are tied, one of them is picked at random. This phase initializes the selected schema at its first step.
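
The competition itself can be sketched in a few lines. The function `select_schema` and the dictionary of weights are hypothetical; the sketch covers only the final choice (highest weight, random pick among ties) and leaves out how high-level schemas propagate weight to their subschemas.

```python
import random
from typing import Dict, Optional


def select_schema(weights: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Pick the schema with the highest weight, breaking ties at random."""
    rng = rng or random.Random()
    best = max(weights.values())
    # Sort the tied names so that a seeded generator gives a reproducible pick.
    tied = sorted(name for name, weight in weights.items() if weight == best)
    return rng.choice(tied)


clear_winner = select_schema({"S6": -1, "S7": 2, "S8": 0})
tie_pick = select_schema({"S6": 1, "S7": 1, "S8": -3}, random.Random(0))
```

Passing a seeded `random.Random` is only a convenience for reproducing a trace; by default the pick is not seeded.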

Execute schema step: sends the selected action defined in the current schema step to the environment: A or B.

Environment: computes the response from the environment and sends it to Ernest: Y or X. The environment has its own memory and cycle.

Assess environment's response: checks if the schema has succeeded or failed. If the current subschema has succeeded but it is not the last step of the selected schema, then the "ongoing" loop is selected.

Next step / subschema: selects the next step in the subschema hierarchy of the selected schema.

Memorize / reinforce schema: when the schema ends, if it has succeeded, it is recorded as the last enacted schema in short-term memory. The previous two are shifted, and the oldest of the three is dropped out of short-term memory.
If the schema has failed at some point, then the actually enacted schema is stored in short-term memory and reinforced in long-term memory. For example, if a primary schema expecting Y has been selected but the environment actually returned X, then the same schema with an expectation of X is what is memorized and reinforced.
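
The memorization step can be sketched with a three-slot queue. Everything here is illustrative: schemas are reduced to (action, expected response) pairs, the function name `memorize` is made up, and reinforcing by adding 1 to a weight, in the success case as well as the failure case, is an assumption.

```python
from collections import deque
from typing import Deque, Dict, Tuple

# Hypothetical reduced schema: (action, expected response); the context is omitted.
Schema = Tuple[str, str]


def memorize(selected: Schema, response: str,
             short_term: Deque[Schema], weights: Dict[Schema, int]) -> Schema:
    """Store and reinforce the schema that was actually enacted."""
    action, _expected = selected
    enacted = (action, response)  # equals `selected` when the schema succeeded
    short_term.append(enacted)    # with maxlen=3, the oldest entry is dropped
    weights[enacted] = weights.get(enacted, 0) + 1
    return enacted


short_term: Deque[Schema] = deque([("A", "X"), ("B", "Y"), ("A", "Y")], maxlen=3)
weights: Dict[Schema, int] = {}
# A schema expecting Y was selected, but the environment returned X.
enacted = memorize(("B", "Y"), "X", short_term, weights)
```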

Trace: is just used to generate the trace of this cycle and to clear the temporary variables.

Monday, November 17, 2008

Ernest 4.1

video
Like Ernest 4.0, Ernest 4.1 learns and exploits second-order schemas to succeed in the "aA..bB..aA..aA" environment.

The way schemas are constructed and enacted has, however, been slightly modified.

At the beginning of each round, Ernest 4.1 evaluates the context, and if the context is new, he constructs potentially interesting schemas that match this context. For example, the initial context is made up of "nils" because Ernest has no initial experience, but this context is structured in the form of a previous act A2 and a previous schema S5 (first line). An act is a pair (Ernest's action, Environment's response). From this context, four initial primary schemas are constructed: S6, S7, S8, S9. Finally, one schema is enacted: in this example, S9. Like Ernest 4.0, if the schema gets X, Ernest 4.1 says O-o, and if it gets Y, he says Yee!

The context is the content of Ernest's short-term memory; it corresponds to the three previously-enacted schemas. For example, in this trace, the third context contains S5, S9 and S12. These schemas can be of any level of abstraction. For convenience, the third schema is expanded: S12=(A3,A4). A3 is S12's context and A4 is S12's act. A4 is expanded: A4=(B,Y), meaning that the last thing Ernest did was B and he got Y from the environment.

Like Ernest 4.0, Ernest 4.1 constructs a secondary schema each time a primary schema succeeds. This secondary schema is made up of the three previously-enacted primary schemas. That also corresponds to the context stored in short-term memory.

Later, when a new context matches a secondary schema, this secondary schema is enacted if it has the highest weight amongst all the schemas matching this context, at any level.

Contrary to Ernest 4.0, Ernest 4.1 does not trace the details of the secondary schema enaction. A secondary schema enaction is displayed as a single line in the trace, even though it actually takes two rounds. Moreover, only the secondary schema is reinforced and stored in short-term memory, not the primary schemas that are part of it. Thus, when a secondary schema is enacted, short-term memory is not filled with lower-level details. This mechanism allows Ernest to construct tertiary schemas.

At the end of this trace, we can see the construction of tertiary schemas, made up of the three previously-enacted primary or secondary schemas. In this environment, however, these tertiary schemas are not used because there is no reason for them to receive more reinforcement than the secondary schemas, and anyway, Ernest 4.1 is not yet able to recursively manage nested schemas.

It is interesting to notice that, as this trace only displays the highest-level schema enaction, it can again be understood as a description of Ernest's viewpoint on his activity. It is as if Ernest were always focusing on the highest level of control, and automatically performing lower-level behavior without having to pay attention to it.

Thursday, November 6, 2008

Ernest 4.0

video
Ernest 4.0 can learn and exploit second-order schemas in a way that lets him solve the "aA..bB..aA..aA" problem.

This trace is very similar to Ernest3.2's, except that it shows the recalls of second-order schemas (orange lines). A recall happens when there exists a second-order schema whose context is equal to the primary schema that has just been enacted.

For example, the first recalled second-order schema is S14 (a few lines after the first screen of this video). S14 is recalled because its context schema S8 is equal to the primary schema that was enacted just before. When recalled, S14 is enacted, forcing its action schema S10 to be enacted, despite the fact that S10 expects X. We can think that Ernest is not so happy to get this X, but at least, it is what he expected. He expresses his "resignation" by "Arf" (in grey). Then, at the next round, Ernest can enact S12 as expected by S14, and he gets Y.
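
The recall condition can be sketched as a lookup on the context element. The function `recall` and the dictionary layout are hypothetical, and returning the first match is a simplification made for this example; the schema S14=(S8, S10, S12) is the one from the example above.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: (context schema, action schema, expected schema)
SecondOrder = Tuple[str, str, str]


def recall(last_enacted: str, schemas: Dict[str, SecondOrder]) -> Optional[str]:
    """Return a second-order schema whose context equals the schema just enacted."""
    for name, (context, _action, _expectation) in schemas.items():
        if context == last_enacted:
            return name  # simplification: the first match wins
    return None


second_order = {"S14": ("S8", "S10", "S12")}
recalled = recall("S8", second_order)    # S8 was just enacted, so S14 is recalled
not_recalled = recall("S6", second_order)  # no second-order schema has S6 as context
```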

We can see that Ernest finally finds the regularity "aAbBaAbB" that gets him a Y every second round, which is the best he can get in this environment.

Ernest can now find regularities that are twice as long as his short-term memory. This is possible because he aggregates sub-regularities into primary schemas that can be referenced as single items in short-term memory. These items are affordance representations that Ernest can manipulate in short-term memory. In that sense, these representations constitute a first level of abstraction from Ernest's viewpoint.

Monday, November 3, 2008

Ernest 3.2

video
Ernest 3.2 can now learn second-order schemas, i.e. schemas of primary schemas. However, he does not yet know how to use them.

In this trace, each line, except the yellow ones, represents one Ernest-Environment cycle. We can understand these lines as "affordances", that is, situations that give rise to some behavior. The weights that triggered this behavior are displayed in orange, and the resulting primary schemas are in blue. Each of these affordances has an assessment from Ernest's viewpoint: "Yee!", "Boo.", "A-a!" or "O-o", as explained in a previous post.

Like primary schemas, second-order schemas are triples: (context, action, expectation), but now, each of these three elements is an affordance. When an affordance is assessed "Yee!" or "A-a!", it triggers the learning of a second-order schema (in yellow), made up of the two previously-enacted affordances, appended to the triggering one.

For example, at the beginning of this trace, the second-order schema S12 is made up of schemas S6, S8, and S10: S6 is S12's context affordance, S8 is its current affordance, and S10 is its expected affordance. That means that when Ernest encounters a situation where S6 has been enacted, he should enact S8 because that should bring him to a situation where S10 can be enacted, and S10 will bring one of those delicious Ys!

Implementing the exploitation of these second-order schemas, however, still raises many questions. How should primary schemas and second-order schemas compete to trigger behavior? How should reinforcement be distributed between these two levels? What should happen when a second-order schema fails? In addition, second-order schemas should also constitute more abstract affordances that could be taken as higher-level schema elements.
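
The learning rule can be sketched as follows. The function name, the two-slot history and the assessments given to S6 and S8 are placeholders chosen for illustration; only the final triple (S6, S8, S10) comes from the example above.

```python
from collections import deque
from typing import Deque, Optional, Tuple

# Hypothetical layout: (context affordance, current affordance, expected affordance)
SecondOrder = Tuple[str, str, str]
TRIGGERS = {"Yee!", "A-a!"}  # assessments that trigger second-order learning


def learn_second_order(history: Deque[str], enacted: str,
                       assessment: str) -> Optional[SecondOrder]:
    """Return a new second-order schema when the assessment triggers learning."""
    learned = None
    if assessment in TRIGGERS and len(history) == 2:
        # The two previously-enacted affordances, followed by the triggering one.
        learned = (history[0], history[1], enacted)
    history.append(enacted)  # with maxlen=2, the oldest affordance is dropped
    return learned


history: Deque[str] = deque(maxlen=2)
# Placeholder assessments for S6 and S8; S10 is the affordance that brings a Y.
rounds = [("S6", "O-o"), ("S8", "O-o"), ("S10", "Yee!")]
learned = [learn_second_order(history, schema, mood) for schema, mood in rounds]
```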