Artificial Developmental Learning: July 2009

Thursday, July 9, 2009

ERNEST

EVOLUTIONIST: A "trial and error" method that keeps what works.
PRAGMATIC: Knowledge is used to fulfill goals and satisfactions: "Meaning is use".
SELF-ORIENTED: An "unsupervised learning" that may however use pedagogical situations.
LEARNING: A knowledge acquisition that participates in the agent's development.
CONSTRUCTIVIST: As opposed to "Platonist", knowledge is not "discovered" but "constructed".
BOTTOM-UP: Higher-level goals are constructed to better fulfill lower-level inborn satisfactions.

Wednesday, July 8, 2009

Ernest 6.4

Ernest 6.4 is the same as Ernest 6.3 except that he has four elementary schemas: move forward, turn right, turn left, and sense. Move forward succeeds if there is no wall ahead and Ernest can move; it fails if he bumps into a wall. Turn left and turn right never fail. Sense succeeds if there is a wall ahead and fails if there is not. Ernest 6.4 can only sense the square directly in front of him; we can picture this sense as an antenna.

Ernest also has six elementary acts (turn left and turn right each count as one) with the following satisfaction values:
- (Move forward, Succeed, 5) He "loves" moving forward.
- (Move forward, Fail, -5) He "hates" bumping into walls.
- (Turn left or turn right, Succeed, -1) He does not much "like" turning.
- (Sense, Succeed, -1) He does not much "like" sensing a wall in front of him.
- (Sense, Fail, 0) He is "indifferent" to sensing an empty square in front of him.
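
The schema outcomes and satisfaction values listed above can be sketched as a small lookup table. This is only an illustration in Python, not Ernest's actual code: the names `SATISFACTION` and `enact`, and the boolean `wall_ahead` argument, are assumptions made for the example.

```python
# Hypothetical sketch: names and structure are illustrative, not Ernest's code.

# Satisfaction value of each elementary act, keyed by (schema, succeeded).
SATISFACTION = {
    ("move_forward", True): 5,    # he "loves" moving forward
    ("move_forward", False): -5,  # he "hates" bumping into walls
    ("turn_left", True): -1,      # turning never fails
    ("turn_right", True): -1,
    ("sense", True): -1,          # a wall is sensed ahead
    ("sense", False): 0,          # the square ahead is empty
}


def enact(schema, wall_ahead):
    """Return (succeeded, satisfaction) for one elementary schema."""
    if schema == "move_forward":
        succeeded = not wall_ahead
    elif schema == "sense":
        succeeded = wall_ahead
    elif schema in ("turn_left", "turn_right"):
        succeeded = True
    else:
        raise ValueError(f"unknown schema: {schema}")
    return succeeded, SATISFACTION[(schema, succeeded)]
```

With this table, `enact("sense", True)` returns `(True, -1)` and `enact("move_forward", True)` returns `(False, -5)`.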

In this environment, when Ernest performs a sense schema, the sensed square flashes yellow. When he bumps into a wall, the bumped wall flashes red and Ernest says "Ouch!".

At the beginning, Ernest does not know the connection between the sense schemas and the move schemas. But he progressively learns that when a sense schema succeeds he should not perform a move forward schema because it will fail. Then, he learns a good second-order schema consisting of performing a sense schema before moving forward. This second-order schema can be seen as an efficient "strategy" to avoid dissatisfaction.
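
To see why sensing first pays off, here is a rough comparison of the satisfaction collected with and without it. The two policies, the function names, and the choice to turn left after sensing a wall are assumptions made for this example; Ernest learns his own schemas rather than being given these.

```python
# Hypothetical comparison; both policies are assumptions for illustration.
SATISFACTION = {
    ("move_forward", True): 5,
    ("move_forward", False): -5,
    ("turn_left", True): -1,
    ("sense", True): -1,
    ("sense", False): 0,
}


def blind_move(wall_ahead):
    """Move forward without sensing first."""
    return SATISFACTION[("move_forward", not wall_ahead)]


def sense_then_act(wall_ahead):
    """Sense first; move forward only if the square ahead is empty."""
    total = SATISFACTION[("sense", wall_ahead)]
    if wall_ahead:
        total += SATISFACTION[("turn_left", True)]  # turn instead of bumping
    else:
        total += SATISFACTION[("move_forward", True)]
    return total


for wall in (False, True):
    print(wall, blind_move(wall), sense_then_act(wall))
```

Under these assumptions, sensing costs nothing when the square is empty (5 either way) and turns a -5 bump into a total of -2 when there is a wall.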

When an abstract schema fails during its enaction, Ernest says "No!" and abandons the schema. When an abstract schema fully succeeds, he says "Good!". So the appearance of "No!" and "Good!" indicates that Ernest has started enacting abstract schemas.

In this video, we can see that Ernest first learns a first layer of abstract schemas made of the sequence sense - move forward, then he learns a second layer that gives him the best satisfaction he can get in this environment by moving forward twice and turning once.

Interestingly, Ernest does not make any distinction between perception and action. He only has perceivomotor schemas that can succeed or fail. I had to reprogram his environment to handle this mechanism, because currently available environments (such as those provided in the Soar package) are based on the classical perception/computation/action cycle.

Although many authors (since Piaget or before) have argued that perception and action should be kept embedded, to my knowledge Ernest is the first implementation to do so. Or is he?

Friday, July 3, 2009

Ernest 6.3

Ernest 6.3 now has a fully recursive hierarchical schema mechanism implemented.

It explicitly implements the notion of "act", which is defined as a triple (schema, status, satisfaction). For example, the second line of the trace indicates that act A1 has been enacted, meaning that schema S2 was enacted with a resulting status of F (Fail), corresponding to a satisfaction of -1.

The two elementary schemas are now called S2 (doing A) and S3 (doing B). Ernest is initialized with four elementary acts: A1=(S2,F,-1), A2=(S2,S,1), A3=(S3,S,1), and A4=(S3,F,-1).

Now, a schema is defined as a triple (context act, intention act, weight). For example, the third line indicates that the schema S4 was constructed with the context act A2, the intention act A1, and the weight 1 (the context is arbitrarily initialized in Ernest's short-term memory as A2). Thus, S4 expects S2 to fail in a context where S2 has succeeded.

The third line also indicates that an act A5 was constructed. A5=S4S0 means that S4 has succeeded and has a satisfaction value of 0. The satisfaction value of an act is computed as the sum of the satisfaction values of its schema's context and intention. Satisfaction(A5) = satisfaction(A2) + satisfaction(A1) = 1 - 1 = 0.
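
The act and schema triples, and the satisfaction sum for A5, can be written down as a short sketch. The class layout and the helper `succeeded_act` are assumptions made for illustration; only the triples and the values of A1, A2, S4, and A5 come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Schema:
    """A schema is a triple (context act, intention act, weight)."""
    name: str
    context: Optional["Act"] = None    # None for elementary schemas
    intention: Optional["Act"] = None  # None for elementary schemas
    weight: int = 0


@dataclass(frozen=True)
class Act:
    """An act is a triple (schema, status, satisfaction)."""
    name: str
    schema: Schema
    status: str        # "S" (Succeed) or "F" (Fail)
    satisfaction: int


def succeeded_act(name, schema):
    """Build the act of a composite schema that fully succeeded.

    Its satisfaction is the sum of the satisfactions of the schema's
    context act and intention act.
    """
    total = schema.context.satisfaction + schema.intention.satisfaction
    return Act(name, schema, "S", total)


S2 = Schema("S2")               # elementary schema (doing A)
A1 = Act("A1", S2, "F", -1)
A2 = Act("A2", S2, "S", 1)
S4 = Schema("S4", context=A2, intention=A1, weight=1)
A5 = succeeded_act("A5", S4)
print(A5.satisfaction)  # 0
```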

The context is the list of the acts that have just been enacted. Lines 4 and 5 indicate that after the first cycle, the context is made of A1 and A5.

In this trace, we can see that after a while, Ernest stabilizes on the sequence S2F-S2S-S3F-S3S. Then he aggregates it as S8S-S16S. Then he aggregates this sequence as S24 = ( A9=S8S , A17=S16S , 1 ) and keeps on enacting S24S. Then he learns S112 = ( A25=S24S , A25=S24S , 1 ) and keeps on exploring until he is turned off.

Interestingly, to prevent a complexity explosion, I made the context exclude subschemas that are more than one level below the enacted schema. For instance, when S24 is enacted, the resulting context does not include S3S, even though it has also been enacted as S24's last sub-sub-schema. This can be understood as Ernest being only "aware" of the top-level enacted schemas. The lower-level schemas are enacted "unconsciously" from Ernest's viewpoint, unless they fail, in which case they would pop up again in the context.
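
One way to picture this depth limit is the sketch below. It is a simplification under stated assumptions, not Ernest's implementation: an act here directly holds its two sub-acts, the function `resulting_context` is hypothetical, following only the last sub-act is inferred from the A1/A5 example earlier, and the decomposition of S8S and S16S is guessed from the stabilized sequence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Act:
    label: str
    context: Optional["Act"] = None    # first sub-act (None if elementary)
    intention: Optional["Act"] = None  # last sub-act (None if elementary)


def resulting_context(enacted: Act, max_depth: int = 1) -> List[Act]:
    """Return the enacted act plus the acts that ended with it,
    going at most max_depth levels down."""
    acts = [enacted]
    current = enacted
    for _ in range(max_depth):
        if current.intention is None:
            break
        current = current.intention
        acts.append(current)
    return acts


# Assumed decomposition: S8 = (S2F, S2S), S16 = (S3F, S3S), S24 = (S8S, S16S).
s2f, s2s, s3f, s3s = Act("S2F"), Act("S2S"), Act("S3F"), Act("S3S")
s8s = Act("S8S", s2f, s2s)
s16s = Act("S16S", s3f, s3s)
s24s = Act("S24S", s8s, s16s)

print([a.label for a in resulting_context(s24s)])  # ['S24S', 'S16S']
```

S3S sits two levels below S24S, so it stays out of the context, which is the behaviour described above.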

I think this recursive mechanism makes Ernest able, in principle, to learn any regularity in his environment, which excites me a lot. The learning time would probably explode as the regularity gets arbitrarily complex, but that's OK because we only expect cognitive agents to learn regularities that are hierarchically structured.
