In this section, we discussed the functionalities of and mutual dependencies between perception, action, and communication. While some important observations were derived, the presented model is very naive and needs to be improved in the following points:
As a step toward a complete AVA model taking these points into account, Asada [15], a core member of the project, proposed the following linear dynamic system to model an embodied AVA without message exchange capability. In what follows, we discuss its characteristics and limitations to clarify the state of the art and future technical problems.
  x(t+1) = A x(t) + B u(t) + w(t)        (11)
  y(t)   = C x(t) + v(t)                 (12)

where x(t), u(t), and y(t) denote the n dimensional state vector, the m dimensional action code vector, and the q dimensional percept vector, respectively. w(t) and v(t) represent n and q dimensional noise vectors, respectively.
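The linear dynamic system above can be sketched as a short simulation. This is an illustrative toy, not Asada's implementation: the dimensions and system matrices A, B, C are hypothetical placeholders (a real model would be identified from data), and the noise scale is arbitrary.

```python
import numpy as np

# Hypothetical dimensions: n-dim state, m-dim action code, q-dim percept.
n, m, q = 4, 2, 3
rng = np.random.default_rng(0)

# Placeholder system matrices; a real model would be identified from data.
A = 0.9 * np.eye(n)              # state transition (kept stable)
B = rng.standard_normal((n, m))  # how the action code drives the state
C = rng.standard_normal((q, n))  # how the state is reflected in the percept

def step(x, u, noise_scale=0.01):
    """One step of the linear dynamic system, equations (11)-(12)."""
    w = noise_scale * rng.standard_normal(n)   # state noise w(t)
    v = noise_scale * rng.standard_normal(q)   # observation noise v(t)
    x_next = A @ x + B @ u + w                 # (11): state transition
    y = C @ x + v                              # (12): percept
    return x_next, y

x = np.zeros(n)
for t in range(5):
    u = rng.standard_normal(m)   # action code vector at time t
    x, y = step(x, u)
```

Note that the percept y depends only on the current state x, which is exactly the simplification discussed below.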
Equation (11) represents a practical implementation
of equations (7) and
(8). Note that in equation
(11), the action code vector is
introduced as a descriptive embodiment of the action, while in
equations (7) and
(8), the action is modeled as a mapping
function.
Comparing equations (1) and (12), the world state in the former is replaced with the AVA's internal state in the latter. This means that the above linear model assumes that the AVA's action is directly and immediately reflected onto its percept. This assumption implies the elimination of the world state and hence the neglect of the information flow from Action to Perception via the world state illustrated in Fig. 6. Consequently, the applicability of
the model is limited to rather simple worlds. To cope with this
limitation, Asada applied the model to each object in the world;
different linear models were prepared for different objects in the
world.
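Asada's workaround of preparing one linear model per object can be sketched as keeping a dictionary of independent models keyed by object identity. The class, the object names, and the matrices below are all hypothetical illustrations, not the original implementation.

```python
import numpy as np

class LinearModel:
    """One linear dynamic model dedicated to a single object (noise omitted)."""
    def __init__(self, A, B, C):
        self.A, self.B, self.C = A, B, C
        self.x = np.zeros(A.shape[0])

    def step(self, u):
        self.x = self.A @ self.x + self.B @ u   # state update, eq. (11)
        return self.C @ self.x                  # predicted percept, eq. (12)

# A separate model for each object in the world (names are made up).
models = {
    "ball":   LinearModel(0.9 * np.eye(2), np.ones((2, 1)), np.eye(2)),
    "person": LinearModel(0.5 * np.eye(2), np.ones((2, 1)), np.eye(2)),
}

# Each object's dynamics are tracked independently.
y_ball = models["ball"].step(np.array([1.0]))
```

This keeps each model simple at the cost of maintaining as many models as there are objects.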
The above linear model has a more crucial limitation. It basically
considers the action code vector as input and the percept vector as
output. That is, while the model represents how the AVA's action is
reflected onto the percept (i.e. the action-driven perception process),
its reciprocal process, i.e. how the percept is used to change the
state and then select the action (i.e. the perception-driven
state-change process followed by the action-selection process), is not
modeled explicitly. To compensate for this limitation, a pair of
functions mapping the percept into the state and then the state into
the action was incorporated. The former function is described by a
transformation matrix analytically derived by Canonical Variate
Analysis. The latter function, on the other hand, is represented by an
associative memory where (state, action) tuples are stored. This
representation is acquired from training data through Q learning and
allows the flexible activation of actions depending on the internal
state.
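The state-to-action associative memory acquired through Q learning can be sketched as a tabular Q-learning loop. The state labels, actions, and reward below are hypothetical stand-ins; in the original work the states are derived from percepts via Canonical Variate Analysis.

```python
import random
from collections import defaultdict

# Associative memory: Q-values stored over (state, action) tuples.
ACTIONS = ["approach", "avoid", "stop"]
Q = defaultdict(float)

def select_action(state, eps=0.1):
    """Epsilon-greedy activation of an action depending on the state."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard Q-learning update on the stored (state, action) tuple."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next
                                   - Q[(state, action)])

# Toy training loop: reward "approach" in a made-up state "target-visible".
random.seed(0)
for _ in range(200):
    a = select_action("target-visible")
    update("target-visible", a,
           1.0 if a == "approach" else 0.0, "target-visible")
```

After training, the memory activates "approach" whenever the internal state is "target-visible", illustrating how action selection is conditioned on the state rather than hard-wired.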
Finally we should note the meaning of the state. In the above linear
system, the state vector represents the instantaneous state, while the
state we used earlier means the persistent state denoting the memory.
That is, equations (11) and (12) merely define a filter function
without memory.
We believe that AVAs should have memories to work adaptively in a wide
spectrum of situations in the real world. In fact, Asada extended the
input and output vectors of the system as follows to characterize
temporal features over a certain length of time period:

  U(t) = [u(t), u(t-1), ..., u(t-l+1)]
  Y(t) = [y(t), y(t-1), ..., y(t-k+1)]

Note that these extended vectors really model first-in-first-out memories (i.e. queues) of size l and k, respectively.
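The FIFO character of the extended vectors can be sketched directly with bounded queues. The sizes l and k and the vector contents below are illustrative assumptions only.

```python
from collections import deque
import numpy as np

# FIFO memories of the last l action codes and the last k percepts.
l, k = 3, 2
u_queue = deque(maxlen=l)   # queue of size l for action code vectors
y_queue = deque(maxlen=k)   # queue of size k for percept vectors

for t in range(5):
    # Appending a new vector silently drops the oldest one (FIFO).
    u_queue.append(np.array([float(t)]))
    y_queue.append(np.array([float(t), 0.0]))

# Stacking the queue contents yields the extended input/output vectors.
U = np.concatenate(list(u_queue))   # length l * m
Y = np.concatenate(list(y_queue))   # length k * q
```

After five steps only the most recent l = 3 action codes and k = 2 percepts survive, which is exactly the bounded memory the extended vectors provide.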
As shown by the discussions in this section, our study exploring the integration of perception, action, and communication has made only a small step forward. Deep consideration across a wide spectrum of disciplines, including control theory, software science, psychology, and linguistics, is required to complete the model.