First we define Active Vision Agent (AVA, in short) as a rational agent with visual perception, action, and communication capabilities. Let denote the internal state of ith AVA, and the state of the world where a group of AVAs are embeded.
Then, intuitively, perception and action by a single can be modeled by the following state mapping functions (Fig. 4):
where denotes entities perceived by . We introduce to link and . It denotes the perception-driven and/or autonomous internal state transition by :
Figure 4: A primitive model of interaction between perception and action.
Since includes various camera parameters such as camera position, viewing angle, zooming factor, focus, and so on, depends on ; changes depending on . , on the other hand, can modify its own internal state as well as the state of the world :
We call the implementation of a vacuous AVA an Observation Station: real time image processor with active camera(s) embedded in the world.
The discrimination between embodied and vacuous AVAs plays a crucial role in defining the meaning of communication.