Third International Workshop on
Cooperative Distributed Vision
Presentations by Project Members
Takashi Matsuyama (Kyoto University)
This paper proposes a novel scheme of active vision named Dynamic
Vision. It is best characterized by rich interactions between visual
perception and camera action modules. The richness is twofold: 1) Rich
information is exchanged between the modules to realize both stable
image processing and adaptive camera control. 2) Rich dynamic
interactions between the modules are realized without disturbing their
own intrinsic dynamics. To implement a dynamic vision system, we
propose the Dynamic Memory Architecture, where perception and
action modules share what we call the Dynamic Memory. It
maintains not only continuous temporal histories of state variables,
such as the camera's pan-tilt angles and the target object's location,
but also their predicted future values. Perception and action modules
are implemented as parallel processes that dynamically read from and
write into the memory according to their own individual dynamics. The
dynamic memory supports such asynchronous dynamic interactions (i.e.,
data exchanges between the modules) without making the modules wait
for synchronization. A prototype system for real-time moving object
tracking demonstrated the effectiveness of the proposed idea.
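To make the architecture concrete, here is a minimal Python sketch of
the Dynamic Memory idea, assuming simple linear interpolation and
extrapolation as the prediction model; the class and method names are
ours for illustration, not the paper's implementation.

    import threading
    import bisect

    class DynamicMemory:
        """Shared store of time-stamped state variables.

        Writers append (time, value) samples; readers ask for the value
        at an arbitrary time t. Past values are interpolated and future
        values are extrapolated, so no reader ever blocks on a writer.
        """
        def __init__(self):
            self._lock = threading.Lock()
            self._histories = {}  # variable name -> sorted list of (t, value)

        def write(self, name, t, value):
            with self._lock:
                bisect.insort(self._histories.setdefault(name, []), (t, value))

        def read(self, name, t):
            with self._lock:
                hist = list(self._histories[name])
            if len(hist) == 1:
                return hist[0][1]
            i = bisect.bisect_left(hist, (t,))
            if i <= 0:                    # before the first sample
                (t0, v0), (t1, v1) = hist[0], hist[1]
            elif i >= len(hist):          # past the newest sample: predict
                (t0, v0), (t1, v1) = hist[-2], hist[-1]
            else:                         # between samples: interpolate
                (t0, v0), (t1, v1) = hist[i - 1], hist[i]
            return v1 if t1 == t0 else v0 + (v1 - v0) * (t - t0) / (t1 - t0)

An action module can thus ask for the target location at its actuation
time, slightly in the future, instead of waiting for the perception
module's next observation.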
Michihiko Minoh and Yoshinari Kameda (Kyoto University)
Based on the research of the CDV project, we are constructing a
distance learning environment intended for practical use.
The purpose is to evaluate our imaging method, particularly its
handling of dynamic situations, in the context of distance learning,
and to improve the method.
Practical problems are also discussed.
Eiji Uchibe and Minoru Asada (Osaka University)
The vector-valued reward function is discussed in the context of
multiple behavior coordination, especially in a dynamically changing
multiagent environment. Unlike the traditional weighted sum of
several reward functions, we define a vector-valued value function
that evaluates the current action strategy by introducing a discount
matrix to integrate the several reward functions. Owing to this
extension of the value function, the learning agent can appropriately
estimate the multiple future rewards from the environment without
suffering from the weighting problem. The proposed method is applied
to a simplified soccer game; computer simulation results are shown
and discussed.
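To make the idea concrete, here is a minimal sketch of a
temporal-difference update that uses a discount matrix in place of the
usual scalar discount factor; the tabular setting and all names are
illustrative assumptions, not the authors' exact formulation.

    import numpy as np

    def td_update(V, s, s_next, r, Gamma, alpha=0.1):
        """One TD step on a vector-valued value table.

        V     : dict mapping state -> value vector (one entry per reward)
        r     : reward vector observed on the transition s -> s_next
        Gamma : discount matrix replacing the scalar gamma; off-diagonal
                terms couple the individual reward channels
        """
        v_next = V.get(s_next, np.zeros_like(r))
        v = V.get(s, np.zeros_like(r))
        V[s] = v + alpha * (r + Gamma @ v_next - v)
        return V

    # With a diagonal Gamma this reduces to independent scalar learners,
    # one per reward function, so no weighted sum is ever needed.
    Gamma = np.diag([0.9, 0.8])
    V = td_update({}, "s0", "s1", np.array([1.0, 0.0]), Gamma)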
Hiroshi Kimura*, Koichi Ogawara**
and Katsushi Ikeuchi**
(* University of Electro-Communications,
** University of Tokyo)
In order to assist a human, a robot must recognize the human's motion
in real time by vision, and must plan and execute the needed
assistance motion based on the task purpose and context. In this
research, we tried to solve these problems. We defined an abstract
task model, analyzed human demonstrations using events and an event
stack, and automatically generated the task models needed for
assistance by the robot. During cooperation with the human, the robot
planned and executed appropriate assistance motions based on the task
models and the observed human motions. We implemented a 3D object
recognition system and a human grasp recognition system using
trinocular stereo color cameras and a real-time range finder. The
effectiveness of these methods was tested through an experiment in
which a human and a robotic hand assembled toy parts in cooperation.
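As a rough illustration of the event-stack analysis, the sketch below
pairs grasp and release events observed in a demonstration into task
steps; the event types and the task-model structure are our
assumptions, not the paper's definitions.

    from dataclasses import dataclass

    @dataclass
    class Event:
        kind: str    # e.g. "grasp", "attach", "release"
        obj: str     # object the event refers to
        t: float     # time stamp from the vision system

    def build_task_model(events):
        """Turn a time-ordered event sequence into task steps.

        Each grasp opens a step and the matching release closes it,
        so nested manipulations pop off the stack in the right order.
        """
        stack, steps = [], []
        for e in sorted(events, key=lambda e: e.t):
            if e.kind == "grasp":
                stack.append(e)
            elif e.kind == "release" and stack:
                start = stack.pop()
                steps.append((start.obj, start.t, e.t))
        return steps  # (object, start time, end time) per manipulation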
Koichiro Deguchi (Tohoku University) and Ikuko Shimizu (Saitama University)
We present a new method for fine registration of two range
images from different viewpoints that have already been roughly
registered. Our method takes into account the characteristics of the
measurement error of the range images. The error distribution is
different for each point of the image and is usually dependent on
the viewing direction and the distance to the object surface. We
represent one of the two range images by a set of triangular patches,
and find both the best transformation between the two range images
and the true position of each measured point, such that the measured
points of the second range image lie on the surfaces of the
triangular patches. For a given transformation, each measured point
is corrected toward its true position: the direction of the
correction follows the distribution of the measurement error, and its
amount is governed by the error variance. The best transformation is
selected by evaluating how easily each measured point can be
corrected in this way.
The experimental results showed that our method produced
better results than conventional ICP methods.
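The flavor of the error-aware objective can be sketched as follows,
assuming every measured point carries a 3x3 covariance estimated from
its viewing direction and range, and that all inputs are NumPy arrays;
the Mahalanobis-style weighting illustrates the idea and is not the
authors' exact algorithm.

    import numpy as np

    def registration_cost(R, t, points, covariances, planes):
        """Sum of squared plane distances, each scaled by the error
        variance along the plane normal, so that uncertain points
        (grazing view, long range) are penalized less."""
        cost = 0.0
        for p, C, (n, d) in zip(points, covariances, planes):
            q = R @ p + t        # transform the point into the first
                                 # range image's coordinate frame
            dist = n @ q - d     # signed distance to the matched
                                 # triangular patch's plane n.x = d
            var = n @ C @ n      # measurement variance along n
            cost += dist * dist / var
        return cost

Minimizing this cost over (R, t) moves each point along the direction
favored by its error distribution, by an amount governed by its
variance, which mirrors the correction described above.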
Takekazu Kato, Yasuhiro Mukaigawa and Takeshi Shakunaga (Okayama University)
We have proposed a concept of cooperative distributed registration
and recognition for effective face registration and recognition in
natural environments.
In cooperative distributed registration, distributed cameras are used
effectively to capture a variety of face images.
Each camera cooperatively selects a suitable target person according
to the person's location and pose, as well as the face images already
registered for that person. This paper presents experimental
results on a testbed system with 12 cameras in real environments.
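A toy sketch of the per-camera target selection might look like the
following, where each camera prefers nearby people whose current pose
is still missing from their registered face images; the scoring terms
are illustrative assumptions, not the authors' actual criterion.

    import math

    def select_target(camera_pos, people, registered_poses):
        """people: list of (id, (x, y), pose_bin) from the tracker."""
        def score(person):
            pid, pos, pose = person
            dx, dy = pos[0] - camera_pos[0], pos[1] - camera_pos[1]
            closeness = 1.0 / (1.0 + math.hypot(dx, dy))
            novelty = 0.0 if pose in registered_poses.get(pid, set()) else 1.0
            return closeness + novelty
        return max(people, key=score)

    # Each camera runs this locally; cooperation then resolves conflicts
    # so that two cameras do not chase the same person.
    target = select_target((0.0, 0.0),
                           [("A", (1.0, 2.0), "left"),
                            ("B", (3.0, 1.0), "front")],
                           {"A": {"left", "front"}})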
Rin-ichiro Taniguchi, Daisaku Arita and Satoshi Yonemoto (Kyushu University)
Real-time parallel image processing and analysis on a PC cluster
requires data transfer, synchronization, and error recovery, but
these mechanisms are difficult for a programmer to describe. To
solve this problem, we are developing a programming tool for real-time
image processing on a PC cluster called RPV (Real-time Parallel
Vision). Using this tool, a programmer specifies only the data
flow between PCs and the image processing algorithms to run on each
PC. In this paper, we outline the specification of RPV and explain
its programming methodology.
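The programming style RPV aims at can be suggested with a sketch; the
per-PC functions and the pipeline description below are hypothetical,
written only to illustrate the dataflow idea, and are not RPV's actual
interface.

    def capture(frame):         # runs on the camera PC
        return frame

    def subtract_bg(frame):     # runs on a worker PC
        return frame            # ... image processing algorithm here

    def integrate(results):     # runs on the integration PC
        return results          # ... merge per-camera results here

    # The programmer only names the per-PC algorithms and wires them up;
    # inter-PC transfer, buffering, synchronization, and error recovery
    # are left to the tool.
    pipeline = [
        ("pc1", capture,     ["pc2"]),   # pc1 streams frames to pc2
        ("pc2", subtract_bg, ["pc3"]),   # pc2 processes and forwards
        ("pc3", integrate,   []),        # pc3 collects the results
    ]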
Shogo Tokai, Toshikazu Wada and Takashi Matsuyama (Kyoto University)
Volume intersection using silhouette images observed by multiple
cameras is a popular approach to reconstructing the 3D shapes of
objects in a scene.
Besides a sufficient number of cameras, an efficient algorithm is
needed to accurately reconstruct 3D shape in real time with this
approach. In this paper, we propose a novel approach to real-time 3D
reconstruction based on volume intersection. Our approach
is twofold: one part improves the reconstruction method by
using a plane-to-plane linear projection, and the other
implements parallel algorithms, suited to the system architecture,
on a PC cluster system. We show experimental results
with our PC cluster system, which consists
of 9 pan-tilt-zoom cameras and 10 PCs connected by a high-speed
network.
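A minimal sketch of the plane-based step follows, assuming the working
volume is sliced into parallel planes and each camera's silhouette is
mapped onto a slice by a 3x3 homography; the matrices and shapes are
placeholders for calibration data we do not have, and in the cluster
implementation the per-slice or per-camera work would naturally be
distributed over the PCs.

    import numpy as np

    def intersect_slice(silhouettes, homographies, shape):
        """AND together all silhouettes warped onto one volume slice."""
        h, w = shape
        ys, xs = np.mgrid[0:h, 0:w]
        pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
        slice_mask = np.ones(h * w, dtype=bool)
        for sil, H in zip(silhouettes, homographies):
            p = H @ pts                          # slice point -> image point
            u = (p[0] / p[2]).round().astype(int)
            v = (p[1] / p[2]).round().astype(int)
            ok = (0 <= u) & (u < sil.shape[1]) & (0 <= v) & (v < sil.shape[0])
            hit = np.zeros(h * w, dtype=bool)
            hit[ok] = sil[v[ok], u[ok]]
            slice_mask &= hit                    # a cell survives only if every
                                                 # camera sees it in a silhouette
        return slice_mask.reshape(h, w)

Stacking the resulting slice masks yields the reconstructed volume,
with each slice computed by plane-to-plane linear projections instead
of per-voxel perspective projections.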
Takashi Matsuyama (Kyoto Univ.), Hitoshi Habe (Mitsubishi
Electric Corp.), Ryo Yumiba (Hitachi, Ltd.) and Kazuya
Tanahashi (Kyoto Univ.)
Background subtraction is a simple but effective method
for detecting moving objects in video images. However, since it
assumes that image variations are caused only by moving objects, its
applicability is limited. In this paper, we propose a background
subtraction method that is robust under varying illumination. To
achieve this, we focus on two illumination-invariant features,
texture and normalized intensity, and integrate the detection
results obtained with them.
Experimental results demonstrate the method's
robustness and effectiveness in real-world scenes.
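As a rough sketch of integrating the two cues on grayscale frames,
consider the following; the simple gradient-based texture measure, the
global normalization, and the thresholds are our assumptions for
illustration, not the paper's exact features.

    import numpy as np

    def detect(frame, bg, tau_n=0.1, tau_t=20.0):
        f = frame.astype(float)
        b = bg.astype(float)
        # Normalized intensity: brightness relative to the mean level
        # cancels a global illumination change.
        mask_n = np.abs(f / (f.mean() + 1e-6) - b / (b.mean() + 1e-6)) > tau_n
        # Texture: spatial gradients barely change under smooth
        # illumination variation but differ where an object appears.
        gy_f, gx_f = np.gradient(f)
        gy_b, gx_b = np.gradient(b)
        mask_t = np.hypot(gx_f - gx_b, gy_f - gy_b) > tau_t
        # Integrate both cues: flag pixels where both features disagree
        # with the background, suppressing illumination-only variation.
        return mask_n & mask_t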