Partners: SZTAKI, UPC, Bilkent, ACV
Visual surveillance and activity analysis have attracted great interest in computer vision research. Several algorithm libraries are available online (open-source or proprietary); however, their integration into a complex system is hindered by differences in implementation language, data format, processing speed, etc. The aim of this work is to produce a flexible, transparent system for activity analysis. The system provides a transparent interface to heterogeneous modules with different input-output requirements. The setup is hierarchical, which helps the scalability of the whole framework. The current implementation integrates diverse algorithms, forming a test-bed for unusual activity detection. Various complex surveillance-related algorithms, such as human and body action, tracking, and motion activity algorithms, are integrated into one system.
The architecture is designed to be as flexible as possible, in line with current trends and software tools. The modules can be distributed over the network and are organized into a hierarchical structure. The structure can be separated into four main entities: a) the client’s web interface, b) the server (possibly but not necessarily including the web server), c) the controller, and d) the communication interface embedded into the user module (see Fig. 2). Each component operates autonomously, communicating through RPC requests over TCP/IP.
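To make the module-to-controller communication concrete, here is a minimal sketch of a user module exposing a uniform interface over RPC. It is not the project's actual interface: the use of XML-RPC (Python's standard `xmlrpc` package, which runs over HTTP on TCP/IP), the class `DetectorModule`, and the method names `describe` and `process` are all assumptions made for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class DetectorModule:
    """Hypothetical user module wrapped by a communication interface."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        # Lets a controller discover the module's input-output requirements.
        return {"name": self.name, "input": "frame_id", "output": "score"}

    def process(self, frame_id):
        # Placeholder: a real module would run its detector here.
        return {"module": self.name, "frame_id": frame_id, "score": 0.0}


def serve_module(module, host="127.0.0.1", port=0):
    """Start the module's RPC endpoint in a background thread.

    Port 0 asks the operating system for a free port.
    Returns the server object and the port actually bound.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_instance(module)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def query_module(host, port, frame_id):
    """Controller side: send one RPC request to a module."""
    with ServerProxy(f"http://{host}:{port}", allow_none=True) as proxy:
        return proxy.process(frame_id)
```

Because every module answers the same two calls, a controller can treat modules written in different languages or running on different machines uniformly, which is the property the hierarchical setup relies on.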
Detector modules to be integrated:
(i) Human model and motion based unusual event detection (UPC): To achieve a simple motion representation, the concepts of Motion History Image (MHI) and Motion Energy Image (MEI) were introduced. This representation has recently been used for monocular gait recognition tasks and activity modeling. We have extended this formulation to represent view-independent 3D motion. A simple ellipsoid body model was fit to the incoming 3D data to capture the body part in which the gesture occurs, thus increasing the recognition rate of the overall system and generating a more informative classification output. Data produced by the body and motion analysis modules is processed to extract a vector of features for classification. Statistical moments invariant to scaling, translation, rotation and affine mappings have been used. We constructed a 12-dimensional feature vector. For each scenario, a mixture of Gaussians probability model is trained on the feature vectors of usual events (for instance, people walking and people standing). The detection of unusual events is based on classifying each feature vector as belonging to this model or not.
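The MHI and MEI used by module (i) can be illustrated with the commonly used 2D update rule: a pixel that moves in the current frame is set to a duration value tau, and every other pixel decays by one per frame. The sketch below is the 2D case only, not the view-independent 3D extension described above; the function names and the list-of-lists image layout are assumptions.

```python
def update_mhi(mhi, motion_mask, tau):
    """Return the next Motion History Image.

    mhi         -- 2D list of non-negative integers (previous history)
    motion_mask -- 2D list of 0/1 values (1 where motion was detected)
    tau         -- number of frames a motion trace stays visible
    """
    return [
        [tau if moving else max(0, h - 1) for h, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(mhi, motion_mask)
    ]


def mei_from_mhi(mhi):
    """Motion Energy Image: 1 wherever any recent motion is recorded."""
    return [[1 if h > 0 else 0 for h in row] for row in mhi]
```

Recent motion therefore appears brighter than older motion in the MHI, while the MEI only records where motion occurred within the last tau frames.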
(ii) Non-parametric clustering for object detection (ACV): Fast mean shift-based clustering in 2D digital images is introduced using integral images. The fast clustering step is used to delineate objects directly in a difference image obtained by a standard adaptive background subtraction technique in an automated visual surveillance system. A novel occlusion handling scheme is implemented, which significantly improves the tracking performance even in the presence of a large overlap between objects.
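The speed-up in module (ii) rests on integral images, which return the sum over any axis-aligned box in constant time. The following sketch shows only that building block, not the module's clustering or occlusion handling; the function names and the half-open box convention are assumptions.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img over rows [0, y) and columns [0, x).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of the original image over columns [x0, x1) and rows [y0, y1)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
```

With a uniform kernel, the window mass and the first-order sums needed for a mean shift step can each be read from such a table with four lookups, independent of the window size.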
(iii) Kernel-based tracking using motion features for multiple targets (ACV): A fast kernel-based tracking algorithm is applied to track density maxima in a difference image. The principal advantages of this tracking strategy are: (1) the data association problem is solved implicitly, since the mode seeking procedure is guided to the nearby mode along the steepest density gradient.
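The mode seeking step of module (iii) can be sketched as a mean shift iteration on the difference image: a window is repeatedly moved to the weighted centroid of the pixels it covers until it settles on a local density maximum. The square uniform kernel, the function name and the stopping parameters below are assumptions; the module's actual kernel is not specified here.

```python
def mean_shift_mode(diff, x, y, radius, max_iter=50, eps=0.5):
    """Move (x, y) to a nearby density maximum of the difference image.

    diff   -- 2D list of non-negative weights (difference image)
    radius -- half-size of the square window (uniform kernel, assumed)
    Returns the converged (x, y) position as floats.
    """
    h, w = len(diff), len(diff[0])
    x, y = float(x), float(y)
    for _ in range(max_iter):
        cx, cy = round(x), round(y)
        total = sx = sy = 0.0
        for yy in range(max(0, cy - radius), min(h, cy + radius + 1)):
            for xx in range(max(0, cx - radius), min(w, cx + radius + 1)):
                weight = diff[yy][xx]
                total += weight
                sx += weight * xx
                sy += weight * yy
        if total == 0:
            break  # empty window: nothing to follow
        new_x, new_y = sx / total, sy / total
        shift = abs(new_x - x) + abs(new_y - y)
        x, y = new_x, new_y
        if shift < eps:
            break
    return x, y
```

Starting each track from its previous position means the window climbs to the closest mode, so no explicit matching between detections and tracks is needed.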
(iv) Multi-modal Method for Detecting Fight Among People at Unattended Places (BILKENT): Recently, intelligent video analysis systems capable of detecting humans, cars, etc. have been developed. Such systems mostly use HMMs or SVMs to reach decisions. They detect important events, but they also produce false alarms. It is possible to take advantage of other low-cost sensors, including audio, to reduce the number of false alarms. Most video recording systems are capable of recording audio as well. Analysis of audio for intelligent information extraction is a relatively new area. Automatically detected broken glass sounds, car crash sounds, screams, and a rising background sound level are indicators of important events. By combining the information coming from the audio channel with the information from the video channels, reliable surveillance systems can be built.
(v) Unusual motion pattern detection (MTA-SZTAKI): Intelligent visual surveillance is an increasingly important part of computer vision research. One of the most important goals of visual surveillance systems is to analyze the activity of the observed objects in order to detect anomalies, predict future behaviors, or predict potential unusual events before they occur. Many approaches have been proposed to model the activity of dynamic scenes. Analysis of motion patterns is an effective approach for learning the observed activity. Most of the time, objects in the scene do not move randomly; they usually follow well-defined motion patterns. Knowledge of usual motion patterns can be used to detect anomalous motion patterns of objects.
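One simple way to use knowledge of usual motion patterns is to learn, for each image region, how often objects move in each direction, and to flag motion whose learned probability is low. The sketch below illustrates that idea only and is not the method of module (v); the grid cells, the eight direction bins, the smoothing, the threshold and all names are assumptions.

```python
import math
from collections import defaultdict

N_BINS = 8  # number of direction bins (assumed)


def direction_bin(dx, dy):
    """Quantize a motion vector into one of N_BINS direction bins.

    Bins are centred on the main directions, so bin 0 covers motion
    roughly along the positive x axis.
    """
    angle = (math.atan2(dy, dx) + math.pi / N_BINS) % (2 * math.pi)
    return int(angle / (2 * math.pi) * N_BINS) % N_BINS


class MotionPatternModel:
    """Per-cell histogram of observed motion directions."""

    def __init__(self, cell_size=16):
        self.cell_size = cell_size
        self.counts = defaultdict(lambda: [0] * N_BINS)

    def _cell(self, x, y):
        return (int(x) // self.cell_size, int(y) // self.cell_size)

    def learn(self, x, y, dx, dy):
        """Record one observed motion vector at image position (x, y)."""
        self.counts[self._cell(x, y)][direction_bin(dx, dy)] += 1

    def probability(self, x, y, dx, dy):
        """Smoothed probability of this direction in this cell."""
        hist = self.counts.get(self._cell(x, y), [0] * N_BINS)
        # Add-one smoothing keeps unseen directions at a small non-zero value.
        return (hist[direction_bin(dx, dy)] + 1) / (sum(hist) + N_BINS)

    def is_unusual(self, x, y, dx, dy, threshold=0.05):
        return self.probability(x, y, dx, dy) < threshold
```

After observing many objects moving in one direction through a region, motion in the opposite direction through the same region falls below the threshold and is reported as unusual.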
Contributors: István Petrás, Csaba Beleznai, Yiğithan Dedeoğlu, Montse Pardàs, Levente Kovács, Zoltán Szlávik, László Havasi, Tamás Szirányi, B. Uğur Töreyin, Uğur Güdükbay, A. Enis Çetin, Cristian Canton-Ferrer