Person Re-Identification in Security Camera Networks
Duration: 1/2017 – 12/2019
Introduction and Methodology
Person re-identification (PRID) is a crucial component of multi-camera networks in real-world applications such as surveillance, automation, and robotics.
The fundamental re-identification problem involves matching a person of interest observed in a "probe" camera to an image or video of the same person from a "gallery" of candidates captured at a different time by another camera, which does not necessarily have a field of view overlapping with that of the probe camera. Recently, considerable progress has been achieved in this domain, resulting from the development of relevant computer vision and machine learning algorithms.
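The probe-to-gallery matching described above can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a fixed-length appearance descriptor and that candidates are ranked by cosine similarity; the descriptor values, the identifiers `person_a` to `person_c`, and the function names are hypothetical and chosen only for illustration, not taken from our pipeline.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(probe, gallery):
    """Return gallery IDs sorted from most to least similar to the probe."""
    scores = {pid: cosine_similarity(probe, feat) for pid, feat in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 3-dimensional descriptors; real descriptors are far larger.
probe = [0.9, 0.1, 0.3]
gallery = {
    "person_a": [0.1, 0.9, 0.2],
    "person_b": [0.8, 0.2, 0.4],
    "person_c": [0.3, 0.3, 0.9],
}

ranking = rank_gallery(probe, gallery)
print(ranking)
```

The first entry of the ranking is the re-identification hypothesis; rank-based evaluation measures count how often the true identity appears among the top entries.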
Although face recognition (e.g. at the Berlin-Südkreuz railway station) and gait recognition perform well, we follow an appearance-based strategy so that we can also re-identify uncooperative persons and handle difficult camera viewpoints. Furthermore, we address only short-term PRID, which does not handle issues such as changes of clothing. Short-term PRID is useful for multi-camera tracking in public spaces such as airports and railway stations.
Today, supervised learning algorithms outperform unsupervised approaches by a large margin, and additional information from active sensors such as the Kinect or passive sensors such as stereo cameras can further improve the results.
However, the problem is still far from solved, especially in real-world applications, where training data are typically scarce and performance drops between training data and real-world data. An essential problem is the conflict between high intra-person variation and low inter-person variation: different persons seen from an identical perspective may appear more similar than the same person viewed from different perspectives. In such cases, human operators would not only compare the images as a whole, but also consider details such as the hairstyle, the colour of the shoes or a handbag. These details are typically not visible from all viewing directions.
With practical applications in mind, we propose a novel approach to person re-identification that exploits multi-view information from fisheye cameras looking downwards from the ceiling (cf. Fig. 1).
Figure 1: Motivation for using a fisheye camera.
Left (central projection): narrow field of view, random camera pose (top), and nadir pose of another scene (bottom). Right (fisheye projection): nadir camera pose. While the fisheye sequence contains side, front, top and back views of persons, it is nearly impossible to capture the same views with a central projection.
We address the high intra-person and low inter-person variation by utilizing high-level view information, which is orthogonal to low- and mid-level features.
To integrate the highly variable multi-view information, we build a generic pipeline for fisheye cameras based on geometric sensor modelling and deep learning.
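To make the geometric argument more concrete, the following sketch compares where a ray lands in the image under a central projection and under a fisheye projection. It assumes the equidistant fisheye model (r = f · θ) as one common choice; the sensor model actually used in our pipeline may differ. The focal length, camera height and distances are hypothetical values.

```python
import math


def radius_central(theta, f):
    """Image radius under central (perspective) projection: r = f * tan(theta)."""
    if theta >= math.pi / 2:
        raise ValueError("rays at 90 degrees or more never reach the image plane")
    return f * math.tan(theta)


def radius_equidistant(theta, f):
    """Image radius under the equidistant fisheye model: r = f * theta."""
    return f * theta


def incidence_angle(horizontal_distance, camera_height, point_height=0.0):
    """Angle between the downward optical axis and the ray to a scene point."""
    return math.atan2(horizontal_distance, camera_height - point_height)


f = 300.0            # hypothetical focal length in pixels
camera_height = 3.0  # hypothetical ceiling height in metres

for d in (0.0, 3.0, 10.0, 30.0):
    theta = incidence_angle(d, camera_height)
    print(
        f"distance {d:5.1f} m: theta = {math.degrees(theta):5.1f} deg, "
        f"central r = {radius_central(theta, f):8.1f} px, "
        f"fisheye r = {radius_equidistant(theta, f):6.1f} px"
    )
```

Under the central model the image radius grows without bound as the ray approaches 90 degrees to the optical axis, so a finite sensor covers only a narrow cone below the camera. Under the equidistant model the radius stays bounded, which is why a single nadir-looking fisheye camera can observe a person from the top when they are directly below and from the side when they are farther away.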
Detailed information regarding our work can be found in our publications.
Blott, G.; Heipke, C. (2017): Bifocal Stereo for Multipath Person Re-Identification. In: International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W8, November 2017, pp. 37-44.
Blott, G.; Takami, M.; Heipke, C. (2018): Semantic Segmentation of Fisheye Images. In: Leal-Taixé L., Roth S. (Eds.): Computer Vision – ECCV 2018 Workshops Part I – 6th Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving, München, LNCS 11129, Springer, pp. 181-196.
Blott, G.; Yu, J.; Heipke, C. (2018): View-Aware Person Re-Identification. In: Brox T., Bruhn A. (Eds.): Pattern Recognition – 40th German Conference GCPR Stuttgart, LNCS 11269, Springer, pp. 46-59.