Ordered or orderless: A revisit for video based person re-identification
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020 • ieeexplore.ieee.org
Is a recurrent network really necessary for learning a good visual representation for video-based person re-identification (VPRe-id)? In this paper, we first show that the common practice of employing recurrent neural networks (RNNs) to aggregate temporal-spatial features may not be optimal. Specifically, with a diagnostic analysis, we show that the recurrent structure may not learn temporal dependencies as effectively as expected and implicitly yields an orderless representation. Based on this observation, we then present a simple yet surprisingly powerful approach for VPRe-id, in which we treat VPRe-id as an efficient orderless ensemble of image-based person re-identification problems. More specifically, we divide videos into individual images and re-identify a person with an ensemble of image-based rankers. Under the i.i.d. assumption, we provide an error bound that sheds light on how VPRe-id can be improved. Our work also presents a promising way to bridge the gap between video-based and image-based person re-identification. Comprehensive experimental evaluations demonstrate that the proposed solution achieves state-of-the-art performance on multiple widely used datasets (iLIDS-VID, PRID 2011, and MARS).
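The idea of splitting a video into frames and combining image-based rankers can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`frame_scores`, `rank_video`), the Euclidean distance, the mean-distance score, and the averaging rule used to combine rankers are all assumptions chosen for illustration, and the 2-D "features" are toy values rather than learned representations.

```python
import math
from statistics import mean


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def frame_scores(probe_frame, gallery):
    """One image-based ranker (hypothetical): score every gallery identity
    for a single probe frame. The score is the mean distance to that
    identity's frames, so lower means a better match (assumed rule)."""
    return {
        identity: mean(euclidean(probe_frame, g) for g in frames)
        for identity, frames in gallery.items()
    }


def rank_video(probe_frames, gallery):
    """Orderless ensemble (assumed form): average the per-frame scores,
    then sort identities from best to worst match. Averaging ignores
    frame order, so any permutation of the frames gives the same result."""
    per_frame = [frame_scores(f, gallery) for f in probe_frames]
    ensemble = {
        identity: mean(s[identity] for s in per_frame) for identity in gallery
    }
    return sorted(ensemble, key=ensemble.get)


# Toy data: two gallery identities and one probe video of three frames.
gallery = {
    "person_a": [[0.0, 0.0], [0.2, 0.0]],
    "person_b": [[5.0, 5.0], [5.0, 5.2]],
}
probe = [[0.1, 0.1], [0.0, 0.2], [0.3, 0.1]]

ranking = rank_video(probe, gallery)
shuffled_ranking = rank_video(list(reversed(probe)), gallery)
```

Because the combination step is a plain average, reversing or shuffling the probe frames leaves the ranking unchanged, which is the sense in which such an ensemble is orderless.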