Abstract
Recent advances in biometrics, computer vision, and natural language processing have opened opportunities for person retrieval from surveillance videos using a textual query. The prime objective of a surveillance system is to locate a person from a description, e.g., "a short woman with a pink t-shirt and white skirt carrying a black purse. She has brown hair." Such a description contains attributes like gender, height, type of clothing, colour of clothing, hair colour, and accessories. Such attributes are formally known as soft biometrics. Because a textual query contains the person’s soft biometric attributes, they help bridge the semantic gap between a human description and a machine. Manually searching through huge volumes of surveillance footage to retrieve a specific person is also not feasible. Hence, automatic person retrieval using vision and language-based algorithms is becoming popular. In comparison to other state-of-the-art reviews, the contributions of this paper are as follows: 1. It recommends the most discriminative soft biometrics for specific challenging conditions. 2. It integrates benchmark datasets and retrieval methods for objective performance evaluation. 3. It provides a complete snapshot of techniques based on features, classifiers, number of soft biometric attributes, type of deep neural network, and performance measures. 4. It offers comprehensive coverage of person retrieval, from handcrafted feature-based methods to end-to-end approaches based on natural language description.
Acknowledgements
The Board of Research in Nuclear Sciences (BRNS), Government of India (36(3)/14/20/2016-BRNS/36020) supports this work. The authors acknowledge the support of NVIDIA Corporation for the donation of the Quadro K5200 GPU used for this research. The authors are thankful to Ahmedabad University, India, for access to resources such as GPUs. The authors also thank the active researchers in the vision and language domain for creating challenging, publicly available datasets.
Cite this article
Galiyawala, H., Raval, M.S. Person retrieval in surveillance using textual query: a review. Multimed Tools Appl 80, 27343–27383 (2021). https://doi.org/10.1007/s11042-021-10983-0
DOI: https://doi.org/10.1007/s11042-021-10983-0