Abstract
Fashion is a multibillion-dollar industry with broad social and cultural reach. Social networks have made large amounts of fashion data available online, which has drawn growing research attention to the area in recent years. This paper proposes an end-to-end framework for building an AI fashioner that diagnoses clothing compatibility and generates recommendations to improve it. First, fashion compatibility reviews are analyzed, and the incompatible clothing items in each outfit combination are identified. Next, the clothing items that make up each outfit combination are segmented using Mask R-CNN. The incompatible clothing items are then removed from the outfit combination, and the most similar outfit combinations are identified among the compatible clothing items. In addition, an attribute detection network is developed to extract the attributes of same-category items in the detected compatible outfits. Finally, recommendation sentences are generated from the detected attributes, and encoder-decoder models are used to train a deep network that generates recommendations from clothing images. Extensive experiments on existing datasets demonstrate the effectiveness of the proposed method.
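The retrieval step summarized above (drop the incompatible item, then look for the most similar compatible outfit) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each clothing item has already been reduced to a feature vector, uses mean cosine similarity over shared categories as a stand-in similarity measure, and all function names, category labels, and vectors are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def outfit_similarity(query_items, candidate_items):
    """Mean cosine similarity over the categories present in both outfits."""
    shared = query_items.keys() & candidate_items.keys()
    if not shared:
        return 0.0
    total = sum(cosine(query_items[c], candidate_items[c]) for c in shared)
    return total / len(shared)


def recommend_replacement(outfit, incompatible_category, compatible_outfits):
    """Return the same-category item from the most similar compatible outfit.

    `outfit` and every entry of `compatible_outfits` map a category name
    to a feature vector (hypothetical embeddings, assumed precomputed).
    """
    # Remove the item that was diagnosed as incompatible.
    kept = {c: v for c, v in outfit.items() if c != incompatible_category}
    # Only outfits that contain an item of the removed category can help.
    candidates = [o for o in compatible_outfits if incompatible_category in o]
    if not candidates:
        return None
    best = max(candidates, key=lambda o: outfit_similarity(kept, o))
    return best[incompatible_category]


if __name__ == "__main__":
    outfit = {"top": [1.0, 0.0], "bottom": [0.0, 1.0], "shoes": [1.0, 1.0]}
    pool = [
        {"top": [1.0, 0.1], "bottom": [0.0, 1.0], "shoes": [0.2, 0.9]},
        {"top": [0.0, 1.0], "bottom": [1.0, 0.0], "shoes": [0.9, 0.1]},
    ]
    print(recommend_replacement(outfit, "shoes", pool))
```

In this toy example the first pooled outfit matches the remaining top and bottom far more closely, so its shoes are returned as the suggested replacement. In the framework described above, the attributes of that replacement item would then feed the recommendation sentence.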
Data availability
Datasets are available in public repositories. The ModAI dataset is openly available at https://doi.org/10.1016/j.eswa.2022.119305. The Polyvore-T dataset is openly available at https://doi.org/10.1145/3343031.3350909.
References
Anderson P, He X, Buehler C, Teney D, Johnson M, Gould S, Zhang L (2017) Bottom-up and top-down attention for image captioning and visual question answering. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 6077–6086
Balim C, Özkan K (2021) Ürün görsellerini kullanarak e-ticaret sistemleri için ürün başlığı oluşturulması [Generating product titles for e-commerce systems using product images]. Int J 3D Print Technol Dig Ind 5:614–624. https://doi.org/10.46519/ij3dptdi.991789
Balim C, Özkan K (2023) Diagnosing fashion outfit compatibility with deep learning techniques. Expert Syst Appl 215:119305. https://doi.org/10.1016/j.eswa.2022.119305
Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, pp 65–72
Chen L, He Y (2018) Dress fashionably: learn fashion collocation with deep mixed-category metric learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 32, no. 1
Chen X, Chen H, Xu H, Zhang Y, Cao Y, Qin Z, Zha H (2019a) Personalized fashion recommendation with visual explanations based on multimodal attention network: towards visually explainable recommendation. In: Proceedings of the 42nd International ACM SIGIR conference on research and development in information retrieval, pp 765–774. Association for Computing Machinery, New York. https://doi.org/10.1145/3331184.3331254
Chen W, Huang P, Xu J, Guo X, Guo C, Sun F, Li C, Pfadler A, Zhao H, Zhao B (2019b) POG: personalized outfit generation for fashion recommendation at Alibaba iFashion. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 2662–2670
Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078 [cs, stat]
FashionVLP (2022) Vision language transformer for fashion retrieval with feedback, https://www.amazon.science/publications/fashionvlp-vision-language-transformer-for-fashion-retrieval-with-feedback. Accessed 8 Aug 2022
Han X, Wu Z, Jiang Y-G, Davis LS (2017) Learning fashion compatibility with bidirectional LSTMs. In: Proceedings of the 2017 ACM multimedia conference (MM 2017), pp 1078–1086. https://doi.org/10.1145/3123266.3123394
Han X (2022) Prototype-guided Attribute-wise Interpretable Scheme for Clothing Matching. In: Proceedings of the 42nd International ACM SIGIR conference on research and development in information retrieval. https://doi.org/10.1145/3331184.3331245. Accessed 7 Aug 2022
He R, Packer C, McAuley J (2016) Learning compatibility across categories for heterogeneous item recommendation. In: Proceedings—IEEE international conference on data mining, ICDM, pp 937–942
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969
Herdade S, Kappeler A, Boakye K, Soares J (2019) Image captioning: transforming objects into words. Advances in neural information processing systems 32
Ji Y-H, Jun H, Kim I, Kim J, Kim Y, Ko B, Kook H-K, Lee J, Lee S, Park S (2020) An effective pipeline for a real-world clothes retrieval system. arXiv:2005.12739 [cs]
Kaicheng P, Xingxing Z, Wong WK (2021) Modeling fashion compatibility with explanation by using bidirectional LSTM. In: 2021 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), pp 3889–3893. https://doi.org/10.1109/CVPRW53098.2021.00432
Kang Z, Pan H, Hoi SCH, Xu Z (2020a) Robust graph learning from noisy data. IEEE Trans Cybern 50:1833–1843. https://doi.org/10.1109/TCYB.2018.2887094
Kang Z, Lu X, Liang J, Bai K, Xu Z (2020b) Relation-guided representation learning. arXiv:2007.05742 [cs, stat]
Kavitha K, Kumar SL, Pravalika P, Sruthi K, Lalitha RVS, Rao NVK (2020) Fashion compatibility using convolutional neural networks. Mater Today: Proc. https://doi.org/10.1016/j.matpr.2020.09.365
Li Y, Cao L, Zhu J, Luo J (2016) Mining fashion outfit composition using an end-to-end deep learning approach on set data. IEEE Trans Multimedia 19:1946–1955. https://doi.org/10.1109/TMM.2017.2690144
Li X, Ye Z, Zhang Z, Zhao M (2021) Clothes image caption generation with attribute detection and visual attention model. Pattern Recogn Lett 141:68–74. https://doi.org/10.1016/j.patrec.2020.12.001
Li K, Liu C, Kumar R, Forsyth D (2019) Using discriminative methods to learn fashion compatibility across datasets. J Environ Sci (China) (English Ed)
Lin CY (2004) Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out, pp 74–81
Lin Y, Ren P, Chen Z, Ren Z, Ma J, de Rijke M (2020) Explainable outfit recommendation with joint outfit matching and comment generation. IEEE Trans Knowl Data Eng 32:1502–1516. https://doi.org/10.1109/TKDE.2019.2906190
Liu Z, Luo P, Qiu S, Wang X, Tang X (2016) DeepFashion: powering robust clothes recognition and retrieval with rich annotations. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR)
Lu S, Zhu X, Wu Y, Wan X, Gao F (2021) Outfit compatibility prediction with multi-layered feature fusion network. Pattern Recogn Lett 147:150–156. https://doi.org/10.1016/j.patrec.2021.04.009
McAuley J, Targett C, Shi Q, van den Hengel A (2015) Image-based recommendations on styles and substitutes. In: Proceedings of the 38th International ACM SIGIR conference on research and development in information retrieval (SIGIR 2015), pp 43–52
Mo D, Zou X, Wong W (2022) Neural stylist: towards online styling service. Expert Syst Appl 203:117333. https://doi.org/10.1016/j.eswa.2022.117333
Papineni K, Roukos S, Ward T, Zhu W-J (2001) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp 311–318. https://doi.org/10.3115/1073083.1073135
Park YJ, Jo BC, Lee KU, Kim KS (2022) Improved transformer model for multimodal fashion recommendation conversation system. J Korea Contents Assoc 22:138–147. https://doi.org/10.5392/JKCA.2022.22.01.138
Qu W (2022) Visual and textual jointly enhanced interpretable fashion recommendation. IEEE Xplore. https://ieeexplore.ieee.org/document/9046774. Accessed 7 Aug 2022
Ren S, He K, Girshick R, Sun J (2016) Faster R-CNN: towards real-time object detection with region proposal networks. arXiv:1506.01497 [cs]
Sidnev A, Krapivin A, Trushkov A, Krasikova E, Kazakov M, Viryasov M (2021) DeepMark++: real-time clothing detection at the edge. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision
Song X, Feng F, Liu J, Li Z, Nie L, Ma J (2017) NeuroStylist: neural compatibility modeling for clothing matching. In: Proceedings of the 2017 ACM multimedia conference (MM 2017). https://doi.org/10.1145/3123266.3123314
Sun GL, He JY, Wu X, Zhao B, Peng Q (2020a) Learning fashion compatibility across categories with deep multimodal neural networks. Neurocomputing 395:237–246. https://doi.org/10.1016/j.neucom.2018.06.098
Sun P, Wu L, Zhang K, Fu Y, Hong R, Wang M (2020b) Dual learning for explainable recommendation: towards unifying user preference prediction and review generation. In: Proceedings of the web conference 2020, pp 837–847. Association for Computing Machinery, New York
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Advances in neural information processing systems 27
Tangseng P, Okatani T (2019) Toward explainable fashion recommendation. arXiv. https://doi.org/10.48550/arXiv.1901.04870
Vasileva MI, Plummer BA, Dusad K, Rajpal S, Kumar R, Forsyth D (2018) Learning type-aware embeddings for fashion compatibility. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 11220 LNCS, pp 405–421
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5999–6009. Neural information processing systems foundation
Vedantam R, Zitnick CL, Parikh D (2014) CIDEr: consensus-based image description evaluation. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 4566–4575
Veit A, Kovacs B, Bell S, McAuley J, Bala K, Belongie S (2015) Learning visual clothing style with heterogeneous dyadic co-occurrences. In: Proceedings of the IEEE international conference on computer vision, pp 4642–4650
Wang X, Wu B, Ye Y, Zhong Y (2019) Outfit compatibility prediction and diagnosis with multi-layered comparison network. In: Proceedings of the 27th ACM international conference on multimedia (MM 2019), pp 329–337. https://doi.org/10.1145/3343031.3350909
Xu K, Ba JL, Kiros R, Cho K, Courville A, Salakhutdinov R, Zemel RS, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: 32nd international conference on machine learning, ICML, pp 2048–2057. International Machine Learning Society (IMLS)
Yang X, Zhang H, Jin D, Liu Y, Wu C-H, Tan J, Xie D, Wang J, Wang X (2020) Fashion captioning: towards generating accurate descriptions with semantic rewards. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). 12358 LNCS, pp 1–17
Yang X, Song X, Feng F, Wen H, Duan L-Y, Nie L (2021) Attribute-wise Explainable Fashion Compatibility Modeling. ACM Trans Multimedia Comput Commun Appl 17:361–3621. https://doi.org/10.1145/3425636
Zhang H, Sun Y, Liu L, Wang X, Li L, Liu W (2020) ClothingOut: a category-supervised GAN model for clothing segmentation and retrieval. Neural Comput Appl 32:4519–4530. https://doi.org/10.1007/s00521-018-3691-y
Zheng S, Yang F, Kiapour M, Piramuthu R (2018) ModaNet: a large-scale street fashion dataset with polygon annotations. In: Proceedings of the 26th ACM international conference on multimedia. https://doi.org/10.1145/3240508.3240652
Author information
Contributions
CB: data curation, visualization, investigation, methodology, software, validation, writing (original draft). KÖ: supervision, conceptualization, investigation, writing (review and editing).
Ethics declarations
Conflict of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Balim, C., Ozkan, K. Creating an AI fashioner through deep learning and computer vision. Evolving Systems 15, 717–729 (2024). https://doi.org/10.1007/s12530-023-09498-w
DOI: https://doi.org/10.1007/s12530-023-09498-w