Creating an AI fashioner through deep learning and computer vision

  • Original Paper
Evolving Systems

Abstract

Fashion is a multibillion-dollar industry with broad social and cultural reach. Social networks have made large amounts of fashion data available online, and researchers have recently turned their attention to this area. This paper proposes an end-to-end framework for building an AI fashioner that diagnoses clothing compatibility and generates recommendations to improve it. First, fashion compatibility reviews are analyzed, and the incompatible clothing items in each outfit combination are identified. Next, the clothing items that make up the outfit combination are separated using Mask R-CNN. Then, the incompatible items are removed from the outfit combination, and the most similar outfit combinations are identified among the compatible clothing items. In addition, an attribute detection network is developed to extract the attributes of same-category items in the retrieved compatible outfits. Finally, recommendation sentences are generated from the detected attributes, and encoder-decoder models are used to train a deep network that generates recommendations from clothing images. Extensive experiments on existing datasets demonstrate the effectiveness of the proposed method.
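
The retrieval and recommendation steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper relies on Mask R-CNN, a learned attribute detection network, and an encoder-decoder model, whereas this example assumes hand-made two-dimensional embeddings, cosine similarity as the similarity measure, and a fixed sentence template. The names `cosine`, `outfit_vector`, and `recommend`, and the dictionary layout of an outfit, are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def outfit_vector(outfit, skip_category=None):
    """Average the item embeddings of an outfit, leaving one category out."""
    vecs = [item["embedding"] for cat, item in outfit.items() if cat != skip_category]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def recommend(query, bad_category, compatible_outfits):
    """Drop the incompatible item, find the most similar compatible outfit,
    and build a sentence from the attributes of its same-category item."""
    q = outfit_vector(query, skip_category=bad_category)
    candidates = [o for o in compatible_outfits if bad_category in o]
    best = max(
        candidates,
        key=lambda o: cosine(q, outfit_vector(o, skip_category=bad_category)),
    )
    attrs = best[bad_category]["attributes"]  # assumed output of an attribute detector
    return f"Consider a {' '.join(attrs)} {bad_category} instead."


# Toy data: embeddings and attributes are invented for this example.
query = {
    "top": {"embedding": [1.0, 0.0]},
    "bottom": {"embedding": [0.0, 1.0]},
    "jacket": {"embedding": [0.9, 0.1]},  # assumed to be flagged as incompatible
}
library = [
    {
        "top": {"embedding": [0.9, 0.1]},
        "bottom": {"embedding": [0.1, 0.9]},
        "jacket": {"embedding": [0.5, 0.5], "attributes": ["white", "leather"]},
    },
    {
        "top": {"embedding": [1.0, 0.0]},
        "bottom": {"embedding": [1.0, 0.0]},
        "jacket": {"embedding": [0.2, 0.8], "attributes": ["black", "suede"]},
    },
]

print(recommend(query, "jacket", library))
```

In this toy setting the first library outfit is the closer match once the jacket is left out, so the sentence is built from its jacket attributes. A learned sentence generator, as described in the abstract, would replace the fixed template.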

Data availability

Datasets are available in public repositories: the ModAI dataset can be accessed at https://doi.org/10.1016/j.eswa.2022.119305, and the Polyvore-T dataset is openly available at https://doi.org/10.1145/3343031.3350909.

Author information

Contributions

CB: data curation, visualization, investigation, methodology, software, validation, writing (original draft). KÖ: supervision, conceptualization, investigation, writing (review and editing).

Corresponding author

Correspondence to Caner Balim.

Ethics declarations

Conflict of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Balim, C., Ozkan, K. Creating an AI fashioner through deep learning and computer vision. Evolving Systems 15, 717–729 (2024). https://doi.org/10.1007/s12530-023-09498-w

  • DOI: https://doi.org/10.1007/s12530-023-09498-w
