A Multi-Column Deep Framework for Recognizing Artistic Media
"> Figure 1
<p>The frequency of artistic media from <span class="html-italic">WikiArt</span>.</p> "> Figure 2
<p>Patches containing stroke textures for pencil and watercolor artwork (<b>a</b>) Input artwork image; (<b>b</b>) patches containing stroke textures; (<b>c</b>) input image resized to patch scale.</p> "> Figure 3
<p>Three datasets for our approach.</p> "> Figure 4
<p>The process of our approach.</p> "> Figure 5
<p>Our structure: (<b>a</b>) A single recognizing module and (<b>b</b>) an overall multi-column structure.</p> "> Figure 6
<p>The Gram matrices extracted from DenseNet-161.</p> "> Figure 7
<p>The process of decision making in a recognizing module.</p> "> Figure 8
<p>The confusion matrices from <span class="html-italic">WikiSet</span>, <span class="html-italic">YMSet</span> and <span class="html-italic">SynthSet</span>.</p> "> Figure 9
<p>Comparison of F1 scores from the four recognizer models on <span class="html-italic">YMSet</span> (<b>a</b>) and <span class="html-italic">SynthSet</span> (<b>b</b>).</p> "> Figure 10
<p>Comparison of F1 scores from three datasets on our recognizer.</p> "> Figure 11
<p>The decreasing order of recall values of the confusion matrices in <a href="#electronics-08-01277-f008" class="html-fig">Figure 8</a> for <span class="html-italic">YMSets</span> and <span class="html-italic">SynthSet</span>.</p> "> Figure 12
<p>The most confusing pairs and least confusing pairs between <span class="html-italic">YMSet</span> (<b>a</b>) and <span class="html-italic">SynthSet</span> (<b>b</b>).</p> "> Figure 13
<p>The process of evaluation guideline for synthesized artwork images. A technique that generates synthesized artwork images by mimicking artistic media is evaluated to be successful, if our recognizer can recognize the target media from their result images.</p> "> Figure A1
<p>Oil paint images in <span class="html-italic">SynthSet</span>.</p> "> Figure A2
<p>Pastel, watercolor and pencil images in <span class="html-italic">SynthSet</span>.</p> "> Figure A3
<p>Pencil images in <span class="html-italic">SynthSet</span>.</p> ">
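Figure 6 shows Gram matrices extracted from DenseNet-161. For readers who want the mechanics: the Gram matrix of a feature map with C channels is the C×C matrix of channel-wise inner products. The sketch below is not the authors' implementation; it assumes PyTorch and the pretrained torchvision DenseNet-161, and shows one common way to compute a Gram matrix from the final convolutional features:

```python
import torch
from torchvision import models

# Load a pretrained DenseNet-161 and keep only its convolutional trunk.
backbone = models.densenet161(pretrained=True).features.eval()

def gram_matrix(feature_map: torch.Tensor) -> torch.Tensor:
    """Channel-wise Gram matrix of a (1, C, H, W) feature map."""
    _, c, h, w = feature_map.shape
    f = feature_map.view(c, h * w)      # flatten spatial dimensions
    return (f @ f.t()) / (c * h * w)    # normalize by map size

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed artwork patch
    features = backbone(image)           # (1, 2208, 7, 7) for DenseNet-161
    gram = gram_matrix(features)
    print(gram.shape)                    # torch.Size([2208, 2208])
```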
Abstract
1. Introduction
2. Related Work
2.1. Schemes Using Handcrafted Features
2.2. Schemes Using CNN-Based Features
2.3. Patch-Based Schemes
3. Building Our Recognizer for Artistic Media
3.1. A Strategy for Our Recognizer
3.2. Structure of Our Recognizer
3.3. Estimating Gram Matrix for Our Recognizer
4. Implementation and Results
4.1. Implementation
4.2. Data Collection
4.3. Training and Results
5. Experiment and Analysis
5.1. Comparison
5.1.1. Comparison with the Existing Models
5.1.2. Comparison with the Datasets
5.2. Analysis
5.2.1. Why Does YMSet Show the Best Performance?
5.2.2. The Similarity of the Recognition Pattern for YMSet and SynthSet
- The recall value of each medium is the corresponding diagonal entry of the confusion matrix. The decreasing orders of recall values for YMSet and SynthSet are illustrated in Figure 11, which shows that the two orders match.
- The most confusing pair of a medium is the medium whose entry is the largest off-diagonal entry in that medium's row of the confusion matrix. For example, the most confusing pair of oil in YMSet is pastel, since the pastel entry (0.06) is the largest entry in the oil row apart from the diagonal. The most confusing pairs for each medium are compared in the left column of Figure 12, where the pairs for oil and pencil coincide for YMSet and SynthSet. The reason why the most confusing pairs for pastel and watercolor do not coincide is discussed in Section 5.3.
- The least confusing pair of a medium is the medium whose entry is the smallest off-diagonal entry in that medium's row of the confusion matrix. For example, the least confusing pair of oil in YMSet is pencil, since the pencil entry (0.0) is the smallest entry in the oil row apart from the diagonal. The least confusing pairs for each medium are compared in the right column of Figure 12, where the pair for oil coincides for YMSet and SynthSet, and the least confusing entry for pencil is below 0.01 for both. The reason why the least confusing pairs for pastel and watercolor do not coincide is discussed in Section 5.3. A minimal sketch of how these quantities are read from a confusion matrix follows this list.
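The quantities above can be read off a row-normalized confusion matrix mechanically. The following is a minimal sketch (not the authors' code); only the oil row's 0.06 (pastel) and 0.00 (pencil) entries come from the text above, and the remaining matrix values are made up for illustration:

```python
import numpy as np

media = ["oil", "pastel", "pencil", "watercolor"]

# Row-normalized confusion matrix (rows: true medium, columns: predicted
# medium). Entries other than the oil row's 0.06 and 0.00 are illustrative.
cm = np.array([
    [0.86, 0.06, 0.00, 0.08],
    [0.07, 0.84, 0.04, 0.05],
    [0.01, 0.03, 0.91, 0.05],
    [0.02, 0.03, 0.01, 0.94],
])

# Recall of each medium is its diagonal entry.
recall = np.diag(cm)
print("decreasing recall order:",
      [media[i] for i in np.argsort(-recall)])

for i, m in enumerate(media):
    row = cm[i].copy()
    row[i] = -np.inf                    # exclude the diagonal entry
    most = media[int(np.argmax(row))]   # largest off-diagonal entry
    row[i] = np.inf
    least = media[int(np.argmin(row))]  # smallest off-diagonal entry
    print(f"{m}: most confusing = {most}, least confusing = {least}")
```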
5.2.3. The Evaluation Guideline for Synthesized Artwork Images
5.3. Limitation
6. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
Appendix A
References
| Dataset | Media | No. of Images | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| WikiSet | Oil | 1000 | 0.85 | 0.68 | 0.85 | 0.75 |
| WikiSet | Pastel | 1000 | 0.82 | 0.86 | 0.79 | 0.82 |
| WikiSet | Pencil | 755 | 0.84 | 0.90 | 0.77 | 0.83 |
| WikiSet | Watercolor | 1000 | 0.88 | 0.78 | 0.74 | 0.76 |
| WikiSet | Total | 3755 | 0.85 | 0.80 | 0.79 | 0.79 |
| YMSet | Oil | 914 | 0.94 | 0.86 | 0.86 | 0.86 |
| YMSet | Pastel | 1014 | 0.90 | 0.83 | 0.84 | 0.84 |
| YMSet | Pencil | 1248 | 0.93 | 0.95 | 0.91 | 0.93 |
| YMSet | Watercolor | 960 | 0.96 | 0.90 | 0.94 | 0.92 |
| YMSet | Total | 4136 | 0.93 | 0.89 | 0.89 | 0.89 |
| SynthSet | Oil | 178 | 0.86 | 0.93 | 0.70 | 0.80 |
| SynthSet | Pastel | 25 | 0.77 | 0.34 | 0.42 | 0.38 |
| SynthSet | Pencil | 183 | 0.83 | 0.84 | 0.91 | 0.88 |
| SynthSet | Watercolor | 35 | 0.92 | 0.51 | 0.77 | 0.61 |
| SynthSet | Total | 421 | 0.85 | 0.66 | 0.70 | 0.67 |
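For reference, the per-medium scores above follow the usual one-vs-rest definitions over the confusion matrix. The snippet below is a minimal sketch of those definitions; the raw counts are hypothetical and not the paper's data:

```python
import numpy as np

media = ["oil", "pastel", "pencil", "watercolor"]

# Hypothetical raw confusion counts (rows: true medium, columns: predicted
# medium); shape-compatible with the paper's setup but not its data.
counts = np.array([
    [790, 55, 10, 59],
    [70, 850, 40, 54],
    [15, 35, 1140, 58],
    [20, 30, 10, 900],
])

tp = np.diag(counts)
precision = tp / counts.sum(axis=0)  # column sums: all predicted as the class
recall = tp / counts.sum(axis=1)     # row sums: all true members of the class
f1 = 2 * precision * recall / (precision + recall)

for m, p, r, f in zip(media, precision, recall, f1):
    print(f"{m}: precision={p:.2f}, recall={r:.2f}, F1={f:.2f}")
```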
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).