Assessing the Robustness of Cluster Solutions in Emotionally-Annotated Pictures Using Monte-Carlo Simulation Stabilized K-Means Algorithm
Figure 1. The circumplex model of emotion as described in [17]. Valence (val) is plotted on the x-axis and arousal (ar) on the y-axis. Red points mark pictures from the experimental dataset listed in Table 1. Approximate (val, ar) coordinates of basic emotions in the dimensional emotion model space Ω_Emo are indicated.
Figure 2. Example pictures from the NAPS dataset. Reproduced with permission from Marchewka, A.; Żurawski, Ł.; Jednorog, K.; Grabowska, A. The Nencki Affective Picture System (NAPS): Introduction to a novel, standardized, wide-range, high-quality, realistic picture database (2014), Springer.
Figure 3. Examples of unstable distribution indexes and cluster order permutations for four possible distribution indexes.
Figure 4. An example of the undecidability of the distribution. Black dots represent the unstable cluster affiliation of pictures in the feature space (valence, arousal).
Figure 5. Estimation of variation by the number of distributions using the elbow method.
Figure 6. Estimation of variation by the number of distributions using the silhouette method.
Figure 7. Overall stability error of the distribution method with respect to the number of simulation iterations.
Figure 8. Overall stability error of the distribution method with respect to the number of clusters.
Figure A1. UML class diagram showing the software tool’s five functional class modules (Analysis, Runner, InputData, Config, PlotAnnotator), their attributes, operations and mutual relationships.
Figure A2. The clustering procedure using Monte-Carlo simulation stabilized k-means implemented in the Python software tool. UML activity diagrams illustrating the functions StableColoredKMeans (left) and MonteCarloKMeans (right).
Abstract
1. Introduction
2. Affective Multimedia Databases
2.1. Models of Affect in Affective Multimedia Databases
2.2. The NAPS Affective Picture Database
3. Related Work
4. Unsupervised Machine Learning Methods
4.1. k-Means Algorithm
4.2. Disadvantages of the k-Means Algorithm and the Solutions Used
4.2.1. Unstable Cluster Indexes
4.2.2. Statistical Distribution Undecidability
4.3. Defining the Optimal Number of Clusters (Parameter k)
5. Experiment and Results
5.1. The Optimal Number of Clusters
5.2. Reliability of the Stable Distribution Method
- Calculate the histogram, i.e., the n × k matrix of cluster affiliations accumulated over s simulations.
- Reset to zero all matrix elements that are equal to s, because the corresponding points are stable.
- For each row (example) in the matrix, count the nonzero columns.
- Subtract 1 from the count of each row that has nonzero columns (one column is considered correct).
- The total error e is then the sum of the resulting values over all rows.
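The steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `stability_error` and the input layout (an s × n array of cluster indexes, one row per simulation, with cluster indexes already made consistent across runs) are hypothetical choices made for the example.

```python
import numpy as np


def stability_error(labels, k):
    """Total stability error e for s simulation runs over n examples.

    labels: array-like of shape (s, n); labels[i][j] is the cluster index
            (0..k-1) assigned to example j in simulation i. The layout and
            the function name are assumptions made for this sketch.
    k:      number of clusters.
    """
    labels = np.asarray(labels)
    s, n = labels.shape

    # Step 1: histogram (n x k) of cluster affiliation over s simulations.
    hist = np.zeros((n, k), dtype=int)
    for run in labels:
        hist[np.arange(n), run] += 1

    # Step 2: entries equal to s belong to stable points; reset them to zero.
    hist[hist == s] = 0

    # Step 3: for each row (example), count the nonzero columns.
    nonzero = np.count_nonzero(hist, axis=1)

    # Step 4: subtract 1 for rows that have nonzero columns
    # (one column is considered correct); stable rows contribute 0.
    per_example = np.where(nonzero > 0, nonzero - 1, 0)

    # Step 5: the total error e is the sum over all rows.
    return int(per_example.sum())


if __name__ == "__main__":
    # Three simulations, three examples, k = 3 (toy data, not from NAPS).
    runs = [[0, 0, 1],
            [0, 1, 1],
            [0, 2, 1]]
    print(stability_error(runs, k=3))
```

In the toy data only the second example changes cluster between runs, so it is the only row that contributes to e.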
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Appendix A
- (1) Analysis: the main program that runs the selected computation (snippet) and produces a graph or textual output. These outputs were directly used for analysis and are included in the paper as figures or tables.
- (2) Runner: a class with all the computation and plotting logic at the higher abstraction level, e.g., for computing stable argmax partitions, plotting stability error curves, and computing silhouette scores.

Lib: implements the lower-level library functions and abstractions and contains the following classes:

- (3) InputData: an abstraction for data input and output for the NAPS or other affective picture datasets with similar architectures;
- (4) Config: a class for configuring the k-means algorithm, the evaluation parameters, and other methods, such as dataset partitioning;
- (5) PlotAnnotator: a class module that provides support for rendering interactive data plots in the tool’s graphical user interface.
References
1. Omran, M.G.H.; Engelbrecht, A.P.; Salman, A. An overview of clustering methods. Intell. Data Anal. 2007, 11, 583–605.
2. Alelyani, S.; Tang, J.; Liu, H. Feature Selection for Clustering: A Review. In Data Clustering: Algorithms and Applications; Aggarwal, C., Reddy, C., Eds.; CRC Press: Boca Raton, FL, USA, 2013.
3. de Amorim, R.C.; Hennig, C. Recovering the number of clusters in data sets with noise features using feature rescaling factors. Inf. Sci. 2015, 324, 126–145.
4. Calvo-Zaragoza, J.; Valero-Mas, J.J.; Rico-Juan, J.R. Prototype generation on structural data using dissimilarity space representation. Neural Comput. Appl. 2017, 28, 2415–2424.
5. Cios, K.J.; Swiniarski, R.W.; Pedrycz, W.; Kurgan, L.A. Unsupervised learning: Clustering. In Data Mining; Springer: Boston, MA, USA, 2007; pp. 257–288.
6. Celebi, M.E.; Aydin, K. (Eds.) Unsupervised Learning Algorithms; Springer: Berlin, Germany, 2016.
7. Kameshwaran, K.; Malarvizhi, K. Survey on clustering techniques in data mining. Int. J. Comput. Sci. Inf. Technol. 2014, 5, 2272–2276.
8. Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
9. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD 1996, 96, 226–231.
10. Sinaga, K.P.; Yang, M.S. Unsupervised K-means clustering algorithm. IEEE Access 2020, 8, 80716–80727.
11. Horvat, M.; Popović, S.; Ćosić, K. Towards semantic and affective coupling in emotionally annotated databases. In Proceedings of the 35th International Convention on Information and Communication Technology, Electronics and Microelectronics MIPRO 2012, Opatija, Croatia, 21–25 May 2012; pp. 1003–1008.
12. Colden, A.; Bruder, M.; Manstead, A.S. Human content in affect-inducing stimuli: A secondary analysis of the international affective picture system. Motiv. Emot. 2008, 32, 260–269.
13. Horvat, M. A Brief Overview of Affective Multimedia Databases. In Central European Conference on Information and Intelligent Systems; Faculty of Organization and Informatics: Varaždin, Croatia, 2017; pp. 3–9.
14. Marchewka, A.; Żurawski, Ł.; Jednorog, K.; Grabowska, A. The Nencki Affective Picture System (NAPS): Introduction to a novel, standardized, wide-range, high-quality, realistic picture database. Behav. Res. Methods 2014, 46, 596–610.
15. Riegel, M.; Żurawski, Ł.; Wierzba, M.; Moslehi, A.; Klocek, Ł.; Horvat, M.; Grabowska, A.; Michałowski, J.; Marchewka, A. Characterization of the Nencki Affective Picture System by discrete emotional categories (NAPS BE). Behav. Res. Methods 2016, 48, 600–612.
16. Peter, C.; Herbon, A. Emotion representation and physiology assignments in digital systems. Interact. Comput. 2006, 18, 139–170.
17. Posner, J.; Russell, J.A.; Peterson, B.S. The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Dev. Psychopathol. 2005, 17, 715.
18. Lang, P.J.; Bradley, M.M.; Cuthbert, B.N. International Affective Picture System (IAPS): Affective Ratings of Pictures and Instruction Manual; Technical Report A-8; University of Florida: Gainesville, FL, USA, 2008.
19. Wierzba, M.; Riegel, M.; Pucz, A.; Leśniewska, Z.; Dragan, W.Ł.; Gola, M.; Jednorog, K.; Marchewka, A. Erotic subset for the Nencki Affective Picture System (NAPS ERO): Cross-sexual comparison study. Front. Psychol. 2015, 6, 1336.
20. Kensinger, E.A.; Schacter, D.L. Processing emotional pictures and words: Effects of valence and arousal. Cogn. Affect. Behav. Neurosci. 2006, 6, 110–126.
21. Horvat, M.; Jednoróg, K.; Marchewka, A. Clustering of Affective Dimensions in Pictures: An exploratory analysis of the NAPS database. In Proceedings of the 39th International Convention on Information and Communication Technology, Electronics and Microelectronics MIPRO 2016, Opatija, Croatia, 30 May–3 June 2016; pp. 1496–1501.
22. Horvat, M.; Popović, S.; Ćosić, K. Multimedia stimuli databases usage patterns: A survey report. In Proceedings of the 36th International Convention on Information and Communication Technology, Electronics and Microelectronics MIPRO 2013, Opatija, Croatia, 20–24 May 2013; pp. 993–997.
23. Constantinescu, A.C.; Wolters, M.; Moore, A.; MacPherson, S.E. A cluster-based approach to selecting representative stimuli from the International Affective Picture System (IAPS) database. Behav. Res. Methods 2017, 49, 896–912.
24. Hamerly, G.; Drake, J. Accelerating Lloyd’s algorithm for k-means clustering. In Partitional Clustering Algorithms; Springer: Cham, Switzerland, 2015; pp. 41–78.
25. Mahajan, M.; Nimbhorkar, P.; Varadarajan, K. The planar k-means problem is NP-hard. Theor. Comput. Sci. 2012, 442, 13–21.
26. Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification, 2nd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2000.
27. Kroese, D.P.; Brereton, T.; Taimre, T.; Botev, Z.I. Why the Monte Carlo method is so important today. Wiley Interdiscip. Rev. Comput. Stat. 2014, 6, 386–392.
28. Cluster Validation Essentials. Available online: https://www.datanovia.com/en/lessons/determining-the-optimal-number-of-clusters-3-must-know-methods/ (accessed on 31 March 2021).
29. Ketchen, D.J.; Shook, C.L. The application of cluster analysis in strategic management research: An analysis and critique. Strateg. Manag. J. 1996, 17, 441–458.
30. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65.
ID | Description | Valence (Avg) | Arousal (Avg) |
---|---|---|---|
Animals_002_v | lion | 6.45 | 6.86 |
Animals_003_h | snake | 5.02 | 5.51 |
Animals_004_v | wolf | 4.54 | 7.10 |
Animals_005_h | bat | 5.57 | 5.73 |
Faces_001_h | children with a dog | 7.80 | 4.97 |
Faces_242_h | man and woman smiling | 6.66 | 3.76 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Horvat, M.; Jović, A.; Burnik, K. Assessing the Robustness of Cluster Solutions in Emotionally-Annotated Pictures Using Monte-Carlo Simulation Stabilized K-Means Algorithm. Mach. Learn. Knowl. Extr. 2021, 3, 435-452. https://doi.org/10.3390/make3020022