Disentangling Multiannual Air Quality Profiles Aided by Self-Organizing Map and Positive Matrix Factorization
Figure 1. Scheme of data analysis method.
Figure 2. Distribution of the modeled variables on the SOM. The distribution of the single pollutants (Ben, NO, NO₂, Tol, PM₁₀) on each node is depicted in grayscale, from white (lower concentration values) to black (higher concentration values). In the distance map, the distance between a node and its neighbors is depicted with a scale from green to white: the higher the distance, the greater the prevalence of white shading on the scale.
Figure 3. Clustered two-way HCA map. Each row represents a node, while each column represents the values of the modeled variables retaining the autoscaling operated before SOM analysis; thus, the color scale represents low (dark red) to high (dark blue) values. The six clusters obtained are depicted by rectangles and the assigned cluster number is indicated on the right-hand side of the figure.
Figure 4. (a) Division of SOM nodes into 6 clusters as obtained by HCA; (b) representation of the cluster centroid values by radar plots; (c) distribution of the modeled values for each cluster, as defined by SOM. For this figure, we used the same cluster color code as the one used in Figure 3.
Figure 5. Barplots representing the daily percentage distribution of clusters for site A1. From the top to the bottom of the figure: years from 2018 to 2023. For this figure, we have used the same cluster color code as the one in Figure 4.
Figure 6. On the left: Variability in the % contribution of each species to the respective PMF factor (sum of factors = 100%). The base run is shown as a blue box for reference. On the right: the nodes that made greater contributions to a factor are represented in black, with a greater amount of black shading indicating a more substantial contribution.
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset
2.2. Data Analysis Method
3. Results and Discussion
3.1. Data Cleaning
3.2. SOM Analysis
3.3. Hierarchical Clustering
3.4. Positive-Matrix Factorization
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fornasaro, S.; Astel, A.; Barbieri, P.; Licen, S. Disentangling Multiannual Air Quality Profiles Aided by Self-Organizing Map and Positive Matrix Factorization. Toxics 2025, 13, 137. https://doi.org/10.3390/toxics13020137