Explainability with Association Rule Learning for Weather Forecast

  • Original Research
  • SN Computer Science

Abstract

The reliability of weather forecast models is a complex issue, since it depends on numerous parameters and on the technical infrastructure that supports them. There is therefore a need for work aimed at a better understanding of these models and at the analysis of their main associated parameters. Our approach is to study the applicability of extracted association rules to provide a clearer understanding of atmospheric exchanges. The proposed methodology is based on the discovery of interesting, interpretable relationships between meteorological parameters measured at the Atmospheric Research Center of Lannemezan (south-west France). In the preprocessing step, the proposed method is flexible enough to account for data uncertainties, unlike the majority of classical evaluation methods, which are mainly directed towards the reduction of variables and data redundancy. In postprocessing, the advantage of our approach is that the extracted rules form a metamodel of useful, interpretable knowledge with a clear and concise representation. Moreover, in the processing step, interpretability in data science is a recent concern and still in its infancy. The generated association rules, with their statistical and semantic interpretations, have highlighted the possibilities of explicit analysis of meteorological parameters. This study showed that, among the relevant rules generated, three parameters (temperature, humidity, wind speed) occur with high frequency in the antecedents of the rules, and that the only consequent is rain. This is useful for identifying potential improvements and gaps in existing models of atmospheric observations, in particular for understanding the parameterizations related to the production of rain.
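To make the notion of a rule with meteorological antecedents and rain as its consequent more concrete, the following is a minimal sketch of how the standard measures of support, confidence and lift can be computed for one such rule. The records, the discretized labels (for example `temp=low`) and the function names are hypothetical and chosen only for illustration; they are not the data, the discretization or the algorithm used in the study.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]

# Hypothetical, hand-made records (NOT data from the study). Each record is
# the set of discretized observations for one time step.
RECORDS: List[Itemset] = [
    frozenset({"temp=low", "humidity=high", "wind=high", "rain=yes"}),
    frozenset({"temp=low", "humidity=high", "wind=high", "rain=yes"}),
    frozenset({"temp=low", "humidity=high", "wind=low", "rain=no"}),
    frozenset({"temp=high", "humidity=low", "wind=low", "rain=no"}),
    frozenset({"temp=high", "humidity=high", "wind=high", "rain=yes"}),
    frozenset({"temp=high", "humidity=low", "wind=high", "rain=no"}),
    frozenset({"temp=low", "humidity=high", "wind=high", "rain=no"}),
    frozenset({"temp=high", "humidity=low", "wind=low", "rain=no"}),
]

# Illustrative rule: {temp=low, humidity=high, wind=high} -> {rain=yes}
ANTECEDENT: Itemset = frozenset({"temp=low", "humidity=high", "wind=high"})
CONSEQUENT: Itemset = frozenset({"rain=yes"})


def support(itemset: Itemset, records: List[Itemset]) -> float:
    """Fraction of records that contain every item of `itemset`."""
    if not records:
        raise ValueError("records must not be empty")
    return sum(1 for r in records if itemset <= r) / len(records)


def confidence(antecedent: Itemset, consequent: Itemset,
               records: List[Itemset]) -> float:
    """Estimate of P(consequent | antecedent) from the records."""
    base = support(antecedent, records)
    if base == 0.0:
        raise ValueError("antecedent never occurs in the records")
    return support(antecedent | consequent, records) / base


def lift(antecedent: Itemset, consequent: Itemset,
         records: List[Itemset]) -> float:
    """Confidence divided by the overall frequency of the consequent."""
    base = support(consequent, records)
    if base == 0.0:
        raise ValueError("consequent never occurs in the records")
    return confidence(antecedent, consequent, records) / base


if __name__ == "__main__":
    print(f"support    = {support(ANTECEDENT | CONSEQUENT, RECORDS):.3f}")
    print(f"confidence = {confidence(ANTECEDENT, CONSEQUENT, RECORDS):.3f}")
    print(f"lift       = {lift(ANTECEDENT, CONSEQUENT, RECORDS):.3f}")
```

A lift above 1 indicates that rain is observed more often when the antecedent holds than it is overall, which is one common statistical reading of such a rule. A full miner would enumerate many candidate rules and filter them by thresholds on these measures, rather than scoring a single rule fixed by hand as here.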

Funding

This research is funded by the Malian Ministry of National Education, the Ecole Normale d’Enseignement Technique et Professionnel (ENETP) de l’Université de Bamako in Mali, and the French Embassy in Mali through its Cultural and Cooperation Department (SCAC). The grant ID is N° 954749E, with the reference PRISME 0185MLIB190027.

Author information

Corresponding author

Correspondence to Bernard Kamsu-Foguem.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Coulibaly, L., Kamsu-Foguem, B. & Tangara, F. Explainability with Association Rule Learning for Weather Forecast. SN COMPUT. SCI. 2, 116 (2021). https://doi.org/10.1007/s42979-021-00525-8

  • DOI: https://doi.org/10.1007/s42979-021-00525-8
