On Interpretability and Similarity in Concept-Based Machine Learning

  • Conference paper
Analysis of Images, Social Networks and Texts (AIST 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12602)

Abstract

Machine Learning (ML) provides important techniques for classification and prediction. Most of these are black-box models for users and do not provide decision-makers with an explanation. For the sake of transparency and greater validity of decisions, the need to develop explainable/interpretable ML methods is becoming increasingly important. Certain questions need to be addressed:

  • How does an ML procedure derive the class for a particular entity?

  • Why does a particular clustering emerge from a particular unsupervised ML procedure?

  • What can we do if the number of attributes is very large?

  • What are the possible reasons for the mistakes for concrete cases and models?

For binary attributes, Formal Concept Analysis (FCA) offers techniques in terms of intents of formal concepts, and thus provides plausible reasons for model prediction. However, from the interpretable machine learning viewpoint, we still need to provide decision-makers with the importance of individual attributes to the classification of a particular object, which may facilitate explanations by experts in various domains with high-cost errors like medicine or finance.
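
As an illustration of what an intent is, here is a minimal sketch of the two derivation operators of FCA on a toy binary context. The context `CONTEXT`, its objects and attributes, and the helper names are hypothetical and chosen only for this example; they are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical toy context: each object maps to the binary attributes it has.
CONTEXT: Dict[str, FrozenSet[str]] = {
    "dog": frozenset({"hair", "milk", "legs"}),
    "cat": frozenset({"hair", "milk", "legs"}),
    "eagle": frozenset({"feathers", "eggs", "legs"}),
}
ALL_ATTRIBUTES: FrozenSet[str] = frozenset().union(*CONTEXT.values())


def common_attributes(objects: Iterable[str]) -> FrozenSet[str]:
    """Derivation A': the attributes shared by every object in A."""
    result = set(ALL_ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return frozenset(result)


def common_objects(attributes: FrozenSet[str]) -> FrozenSet[str]:
    """Derivation B': the objects that have every attribute in B."""
    return frozenset(g for g, attrs in CONTEXT.items() if attributes <= attrs)


def closure(attributes: FrozenSet[str]) -> FrozenSet[str]:
    """B'': the intent of the formal concept generated by B."""
    return common_attributes(common_objects(attributes))


# The intent generated by {"hair"} lists everything shared by hairy objects.
hair_intent = closure(frozenset({"hair"}))
```

In this toy context the pair (`common_objects(hair_intent)`, `hair_intent`) is a formal concept, and its intent can be read as a candidate explanation for the objects it covers.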

We discuss how notions from cooperative game theory can be used to assess the contribution of individual attributes in classification and clustering processes in concept-based machine learning. To address the third question, we present some ideas on how to reduce the number of attributes using similarities in large contexts.
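
To make the game-theoretic idea concrete, the following sketch computes exact Shapley values when attributes are treated as players. The value function used here is an assumption made for illustration (a coalition of attributes is worth 1 if it contains some hypothetical positive hypothesis intent, 0 otherwise); the definition used in the paper may differ, and `HYPOTHESES` is invented toy data.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    players: List[str], value: Callable[[FrozenSet[str]], float]
) -> Dict[str, float]:
    """Exact Shapley values by enumerating all coalitions (exponential in n)."""
    n = len(players)
    phi: Dict[str, float] = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Assumption: one hypothetical positive hypothesis (an intent).
HYPOTHESES = [frozenset({"hair", "milk"})]


def value(coalition: FrozenSet[str]) -> float:
    """Toy value function: 1 if the coalition contains some hypothesis."""
    return 1.0 if any(h <= coalition for h in HYPOTHESES) else 0.0


phi = shapley_values(["hair", "milk", "legs"], value)
```

Under this toy value function, the attributes that make up the hypothesis share the credit equally and the irrelevant attribute receives none. Exact enumeration is only feasible for a small number of attributes, which is one reason attribute reduction matters for large contexts.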

Notes

  1. One of the earlier precursors of association rules can also be found in [17] under the name of “almost true implications”.

  2. Similarity between concepts is discussed in [9].

  3. https://github.com/dimachine/Shap4JSM.

  4. \(S\uparrow \) is the up-set of S in the Boolean lattice \((\mathcal {P}\{Ma, Mi, Se, L\},\subseteq )\).

  5. https://archive.ics.uci.edu/ml/datasets/zoo.

  6. https://github.com/dimachine/ShapStab/.

References

  1. Agrawal, R., Imielinski, T., Swami, A.N.: Mining association rules between sets of items in large databases. In: Buneman, P., Jajodia, S. (eds.) Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, 26–28 May 1993, pp. 207–216. ACM Press (1993)

  2. Alves, G., Bhargava, V., Couceiro, M., Napoli, A.: Making ML models fairer through explanations: the case of limeout. CoRR abs/2011.00603 (2020)

  3. Belohlávek, R., Baets, B.D., Konecny, J.: Granularity of attributes in formal concept analysis. Inf. Sci. 260, 149–170 (2014)

  4. Belohlávek, R., Baets, B.D., Outrata, J., Vychodil, V.: Inducing decision trees via concept lattices. Int. J. Gen. Syst. 38(4), 455–467 (2009)

  5. Belohlávek, R., Vychodil, V.: Discovery of optimal factors in binary data via a novel method of matrix decomposition. J. Comput. Syst. Sci. 76(1), 3–20 (2010)

  6. Bocharov, A., Gnatyshak, D., Ignatov, D.I., Mirkin, B.G., Shestakov, A.: A lattice-based consensus clustering algorithm. In: Huchard, M., Kuznetsov, S.O. (eds.) Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications, Moscow, Russia, CEUR Workshop Proceedings, 18–22 July 2016, vol. 1624, pp. 45–56. CEUR-WS.org (2016)

  7. Carpineto, C., Romano, G.: A lattice conceptual clustering system and its application to browsing retrieval. Mach. Learn. 24(2), 95–122 (1996)

  8. Caruana, R., Lundberg, S., Ribeiro, M.T., Nori, H., Jenkins, S.: Intelligible and explainable machine learning: best practices and practical challenges. In: Gupta, R., Liu, Y., Tang, J., Prakash, B.A. (eds.) KDD 2020: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, 23–27 August 2020, pp. 3511–3512. ACM (2020)

  9. Eklund, P.W., Ducrou, J., Dau, F.: Concept similarity and related categories in information retrieval using Formal Concept Analysis. Int. J. Gen. Syst. 41(8), 826–846 (2012)

  10. Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P.: From data mining to knowledge discovery in databases. AI Mag. 17(3), 37–54 (1996)

  11. Finn, V.: On machine-oriented formalization of plausible reasoning in F. Bacon-J.S. Mill Style. Semiotika i Informatika 20, 35–101 (1983). (in Russian)

  12. Ganter, B., Kuznetsov, S.O.: Hypotheses and version spaces. In: Ganter, B., de Moor, A., Lex, W. (eds.) ICCS-ConceptStruct 2003. LNCS (LNAI), vol. 2746, pp. 83–95. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45091-7_6

  13. Ganter, B., Kuznetsov, S.O.: Scale coarsening as feature selection. In: Medina, R., Obiedkov, S. (eds.) ICFCA 2008. LNCS (LNAI), vol. 4933, pp. 217–228. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78137-0_16

  14. Ganter, B., Obiedkov, S.A.: Conceptual Exploration. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49291-8

  15. Ganter, B., Wille, R.: Formal Concept Analysis - Mathematical Foundations. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-642-59830-2

  16. Goodfellow, I.J., Bengio, Y., Courville, A.C.: Deep Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge (2016)

  17. Hájek, P., Havel, I., Chytil, M.: The GUHA method of automatic hypotheses determination. Computing 1(4), 293–308 (1966)

  18. Ignatov, D.I.: Introduction to formal concept analysis and its applications in information retrieval and related fields. In: Braslavski, P., Karpov, N., Worring, M., Volkovich, Y., Ignatov, D.I. (eds.) RuSSIR 2014. CCIS, vol. 505, pp. 42–141. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25485-2_3

  19. Ignatov, D.I., Kuznetsov, S.O., Poelmans, J.: Concept-based biclustering for internet advertisement. In: 12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, Belgium, 10 December 2012, pp. 123–130 (2012)

  20. Ignatov, D.I., Kwuida, L.: Interpretable concept-based classification with Shapley values. In: Alam, M., Braun, T., Yun, B. (eds.) ICCS 2020. LNCS (LNAI), vol. 12277, pp. 90–102. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57855-8_7

  21. Ignatov, D.I., Kwuida, L.: Shapley and Banzhaf vectors of a formal concept. In: Valverde-Albacete, F.J., Trnecka, M. (eds.) Proceedings of the Fifteenth International Conference on Concept Lattices and Their Applications, Tallinn, Estonia, CEUR Workshop Proceedings, June 29–July 1, 2020, vol. 2668, pp. 259–271. CEUR-WS.org (2020)

  22. Ignatov, D.I., Nenova, E., Konstantinova, N., Konstantinov, A.V.: Boolean matrix factorisation for collaborative filtering: An FCA-based approach. In: Agre, G., Hitzler, P., Krisnadhi, A.A., Kuznetsov, S.O. (eds.) AIMSA 2014. LNCS (LNAI), vol. 8722, pp. 47–58. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10554-3_5

  23. Mill, J.S.: A System of Logic, Ratiocinative and Inductive, Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. Longmans, Green, and Co., London (1843)

  24. Kadyrov, T., Ignatov, D.I.: Attribution of customers’ actions based on machine learning approach. In: Proceedings of the Fifth Workshop on Experimental Economics and Machine Learning co-located with the Seventh International Conference on Applied Research in Economics (iCare7), Perm, Russia, 26 September 2019, vol-2479, pp. 77–88. CEUR-ws (2019)

  25. Kashnitsky, Y., Kuznetsov, S.O.: Global optimization in learning with important data: an FCA-based approach. In: Huchard, M., Kuznetsov, S.O. (eds.) Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications, Moscow, Russia, CEUR Workshop Proceedings, 18–22 July 2016, vol. 1624, pp. 189–201. CEUR-WS.org (2016)

  26. Kaur, H., Nori, H., Jenkins, S., Caruana, R., Wallach, H.M., Vaughan, J.W.: Interpreting interpretability: understanding data scientists’ use of interpretability tools for machine learning. In: Bernhaupt, R., et al. (eds.) CHI 2020: CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April, 2020, pp. 1–14. ACM (2020)

  27. Kaytoue, M., Kuznetsov, S.O., Macko, J., Napoli, A.: Biclustering meets triadic concept analysis. Ann. Math. Artif. Intell. 70(1–2), 55–79 (2014)

  28. Konecny, J.: On attribute reduction in concept lattices: methods based on discernibility matrix are outperformed by basic clarification and reduction. Inf. Sci. 415, 199–212 (2017)

  29. Konecny, J., Krajca, P.: On attribute reduction in concept lattices: experimental evaluation shows discernibility matrix based methods inefficient. Inf. Sci. 467, 431–445 (2018)

  30. Kuitché, R.S., Temgoua, R.E.A., Kwuida, L.: A similarity measure to generalize attributes. In: Ignatov, D.I., Nourine, L. (eds.) Proceedings of the Fourteenth International Conference on Concept Lattices and Their Applications, CLA 2018, Olomouc, Czech Republic, CEUR Workshop Proceedings, 12–14 June 2018, vol. 2123, pp. 141–152. CEUR-WS.org (2018)

  31. Kuznetsov, S.O.: Machine learning and formal concept analysis. ICFCA 2004, 287–312 (2004)

  32. Kuznetsov, S.O.: Galois connections in data analysis: contributions from the soviet era and modern Russian research. In: Ganter, B., Stumme, G., Wille, R. (eds.) Formal Concept Analysis. LNCS (LNAI), vol. 3626, pp. 196–225. Springer, Heidelberg (2005). https://doi.org/10.1007/11528784_11

  33. Kuznetsov, S.O.: On stability of a formal concept. Ann. Math. Artif. Intell. 49(1–4), 101–115 (2007)

  34. Kuznetsov, S.O., Makhalova, T.P.: On interestingness measures of formal concepts. Inf. Sci. 442–443, 202–219 (2018)

  35. Kuznetsov, S.O., Makhazhanov, N., Ushakov, M.: On neural network architecture based on concept lattices. In: Kryszkiewicz, M., Appice, A., Slezak, D., Rybinski, H., Skowron, A., Ras, Z.W. (eds.) ISMIS 2017. LNCS (LNAI), vol. 10352, pp. 653–663. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60438-1_64

  36. Kuznetsov, S.O., Poelmans, J.: Knowledge representation and processing with formal concept analysis. Wiley Interdiscip. Rev. Data Min. Knowl. Disc. 3(3), 200–215 (2013)

  37. Kuznetsov, S.: JSM-method as a machine learning method. Method. Itogi Nauki i Tekhniki ser. Informatika 15, 17–53 (1991). (in Russian)

  38. Kuznetsov, S.: Stability as an estimate of the degree of substantiation of hypotheses derived on the basis of operational similarity. Nauchn. Tekh. Inf. Ser. 2(12), 217–29 (1991). (in Russian)

  39. Kuznetsov, S.: Mathematical aspects of concept analysis. J. Math. Sci. 80(2), 1654–1698 (1996)

  40. Kwuida, L., Kuitché, R., Temgoua, R.: On the size of \(\exists \)-generalized concepts. ArXiv:1709.08060 (2017)

  41. Kwuida, L., Kuitché, R.S., Temgoua, R.E.A.: On the size of \(\exists \)-generalized concept lattices. Discret. Appl. Math. 273, 205–216 (2020)

  42. Kwuida, L., Missaoui, R., Balamane, A., Vaillancourt, J.: Generalized pattern extraction from concept lattices. Ann. Math. Artif. Intell. 72(1–2), 151–168 (2014)

  43. Kwuida, L., Missaoui, R., Boumedjout, L., Vaillancourt, J.: Mining generalized patterns from large databases using ontologies (2009). ArXiv:0905.4713

  44. Kwuida, L., Missaoui, R., Vaillancourt, J.: Using taxonomies on objects and attributes to discover generalized patterns. In: Szathmary, L., Priss, U. (eds.) Proceedings of The Ninth International Conference on Concept Lattices and Their Applications, Fuengirola (Málaga), CEUR Workshop Proceedings, Spain, 11–14 October 2012, vol. 972, pp. 327–338. CEUR-WS.org (2012)

  45. Lakhal, L., Stumme, G.: Efficient mining of association rules based on formal concept analysis. In: Ganter, B., Stumme, G., Wille, R. (eds.) Formal Concept Analysis. LNCS (LNAI), vol. 3626, pp. 180–195. Springer, Heidelberg (2005). https://doi.org/10.1007/11528784_10

  46. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Guyon, I. et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 4765–4774. Curran Associates, Inc. (2017)

  47. Luxenburger, M.: Implications partielles dans un contexte. Mathématiques et Sci. Humaines. 113, 35–55 (1991)

  48. Mirkin, B.: Mathematical Classification and Clustering. Kluwer Academic Publishers, Amsterdam (1996)

  49. Mitchell, T.M.: Version spaces: a candidate elimination approach to rule learning. In: Reddy, R. (ed.) Proceedings of the 5th International Joint Conference on Artificial Intelligence 1977, pp. 305–310. William Kaufmann (1977)

  50. Molnar, C.: Interpretable Machine Learning (2019). https://christophm.github.io/interpretable-ml-book/

  51. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Inf. Syst. 24(1), 25–46 (1999)

  52. Poelmans, J., Ignatov, D.I., Kuznetsov, S.O., Dedene, G.: Formal concept analysis in knowledge processing: a survey on applications. Expert Syst. Appl. 40(16), 6538–6560 (2013)

  53. Poelmans, J., Kuznetsov, S.O., Ignatov, D.I., Dedene, G.: Formal concept analysis in knowledge processing: a survey on models and techniques. Expert Syst. Appl. 40(16), 6601–6623 (2013)

  54. Prediger, S.: Formal concept analysis for general objects. Discret. Appl. Math. 127(2), 337–355 (2003)

  55. Priss, U., Old, L.J.: Data weeding techniques applied to Roget’s thesaurus. In: Wolff, K.E., Palchunov, D.E., Zagoruiko, N.G., Andelfinger, U. (eds.) KONT/KPP -2007. LNCS (LNAI), vol. 6581, pp. 150–163. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22140-8_10

  56. Roth, C., Obiedkov, S., Kourie, D.: Towards concise representation for taxonomies of epistemic communities. In: Yahia, S.B., Nguifo, E.M., Belohlavek, R. (eds.) CLA 2006. LNCS (LNAI), vol. 4923, pp. 240–255. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78921-5_17

  57. Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell 1(5), 206–215 (2019)

  58. Rudolph, S.: Using FCA for encoding closure operators into neural networks. In: Priss, U., Polovina, S., Hill, R. (eds.) ICCS-ConceptStruct 2007. LNCS (LNAI), vol. 4604, pp. 321–332. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73681-3_24

  59. Shapley, L.S.: A value for n-person games. Contrib. Theory Games 2(28), 307–317 (1953)

  60. Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 3145–3153. PMLR, International Convention Centre, Sydney (2017)

  61. Srikant, R., Agrawal, R.: Mining generalized association rules. In: Dayal, U., Gray, P.M.D., Nishio, S. (eds.) VLDB 95, Proceedings of 21th International Conference on Very Large Data Bases, Zurich, Switzerland, 11–15 September 1995, pp. 407–419. Morgan Kaufmann (1995)

  62. Srikant, R., Agrawal, R.: Mining generalized association rules. Future Gener. Comput. Syst. 13(2–3), 161–180 (1997)

  63. Strumbelj, E., Kononenko, I.: Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 41(3), 647–665 (2014)

  64. Stumme, G., Taouil, R., Bastide, Y., Lakhal, L.: Conceptual clustering with iceberg concept lattices. In: Proceedings of GI-Fachgruppentreffen Maschinelles Lernen, vol. 1 (2001)

  65. Tatti, N., Moerchen, F.: Finding robust itemsets under subsampling. ICDM 2011, 705–714 (2011)

  66. Valtchev, P., Missaoui, R.: Similarity-based clustering versus galois lattice building: strengths and weaknesses. In: Huchard, M., Godin, R., Napoli, A. (eds.) Contributions of the ECOOP 2000 Workshop, “Objects and Classification: a Natural Convergence", European Conference on Object-Oriented Programming (2000), vol. Research Report LIRMM, no. 00095, p. w13 (2000)

Acknowledgements

The study was implemented in the framework of the Basic Research Program at the National Research University Higher School of Economics and funded by the Russian Academic Excellence Project ‘5–100’. The second author was also supported by Russian Science Foundation under grant 17-11-01276 at St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences, Russia. The second author would like to thank Fuad Aleskerov, Alexei Zakharov, and Shlomo Weber for the inspirational lectures on Collective Choice and Voting Theory.

Author information

Corresponding author

Correspondence to Léonard Kwuida.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kwuida, L., Ignatov, D.I. (2021). On Interpretability and Similarity in Concept-Based Machine Learning. In: van der Aalst, W.M.P., et al. Analysis of Images, Social Networks and Texts. AIST 2020. Lecture Notes in Computer Science, vol 12602. Springer, Cham. https://doi.org/10.1007/978-3-030-72610-2_3

  • DOI: https://doi.org/10.1007/978-3-030-72610-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72609-6

  • Online ISBN: 978-3-030-72610-2

  • eBook Packages: Computer Science (R0)
