DOI: 10.5555/3666122.3666718
research-article

Reproducibility in multiple instance learning: a case for algorithmic unit tests

Published: 30 May 2024

Abstract

Multiple Instance Learning (MIL) is a sub-domain of classification problems with positive and negative labels and a "bag" of inputs, where the label is positive if and only if the bag contains a positive element, and is negative otherwise. Training in this context requires associating the bag-wide label with instance-level information, and implicitly contains a causal assumption and an asymmetry in the task (i.e., you can't swap the labels without changing the semantics). MIL problems occur in healthcare (one malignant cell indicates cancer), cyber security (one malicious executable makes an infected computer), and many other tasks. In this work, we examine five of the most prominent deep-MIL models and find that none of them respects the standard MIL assumption. They are able to learn anti-correlated instances, i.e., defaulting to "positive" labels until seeing a negative counter-example, which should not be possible for a correct MIL model. We suspect that enhancements and other works derived from these models will share the same issue. In any context in which these models are used, this creates the potential for learning incorrect models and, in turn, a risk of operational failure. We identify and demonstrate this problem via a proposed "algorithmic unit test", in which we create synthetic datasets that can be solved by a MIL-respecting model and that clearly reveal learning that violates MIL assumptions. The five evaluated methods each fail one or more of these tests. This provides a model-agnostic way to identify violations of modeling assumptions, which we hope will be useful for future development and evaluation of MIL models.
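
As a rough illustration of the idea, the sketch below checks one consequence of the standard MIL assumption: because a bag is positive whenever it contains a positive instance, adding more instances to a bag that is predicted positive should never turn the prediction negative. This is not the paper's test suite; the two toy models, the score thresholds, and the helper `flips_to_negative` are hypothetical names and values chosen only for this example.

```python
from typing import Callable, List, Sequence

Bag = Sequence[float]  # assumption: each instance is a single scalar feature


def mil_label(instance_labels: Sequence[bool]) -> bool:
    """Standard MIL assumption: a bag is positive iff any instance is positive."""
    return any(instance_labels)


def max_pool_model(bag: Bag) -> bool:
    """Hypothetical MIL-respecting model: positive iff some instance exceeds 0.5."""
    return max(bag, default=0.0) > 0.5


def anti_correlated_model(bag: Bag) -> bool:
    """Hypothetical violating model: predicts positive by default and
    switches to negative once it sees a strongly negative instance."""
    return not any(x < -0.5 for x in bag)


def flips_to_negative(model: Callable[[Bag], bool], bag: Bag, extra: Bag) -> bool:
    """Return True if adding `extra` instances turns a positive prediction
    negative, which the standard MIL assumption rules out."""
    combined: List[float] = list(bag) + list(extra)
    return model(bag) and not model(combined)


positive_bag = [0.1, 0.9]   # contains one "positive" instance (0.9)
counter_example = [-0.9]    # an instance anti-correlated with the positive class

# The max-pooling model keeps its positive prediction; the toy violating model flips.
max_pool_flips = flips_to_negative(max_pool_model, positive_bag, counter_example)
anti_corr_flips = flips_to_negative(anti_correlated_model, positive_bag, counter_example)
```

A check of this kind only needs a model's bag-level predictions, which is what makes the approach model-agnostic: the same assertion can be run against any classifier that maps a bag to a label.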

Supplementary Material

Additional material (3666122.3666718_supp.pdf)
Supplemental material.

Published In

NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems
December 2023
80772 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited
