research-article

Reproducibility in multiple instance learning: a case for algorithmic unit tests

Authors:

Jim Holt

NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems

Article No.: 596, Pages 13530 - 13544

Published: 30 May 2024

Abstract

Multiple Instance Learning (MIL) is a sub-domain of classification problems with positive and negative labels and a "bag" of inputs, where the label is positive if and only if a positive element is contained within the bag, and is negative otherwise. Training in this context requires associating the bag-wide label with instance-level information, and implicitly carries a causal assumption and an asymmetry in the task (i.e., you can't swap the labels without changing the semantics). MIL problems occur in healthcare (one malignant cell indicates cancer), cyber security (one malicious executable makes an infected computer), and many other tasks. In this work, we examine five of the most prominent deep-MIL models and find that none of them respects the standard MIL assumption. They are able to learn anti-correlated instances, i.e., defaulting to "positive" labels until seeing a negative counter-example, which should not be possible for a correct MIL model. We suspect that enhancements and other works derived from these models will share the same issue. In any context in which these models are used, this creates the potential for learning incorrect models, and with it a risk of operational failure. We identify and demonstrate this problem via a proposed "algorithmic unit test", in which we create synthetic datasets that can be solved by a MIL-respecting model and that clearly reveal learning that violates MIL assumptions. The five evaluated methods each fail one or more of these tests. This provides a model-agnostic way to identify violations of modeling assumptions, which we hope will be useful for future development and evaluation of MIL models.
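To make the idea of a model-agnostic check concrete, the following is a minimal sketch and not the paper's actual test procedure. It relies on one consequence of the standard MIL assumption: because a bag is positive if and only if it contains a positive instance, adding instances to a positive bag can never make it negative. The instance representation (a single score in [0, 1]), the two toy bag classifiers, and the function name `violates_mil` are all assumptions made for illustration.

```python
from itertools import product


def max_pool_model(bag, threshold=0.5):
    """Toy MIL-respecting classifier: positive iff any instance score exceeds the threshold."""
    return max(bag) > threshold


def mean_pool_model(bag, threshold=0.5):
    """Toy classifier that ignores the MIL assumption: low scores can dilute a positive bag."""
    return sum(bag) / len(bag) > threshold


def violates_mil(model, values=(0.0, 0.25, 0.75, 1.0), max_len=2):
    """Return True if appending instances ever flips a positive bag to negative.

    Enumerates every bag of length 1..max_len over a small grid of scores
    (a hypothetical stand-in for a synthetic dataset) and treats `model`
    as a black box that maps a bag to a boolean label.
    """
    bags = [list(c) for n in range(1, max_len + 1) for c in product(values, repeat=n)]
    for bag in bags:
        if not model(bag):
            continue  # the property only constrains bags predicted positive
        for extra in bags:
            if not model(bag + extra):
                return True  # a positive bag became negative: MIL violation
    return False


print(violates_mil(max_pool_model))   # False
print(violates_mil(mean_pool_model))  # True
```

The check only queries predictions, so it is independent of the model's architecture. A trained deep-MIL model could be wrapped in a function with the same bag-to-boolean signature, with the caveat that the tests described in the abstract use synthetic training datasets rather than this exhaustive enumeration.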

Supplementary Material

Additional material (3666122.3666718_supp.pdf)

Supplemental material.


Information & Contributors

Information

Published In

NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems

December 2023

80772 pages

Copyright © 2023 Neural Information Processing Systems Foundation, Inc.

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Qualifiers

Research-article
Research
Refereed limited
