Abstract
Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are performed around the world at a tremendous cost. It is crucial to reduce the rate of biopsies whose results turn out to be benign. In this study, we build deep neural networks (DNNs) that classify biopsied lesions as either malignant or benign, with the goal of using these networks as second readers that help radiologists further reduce the number of false-positive findings. We improve the performance of DNNs trained on small image patches by integrating global context into their reasoning. This context takes the form of saliency maps learned from the entire image, similar to how radiologists consider global context when evaluating areas of interest. Our experiments are conducted on a dataset of 229,426 screening mammography examinations from 141,473 patients. We achieve an AUC of 0.8 on a test set consisting of 464 benign and 136 malignant lesions.
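The abstract does not say how the saliency maps are combined with the image patches. As a minimal sketch of one plausible arrangement, the snippet below crops a local patch around a lesion, crops the same region from saliency maps computed on the whole image, and stacks the two as input channels for a patch-level classifier. The function name `crop_with_context`, the two-channel saliency layout, and channel stacking itself are assumptions made for illustration, not the authors' stated method.

```python
import numpy as np


def crop_with_context(image, saliency, center, patch_size):
    """Stack a local image patch with the matching saliency-map crop.

    image      : 2D array (H, W), a full mammogram (hypothetical input).
    saliency   : 3D array (C, H, W), saliency maps at image resolution;
                 one map per class is an assumption of this sketch.
    center     : (row, col) of the lesion in image coordinates.
    patch_size : side length of the square patch in pixels.

    Returns an array of shape (1 + C, patch_size, patch_size).
    """
    if image.shape != saliency.shape[1:]:
        raise ValueError("saliency maps must match the image resolution")

    half = patch_size // 2
    # Zero-pad so that patches near the image border keep their full size.
    image_padded = np.pad(image, half, mode="constant")
    saliency_padded = np.pad(
        saliency, ((0, 0), (half, half), (half, half)), mode="constant"
    )

    row, col = center
    # After padding by `half`, the window around (row, col) starts at
    # (row, col) in padded coordinates.
    rows = slice(row, row + patch_size)
    cols = slice(col, col + patch_size)
    local_patch = image_padded[rows, cols]
    context_patch = saliency_padded[:, rows, cols]

    # Channel 0 holds local pixels; the rest carry global context.
    return np.concatenate([local_patch[None], context_patch], axis=0)


# Toy example with synthetic data (not real mammograms).
toy_image = np.arange(36, dtype=float).reshape(6, 6)
toy_saliency = np.zeros((2, 6, 6))
toy_saliency[1, 3, 3] = 1.0
stacked = crop_with_context(toy_image, toy_saliency, center=(3, 3), patch_size=4)
```

Stacking keeps the patch classifier's architecture unchanged apart from the number of input channels. Other fusion schemes, such as combining features later in the network, would fit the abstract's description equally well.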
Cite this article
Wu, N., Huang, Z., Shen, Y. et al. Reducing False-Positive Biopsies using Deep Neural Networks that Utilize both Local and Global Image Context of Screening Mammograms. J Digit Imaging 34, 1414–1423 (2021). https://doi.org/10.1007/s10278-021-00530-6
DOI: https://doi.org/10.1007/s10278-021-00530-6