Correction to: Neural Computing and Applications https://doi.org/10.1007/s00521-021-06282-2

Unfortunately, the article was published with errors in Tables 1, 3, 5, and 7 and in the captions of Figs. 1, 2, and 3 in the online version of the article.

The correct tables and figures are given (Tables 1, 3, 5, and 7 and Figs. 1, 2, and 3).

Table 1 Classification performance, averaged across five runs, of the different approaches on the Stanford cars [11] and FGVC-Aircraft [12] datasets
Table 3 Classification performance, averaged across five runs, making use of different backbones on the Stanford cars [11] and FGVC-Aircraft [12] datasets
Table 5 Classification performance, averaged across five runs, of the different approaches on the EgoFoodPlaces dataset [15]
Table 7 Classification performance, averaged across five runs, of the baseline method and the proposed training scheme when some regions of the test images are randomly hidden
Fig. 1

Workflow of our alternative training scheme, which (1) gets a new mini-batch of input images, (2) applies a visual explanation technique to generate the heat maps, (3) occludes the regions highlighted in the previous step, and (4) trains the CNN classifier

Fig. 2

a Input images from the Stanford cars (top) and FGVC-Aircraft (bottom) datasets, b heat maps generated by Grad-CAM for the baseline FT-ResNet50, and heat maps generated by Grad-CAM for the model trained with the proposed training scheme using c 0-occlusion, d R-occlusion, and e 1-occlusion

Fig. 3

a Input images, b heat maps generated by Grad-CAM for the baseline FT-ResNet50, and c heat maps generated by Grad-CAM for the model trained with the proposed training scheme (0-occlusion)