Abstract
The MNIST dataset of handwritten digits has been widely used for validating the effectiveness and efficiency of machine learning methods. Although this dataset was primarily used for classification and results of very high accuracy (99.3%+) on it have been obtained, its important application of regression is not directly applicable, thus substantially deteriorates its usefulness and the development of regression methods for such types of data. In this paper, to allow MNIST to be usable for regression, we firstly apply its class/label with normal distribution thereby convert the original discrete class numbers into float ones. Modified Convolutional Neural Networks (CNN) is then applied to generate a regression model. Multiple experiments have been conducted in order to select optimal parameters and layer settings for this application. Experimental results suggest that, optimal outcome of mean-absolute-error (MAE) value can be obtained when ReLu function is adopted for the first layer with other layers activated by the softplus functions. In the proposed approach, two indicators of MAE and Log-Cosh loss have been applied to optimize the parameters and score the predictions. Experiments on 10-fold cross-validation demonstrate that, desired low values of MAE and Log-Cosh error respectively at 0.202 and 0.079 can be achieved. Furthermore, multiple values of standard deviation of the normal distribution have been applied to verify the applicability when data of label number at varied distributions is used. The experimental results suggest that a positive correlation exists between the adopted standard deviation and the loss value, that is, the higher concentration degree of data will contribute to the lower MAE value.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Grother, P.J.: NIST special database 19. Handprinted forms and characters database, National Institute of Standards and Technology (1995)
Matsugu, M., Mori, K., Mitari, Y., Kaneda, Y.: Subject independent facial expression recognition with robust face detection using a convolutional neural network. Neural Netw. 16(5–6), 555–559 (2003)
Cireşan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification (2012). arXiv preprint arXiv:1202.2745
Lawrence, S., Giles, C.L., Tsoi, A.C., Back, A.D.: Face recognition: a convolutional neural-network approach. IEEE Trans. Neural Netw. 8(1), 98–113 (1997)
Le Callet, P., Viard-Gaudin, C., Barba, D.: A convolutional neural network approach for objective video quality assessment. IEEE Trans. Neural Netw. 17(5), 1316–1327 (2006)
van den Oord, A., Dieleman, S., Schrauwenvan, B.: Deep content-based music recommendation. Curran Associates, Inc., pp. 2643–2651 (2013)
Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine learning, pp. 160–167. ACM (2008)
Pyrkov, T.V., Slipensky, K., Barg, M., Kondrashin, A., Zhurov, B., Zenin, A., Fedichev, P.O.: Extracting biological age from biomedical data via deep learning: too much of a good thing? Sci. Reports 8(1), 5210 (2018)
Zang, J., Wang, L., Liu, Z., Zhang, Q., Hua, G., Zheng, N.: Attention-based temporal weighted convolutional neural network for action recognition. In: IFIP International Conference on Artificial Intelligence Applications and Innovations, pp. 97–108. Springer, Cham (2018)
Wald, A.: Statistical decision functions (1950)
Deng, L., Yu, D.: Deep learning: methods and applications. Found. Trends\(^{\textregistered }\) in Signal Process. 7(3–4), 197–387 (2014)
LeCun, Y.: LeNet-5, convolutional neural networks (2015). http://yann.lecun.com/exdb/lenet, 20
Zhang, W.: Shift-invariant pattern recognition neural network and its optical architecture. In: Proceedings of Annual Conference of the Japan Society of Applied Physics (1988)
Zhang, W., Itoh, K., Tanida, J., Ichioka, Y.: Parallel distributed processing model with local space-invariant interconnections and its optical architecture. Appl. Opt. 29(32), 4790–4797 (1990)
McLachlan, G., Do, K.A., Ambroise, C.: Analyzing Microarray Gene Expression Data, vol. 422. Wiley, London (2005)
Keras backends. keras.io. Accessed 23 Feb 2018
Willmott, C.J., Matsuura, K.: Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 30(1), 79–82 (2005)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv preprint arXiv:1412.6980
Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010)
Dugas, C., Bengio, Y., Bélisle, F., Nadeau, C., Garcia, R.: Incorporating second-order functional knowledge for better option pricing. In: Advances in Neural Information Processing Systems, pp. 472–478 (2001)
Iglovikov, V.I., Rakhlin, A., Kalinin, A.A., Shvets, A.A.: Paediatric Bone age assessment using deep convolutional neural networks. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 300–308. Springer, Cham (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Z., Wu, S., Liu, C., Wu, S., Xiao, K. (2020). The Regression of MNIST Dataset Based on Convolutional Neural Network. In: Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R., F. Tolba, M. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). AMLTA 2019. Advances in Intelligent Systems and Computing, vol 921. Springer, Cham. https://doi.org/10.1007/978-3-030-14118-9_7
Download citation
DOI: https://doi.org/10.1007/978-3-030-14118-9_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14117-2
Online ISBN: 978-3-030-14118-9
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)