Abstract
Indirect taxation is a significant source of income for any nation. Tax evasion hinders a nation's progress and causes a substantial loss of revenue. We design a model based on variational graph autoencoders and clustering to identify taxpayers who evade indirect tax by providing false information in their tax returns. From the data that taxpayers submit in their returns, we derive nine features: six correlation parameters and three ratio parameters. We use a variational graph autoencoder to derive four latent features from these nine features, and we cluster taxpayers on the four latent features. Using kernel density estimation, we identify the taxpayers located at the boundary of each cluster; these taxpayers are then investigated further to single out tax evaders. We applied our method to the iron and steel taxpayers data set provided by the Commercial Taxes Department, Government of Telangana, India.
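The boundary-detection step can be illustrated with a minimal sketch. It assumes the four latent features of one cluster are already available and flags the points with the lowest estimated density as boundary candidates. The abstract does not state the kernel, bandwidth, or cut-off, so the Gaussian kernel, `bandwidth=0.5`, and `fraction=0.1` are assumptions, the function names are hypothetical, and the data are synthetic rather than taxpayer records.

```python
import math
import random


def gaussian_kde_scores(points, bandwidth):
    """Gaussian kernel density estimate evaluated at every point in `points`."""
    n = len(points)
    d = len(points[0])
    # Normalising constant of an isotropic d-dimensional Gaussian kernel.
    norm = n * (bandwidth * math.sqrt(2.0 * math.pi)) ** d
    scores = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores.append(total / norm)
    return scores


def boundary_indices(points, bandwidth=0.5, fraction=0.1):
    """Indices of the `fraction` of points with the lowest estimated density."""
    scores = gaussian_kde_scores(points, bandwidth)
    k = max(1, int(round(fraction * len(points))))
    order = sorted(range(len(points)), key=lambda i: scores[i])
    return sorted(order[:k])


# Synthetic stand-in for one cluster of 4-dimensional latent vectors.
rng = random.Random(0)
cluster = [[rng.gauss(0.0, 0.3) for _ in range(4)] for _ in range(50)]
cluster.append([2.5, 2.5, 2.5, 2.5])  # one point placed far from the cluster core

flagged = boundary_indices(cluster, bandwidth=0.5, fraction=0.1)
print(flagged)
```

In this sketch the isolated point at index 50 receives the lowest density and is therefore among the flagged indices. In practice the bandwidth and the flagged fraction would need tuning per cluster.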
Acknowledgment
We express our sincere gratitude to the Telangana state government, India, for sharing the commercial tax data set, which is used in this work.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Mehta, P., Kumar, S., Kumar, R., Babu, C.S. (2021). Demystifying Tax Evasion Using Variational Graph Autoencoders. In: Kö, A., Francesconi, E., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Electronic Government and the Information Systems Perspective. EGOVIS 2021. Lecture Notes in Computer Science, vol 12926. Springer, Cham. https://doi.org/10.1007/978-3-030-86611-2_12
DOI: https://doi.org/10.1007/978-3-030-86611-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86610-5
Online ISBN: 978-3-030-86611-2
eBook Packages: Computer Science, Computer Science (R0)