Abstract
Adaptive bandwidth kernel density estimators (AB-KDEs) have received attention from the academic community because, analytically, they promise better performance than classical estimators. However, the field is fragmented, and no comprehensive comparison of the existing state-of-the-art AB-KDEs exists. We provide a comparison of several state-of-the-art and classical AB-KDE methods, together with a computational framework. We also present a novel implementation of a full principal axes rotation hyper-ellipsoid variant of the k-Nearest Neighbours (K-NN) algorithm and a Gaussian extension to K-NN. The extensive experimental results show that the fixed bandwidth rule-of-thumb methods achieve satisfactory results. Further, the balloon estimators are shown to be superior in higher-dimensional spaces, for densities with more modes, and for data on non-linear manifolds. The sample point estimators show utility when data are scarce in low dimensions. The empirical results show that our full rotation hyper-ellipsoid estimator and our Gaussian K-NN achieve state-of-the-art performance and should benefit data analysis algorithms, especially those that depend on underlying density estimates of “complex” higher-dimensional data.
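To make the terminology in the abstract concrete, the following is a minimal one-dimensional sketch of the difference between a fixed-bandwidth rule-of-thumb estimator and a balloon (K-NN) estimator. It is an illustration under stated assumptions and not the chapter's implementation: all function names are hypothetical, the bandwidth uses the common normal reference rule h = 1.06 σ n^(-1/5), the K-NN estimate uses the common form k / (n · volume), and `gaussian_knn_balloon` is only one plausible way to pair a Gaussian kernel with a K-NN bandwidth. It does not reproduce the chapter's Gaussian K-NN or its full rotation hyper-ellipsoid estimator.

```python
import math


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth for a 1-D Gaussian kernel: 1.06 * std * n^(-1/5)."""
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)


def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def fixed_kde(x, data, h):
    """Fixed-bandwidth estimate: the same h is used at every query point x."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)


def knn_distance(x, data, k):
    """Distance from the query point x to its k-th nearest sample."""
    return sorted(abs(x - xi) for xi in data)[k - 1]


def knn_balloon(x, data, k):
    """Uniform-kernel K-NN estimate: k / (n * volume), with volume = 2 * r_k in 1-D."""
    r = knn_distance(x, data, k)
    if r == 0:
        raise ValueError("k-th neighbour distance is zero; increase k")
    return k / (len(data) * 2 * r)


def gaussian_knn_balloon(x, data, k):
    """Hypothetical balloon estimate: Gaussian kernel whose bandwidth at x is r_k(x)."""
    h = knn_distance(x, data, k)
    if h == 0:
        raise ValueError("k-th neighbour distance is zero; increase k")
    return fixed_kde(x, data, h)


if __name__ == "__main__":
    samples = [0.0, 1.0, 2.0, 3.0, 4.0]
    h = normal_reference_bandwidth(samples)
    print("fixed bandwidth:", h)
    print("fixed KDE at 2.0:", fixed_kde(2.0, samples, h))
    print("K-NN balloon at 2.0:", knn_balloon(2.0, samples, 3))
    print("Gaussian K-NN balloon at 2.0:", gaussian_knn_balloon(2.0, samples, 3))
```

In this sketch the balloon estimators recompute the bandwidth at each query point, so it widens where samples are sparse and narrows where they are dense. A sample point estimator would instead attach one bandwidth to each data point.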
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
van Zyl, T.L. (2022). Full Rotation Hyper-ellipsoid Multivariate Adaptive Bandwidth Kernel Density Estimator. In: Jembere, E., Gerber, A.J., Viriri, S., Pillay, A. (eds) Artificial Intelligence Research. SACAIR 2021. Communications in Computer and Information Science, vol 1551. Springer, Cham. https://doi.org/10.1007/978-3-030-95070-5_19
DOI: https://doi.org/10.1007/978-3-030-95070-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95069-9
Online ISBN: 978-3-030-95070-5
eBook Packages: Computer Science (R0)