Abstract
The modified (or second version) gamma kernel of Chen [Probability density function estimation using gamma kernels, Annals of the Institute of Statistical Mathematics 52 (2000), pp. 471–480] should not be automatically preferred to the standard (or first version) gamma kernel, especially for univariate convex densities with a pole at the origin. In the multivariate case, multiple combined gamma kernels, defined as products of univariate standard and modified ones, are introduced here for nonparametric and semiparametric smoothing of unknown orthant densities with support \([0,\infty )^d\). Asymptotic properties of these multivariate associated kernel estimators are established. Bayesian estimators of adaptive bandwidth vectors, using multiple pure combined gamma smoothers and also in the semiparametric setup, are derived exactly under the usual quadratic loss function. The simulation results and four illustrations on real datasets reveal advantages of the proposed combined approach for nonparametric smoothing, compared to both the pure standard and pure modified gamma kernel versions, under integrated squared error and average log-likelihood criteria.
References
Arshad MZ, Iqbal MZ, Al Mutairi A (2021) A comprehensive review of datasets for statistical research in probability and quality control. J Math Comput Sci 11:3663–3728
Azzalini A, Bowman AW (1990) A look at some data on the old faithful geyser. J Roy Statist Soc Ser C 39:357–365
Belaid N, Adjabi S, Kokonendji CC, Zougab N (2016) Bayesian local bandwidth selector in multivariate associated kernel estimator for joint probability mass functions. J Statist Comput Simul 86:3667–3681
Belaid N, Adjabi S, Kokonendji CC, Zougab N (2018) Bayesian adaptive bandwidth selector for multivariate discrete kernel estimator. Commun Statist Theor Methods 47:2988–3001
Bouezmarni T, Rombouts JV (2010) Nonparametric density estimation for multivariate bounded data. J Statist Plann Inference 140:139–152
Brewer MJ (2000) A Bayesian model for local smoothing in kernel density estimation. Statist Comput 10:299–309
Chen SX (1999) A beta kernel estimation for density functions. Comput Statist Data Anal 31:131–145
Chen SX (2000) Probability density function estimation using gamma kernels. Ann Inst Statist Math 52:471–480
Duong T, Hazelton ML (2005) Convergence rates for unconstrained bandwidth matrix selectors in multivariate kernel density estimation. J Multiv Anal 93:417–433
Erçelik E, Nadar N (2021) A new kernel estimator based on scaled inverse Chi-squared density function. Amer J Math Manag Sci 40:306–319
Filippone M, Sanguinetti G (2011) Approximate inference of the bandwidth in multivariate kernel density estimation. Comput Statist Data Anal 55:3104–3122
Funke B, Kawka R (2015) Nonparametric density estimation for multivariate bounded data using two non-negative multiplicative bias correction methods. Comput Statist Data Anal 92:148–162
Harfouche L, Zougab N, Adjabi S (2020) Multivariate generalised gamma kernel density estimators and application to non-negative data. Intern J Comput Sci Math 11:137–157
Hirukawa M, Sakudo M (2014) Nonnegative bias reduction methods for density estimation using asymmetric kernels. Comput Statist Data Anal 75:112–123
Hirukawa M, Sakudo M (2015) Family of the generalised gamma kernels: a generator of asymmetric kernels for nonnegative data. J Nonparametr Statist 27:41–63
Hjort NL, Glad IK (1995) Nonparametric density estimation with a parametric start. Ann Statist 23:882–904
Igarashi G, Kakizawa Y (2014) Re-formulation of inverse Gaussian, reciprocal inverse Gaussian, and Birnbaum-Saunders kernel estimators. Statist Probab Lett 84:235–246
Igarashi G, Kakizawa Y (2015) Bias correction for some asymmetric kernel estimators. J Statist Plann Inference 159:37–63
Jin X, Kawczak J (2003) Birnbaum-Saunders and lognormal kernel estimators for modelling durations in high frequency financial data. Ann Econ Finance 4:103–124
Kakizawa Y (2022) Multivariate elliptical-based Birnbaum-Saunders kernel density estimation for nonnegative data. J Multiv Anal 187:104834
Kokonendji CC, Puig P (2018) Fisher dispersion index for multivariate count distributions: a review and a new proposal. J Multiv Anal 165:180–193
Kokonendji CC, Senga Kiessé T, Balakrishnan N (2009) Semiparametric estimation for count data through weighted distributions. J Statist Plann Inference 139:3625–3638
Kokonendji CC, Somé SM (2018) On multivariate associated kernels to estimate general density functions. J Korean Statist Soc 47:112–126
Kokonendji CC, Somé SM (2021) Bayesian bandwidths in semiparametric modelling for nonnegative orthant data with diagnostics. Stats 4:162–183
Kokonendji CC, Touré AY, Sawadogo A (2020) Relative variation indexes for multivariate continuous distributions on \([0,\infty )^k\) and extensions. AStA Adv Statist Anal 104:285–307
Libengué Dobélé-Kpoka FGB, Kokonendji CC (2017) The mode-dispersion approach for constructing continuous associated kernels. Afr Statist 12:1417–1446
Malec P, Schienle M (2014) Nonparametric kernel density estimation near the boundary. Comput Statist Data Anal 72:57–76
Marchant C, Bertin K, Leiva V, Saulo H (2013) Generalized Birnbaum-Saunders kernel density estimators and an analysis of financial data. Comput Statist Data Anal 63:1–15
Michele DN, Padgett WJ (2006) A bootstrap control chart for Weibull percentiles. Qual Reliab Engng Int 22:141–151
Nadarajah S, Kotz S (2007) On the alternative to the Weibull function. Eng Frac Mech 74:577–579
Ouimet F, Tolosana-Delgado R (2022) Asymptotic properties of Dirichlet kernel density estimators. J Multiv Anal 187:104832
R Core Team (2021) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://cran.r-project.org/
Sain SR (2002) Multivariate locally adaptive density estimation. Comput Statist Data Anal 39:165–186
Scaillet O (2004) Density estimation using inverse and reciprocal inverse Gaussian kernels. J Nonparametr Statist 16:217–226
Senga T, Kiéssé T, Mizère D (2004) Density estimation using inverse and reciprocal inverse Gaussian kernels. J Nonparametr Statist 16:217–226
Somé SM (2022) Bayesian selector of adaptive bandwidth for gamma kernel density estimator on \([0,\infty )\). Commun Statist Simul Comput (in press). https://doi.org/10.1080/03610918.2020.1828921
Somé SM, Kokonendji CC (2022) Bayesian selector of adaptive bandwidth for multivariate gamma kernel estimator on \([0,\infty )^d\). J Appl Statist 49:1692–1713
Touré AY, Dossou-Gbété S, Kokonendji CC (2020) Asymptotic normality of the test statistics for the unified relative dispersion and relative variation indexes. J Appl Statist 47:2479–2491
Zhang X, King ML, Hyndman RJ (2006) A Bayesian approach to bandwidth selection for multivariate kernel density estimation. Comput Statist Data Anal 50:3009–3031
Zhang S (2010) A note on the performance of the gamma kernel estimators at the boundary. Statist Probab Lett 80:548–557
Ziane Y, Zougab N, Adjabi S (2015) Adaptive Bayesian bandwidth selection in asymmetric kernel density estimation for nonnegative heavy-tailed data. J Appl Statist 42:1645–1658
Ziane Y, Zougab N, Adjabi S (2018) Birnbaum-Saunders power-exponential kernel density estimation and Bayes local bandwidth selection for nonnegative heavy tailed data. Comput Statist 33:299–318
Zougab N, Adjabi S, Kokonendji CC (2014) Bayesian estimation of adaptive bandwidth matrices in multivariate kernel density estimation. Comput Statist Data Anal 75:28–38
Zougab N, Harfouche L, Ziane Y, Adjabi S (2018) Multivariate generalized Birnbaum-Saunders kernel density estimators. Commun Statist Theory Methods 47:4534–4555
Acknowledgements
We are especially grateful to an Associate Editor for valuable comments that significantly improved the paper. For the second coauthor, this work is supported by the EIPHI Graduate School (contract ANR-17-EURE-0002). The first two authors dedicate this paper to Professor Blaise Somé for his 70th birthday.
Appendix: proofs of propositions
Proof of Proposition 2
Since one has
it is enough to calculate \({\mathbb {E}}[{\widehat{w}}_n({\varvec{x}})]\) and \(\textrm{var}[{\widehat{w}}_n({\varvec{x}})]\) using \({\widehat{w}}_n({\varvec{x}})=n^{-1}\sum _{i=1}^n {\textbf{G}}_{{\varvec{x}},{\textbf{h}},\ell }({\textbf{X}}_i)/p_{d}({\textbf{X}}_i;\widehat{\varvec{\theta }}_n)\) for all \({\varvec{x}}\in {\mathbb {T}}_d^+\) and \({\textbf{G}}_{{\varvec{x}},{\textbf{h}},\ell }=\left( \prod _{s=1}^{d-\ell }G_{x_{s},h_{s}}\right) \left( \prod _{r=1}^{\ell }G_{\rho (x_{r};h_{r}),h_{r}}\right) \) from (6). Indeed, one successively has
which leads to the result of \(\textrm{Bias}[{\widehat{f}}_n({\varvec{x}})]\).
About the variance term, f being bounded leads to \({\mathbb {E}}\left[ {\textbf{G}}_{{\varvec{x}},{\textbf{h}},\ell }({\textbf{X}}_{j})\right] =O(1)\). Also, we denote by \(\nabla f({\varvec{x}})\) and \({\mathcal {H}}f({\varvec{x}})\) the gradient vector and the corresponding Hessian matrix of the function f at \({\varvec{x}}\), respectively. It successively follows:
and the desired result of \(\textrm{var}[{\widehat{f}}_n({\varvec{x}})]\) is therefore deduced. \(\square \)
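The product kernel \({\textbf{G}}_{{\varvec{x}},{\textbf{h}},\ell }\) and the estimator used in this proof can be illustrated by a minimal sketch. It assumes the two univariate kernels of Chen (2000): a gamma density with shape \(x/h+1\) and scale \(h\) for the standard version, and shape \(\rho (x;h)\) for the modified version, with \(\rho (x;h)=x/h\) for \(x\ge 2h\) and \((x/2h)^2+1\) otherwise. It also assumes that the first \(d-\ell \) coordinates take the standard kernel and the last \(\ell \) take the modified one, and that \(p_d\equiv 1\) (pure nonparametric case). All function names are hypothetical.

```python
import math

def gamma_pdf(t, shape, scale):
    """Density at t > 0 of a gamma law with the given shape and scale."""
    if t <= 0.0:
        raise ValueError("this sketch only evaluates kernels at t > 0")
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def rho(x, h):
    """Assumed modified shape (Chen 2000): x/h for x >= 2h, (x/(2h))^2 + 1 near 0."""
    return x / h if x >= 2.0 * h else (x / (2.0 * h)) ** 2 + 1.0

def standard_gamma_kernel(t, x, h):
    """Standard (first version) gamma kernel with target x and bandwidth h."""
    return gamma_pdf(t, x / h + 1.0, h)

def modified_gamma_kernel(t, x, h):
    """Modified (second version) gamma kernel with target x and bandwidth h."""
    return gamma_pdf(t, rho(x, h), h)

def combined_gamma_kernel(t, x, h, ell):
    """Product of d - ell standard kernels and ell modified kernels.

    Assumption: the first d - ell coordinates use the standard kernel.
    """
    d = len(x)
    value = 1.0
    for s in range(d):
        kernel = standard_gamma_kernel if s < d - ell else modified_gamma_kernel
        value *= kernel(t[s], x[s], h[s])
    return value

def combined_density_estimate(x, data, h, ell):
    """Nonparametric estimate at x: average of the product kernel over the sample."""
    return sum(combined_gamma_kernel(obs, x, h, ell) for obs in data) / len(data)
```

Setting `ell = 0` gives the pure standard smoother and `ell = d` the pure modified one; intermediate values give the combined versions.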
Proof of Proposition 3
(i) Let us represent \(\pi ({\textbf{h}}_{i}\mid {\textbf{X}}_{i})\) of (14) as the ratio of \(N({\textbf{h}}_{i}\mid {\textbf{X}}_{i}):={\widehat{f}}_{n,{\textbf{h}}_i,-i}({\textbf{X}}_i) \pi ({\textbf{h}}_{i})\) and \(\int _{[0, \infty )^{d}}N({\textbf{h}}_{i}\mid {\textbf{X}}_{i})d{\textbf{h}}_{i}\). From (13) and (16) the numerator is first equal to
From (3), consider the following partition \({\mathbb {I}}_{{\textbf{X}}_i}\) and \({\mathbb {I}}_{{\textbf{X}}_i}^{c}\) of \(\{1,2,\ldots ,d\}\). For \(X_{ik} \in [0,2h_{ik})\) with \(k\in {\mathbb {I}}_{{\textbf{X}}_i}\), the function \(n\mapsto \left[ X_{ik}/2h_{ik}(n)\right] ^{2}\) is bounded, and there exists a constant \(\lambda _{ik}>0\) such that \((X_{ik}/2h_{ik})^{2} \rightarrow \lambda _{ik}\) as \(n\rightarrow \infty \); see Chen (2000, pp. 474–475). Using successively (2) and (3) with the behavior \((X_{ik}/2h_{ik})^{2}\simeq \lambda _{ik}\) as \(n\rightarrow \infty \), the product over \({\mathbb {I}}_{{\textbf{X}}_i}\) in (17) can be expressed as follows
with \(A_{ijk}(\alpha ,\beta _k)= [ \Gamma (\lambda _{ik}+ \alpha +1) X_{jk}^{\lambda _{ik}}]/[\beta _{k}^{-\alpha }\Gamma (\lambda _{ik}+1)(X_{jk}+\beta _{k})^{\lambda _{ik}+\alpha +1}]\) and \(IG_{\lambda _{ik}+\alpha +1,X_{jk}+\beta _{k}}(h_{ik})\) comes from (16).
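As a check on the constant \(A_{ijk}(\alpha ,\beta _k)\), the conjugacy step behind it can be worked out for a single coordinate. This is only a sketch: it assumes that the boundary kernel is a gamma density with shape \(\lambda _{ik}+1\) and scale \(h_{ik}\) evaluated at \(X_{jk}\), and that the prior in (16) is the usual inverse gamma density \(IG_{\alpha ,\beta }(h)=\beta ^{\alpha }h^{-\alpha -1}e^{-\beta /h}/\Gamma (\alpha )\).

```latex
\begin{align*}
\frac{X_{jk}^{\lambda_{ik}}\, e^{-X_{jk}/h_{ik}}}{\Gamma(\lambda_{ik}+1)\, h_{ik}^{\lambda_{ik}+1}}
\times
\frac{\beta_k^{\alpha}}{\Gamma(\alpha)}\, h_{ik}^{-\alpha-1} e^{-\beta_k/h_{ik}}
&= \frac{\beta_k^{\alpha} X_{jk}^{\lambda_{ik}}}{\Gamma(\alpha)\,\Gamma(\lambda_{ik}+1)}\,
   h_{ik}^{-(\lambda_{ik}+\alpha+1)-1}\,
   e^{-(X_{jk}+\beta_k)/h_{ik}} \\
&= \frac{A_{ijk}(\alpha,\beta_k)}{\Gamma(\alpha)}\,
   IG_{\lambda_{ik}+\alpha+1,\; X_{jk}+\beta_k}(h_{ik}).
\end{align*}
```

Under these assumptions the kernel-times-prior product is, up to the factor \(1/\Gamma (\alpha )\), the stated constant times an inverse gamma density in \(h_{ik}\), which is what makes the later integration over \(h_{ik}\) explicit.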
Consider the largest part \({\mathbb {I}}^{c}_{{\textbf{X}}_i}=\left\{ \ell \in \{1,\ldots ,d\}~;X_{i\ell } \in [2h_{i\ell }, \infty )\right\} \). Following again Chen (2000, pp. 474–475), we assume that for all \(X_{i\ell }\in [2h_{i\ell }, \infty )\) one has \(X_{i\ell }/h_{i\ell } \rightarrow \infty \) as \(n\rightarrow \infty \) for all \(\ell \in \{1,2,\ldots ,d\}\). From (2), (3), the Stirling formula \(\Gamma (z+1)\simeq \sqrt{2\pi }z^{z+1/2}\exp (-z)\) as \(z\rightarrow \infty \), and the well-known property \(\Gamma (z)=z^{-1}\Gamma (z+1)\) for \(z>0\), the term (17) can be successively calculated as
with \(B_{ij\ell }(\alpha ,\beta _\ell )= [X_{j\ell }^{-1}\Gamma (\alpha +1/2)]/(\beta _{\ell }^{-\alpha }X_{i\ell }^{-1/2}\sqrt{2\pi }[C_{ij\ell }(\beta _\ell )]^{\alpha +1/2})\), \(C_{ij\ell }(\beta _\ell )= X_{i\ell }\log (X_{i\ell }/X_{j\ell })+X_{j\ell }-X_{i\ell }+\beta _{\ell }\) and \(IG_{\alpha +1/2,C_{ij\ell }(\beta _\ell )}(h_{i\ell })\) is given in (16).
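The constants \(B_{ij\ell }\) and \(C_{ij\ell }\) can be recovered in the same way for a single interior coordinate. The sketch below assumes a gamma kernel with shape \(X_{i\ell }/h_{i\ell }\) and scale \(h_{i\ell }\) evaluated at \(X_{j\ell }\), together with the same inverse gamma prior \(IG_{\alpha ,\beta }(h)=\beta ^{\alpha }h^{-\alpha -1}e^{-\beta /h}/\Gamma (\alpha )\); write \(z=X_{i\ell }/h_{i\ell }\) and \(h=h_{i\ell }\).

```latex
\begin{align*}
\Gamma(z) = z^{-1}\Gamma(z+1) &\simeq \sqrt{2\pi}\, z^{z-1/2} e^{-z}, \\
\frac{X_{j\ell}^{z-1} e^{-X_{j\ell}/h}}{\Gamma(z)\, h^{z}}
&\simeq \frac{X_{j\ell}^{-1} X_{i\ell}^{1/2}}{\sqrt{2\pi}}\, h^{-1/2}
   \exp\!\left\{-\frac{X_{i\ell}\log(X_{i\ell}/X_{j\ell}) + X_{j\ell} - X_{i\ell}}{h}\right\}, \\
\frac{X_{j\ell}^{z-1} e^{-X_{j\ell}/h}}{\Gamma(z)\, h^{z}}
\times \frac{\beta_\ell^{\alpha}}{\Gamma(\alpha)}\, h^{-\alpha-1} e^{-\beta_\ell/h}
&\simeq \frac{X_{j\ell}^{-1} X_{i\ell}^{1/2}\, \beta_\ell^{\alpha}}{\sqrt{2\pi}\,\Gamma(\alpha)}\,
   h^{-(\alpha+1/2)-1} e^{-C_{ij\ell}(\beta_\ell)/h} \\
&= \frac{B_{ij\ell}(\alpha,\beta_\ell)}{\Gamma(\alpha)}\,
   IG_{\alpha+1/2,\; C_{ij\ell}(\beta_\ell)}(h).
\end{align*}
```

Since \(x\log (x/y)+y-x\ge 0\) for \(x,y>0\), one has \(C_{ij\ell }(\beta _\ell )\ge \beta _\ell >0\), so the inverse gamma density on the last line is well defined.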
Combining (18) and (19), the expression \(N({\textbf{h}}_{i}\mid {\textbf{X}}_{i})\) in (17) becomes
From (20), the denominator is successively computed as follows
with \(D_{i}(\alpha ,\varvec{\beta })=p_{d}({\textbf{X}}_i;\widehat{\varvec{\theta }}_n)\sum _{j=1,j\ne i}^{n}\left( p_{d}({\textbf{X}}_{j};\widehat{\varvec{\theta }}_n)\right) ^{-1}\left( \prod _{k \in {\mathbb {I}}_{{\textbf{X}}_i}}A_{ijk}(\alpha ,\beta _k)\right) \left( \prod _{\ell \in {\mathbb {I}}^{c}_{{\textbf{X}}_i}} B_{ij\ell }(\alpha ,\beta _\ell )\right) \). Finally, the ratio of (20) and (21) leads to Part (i).
(ii) Recall that the mean of the inverse gamma distribution \(\mathcal{I}\mathcal{G}(\alpha ,\beta _\ell )\) is \(\beta _\ell /(\alpha -1)\) and that \({\mathbb {E}}(h_{im}\mid {\textbf{X}}_{i})=\int _{0}^{\infty } h_{im}\pi (h_{im}\mid {\textbf{X}}_{i})\,dh_{im}\), where \(\pi (h_{im}\mid {\textbf{X}}_{i})\) is the marginal distribution of \(h_{im}\) obtained by integrating \(\pi ({\textbf{h}}_{i}\mid {\textbf{X}}_{i})\) over all components of \({\textbf{h}}_{i}\) except \(h_{im}\). Then, \(\pi (h_{im}\mid {\textbf{X}}_{i})=\int _{[0, \infty )^{d-1}}\pi ({\textbf{h}}_{i}\mid {\textbf{X}}_{i})\,d{\textbf{h}}_{i(-m)}\) where \(d{\textbf{h}}_{i(-m)}\) is the vector \(d{\textbf{h}}_{i}\) without the \(m^{th}\) component. If \(m\in {\mathbb {I}}_{{\textbf{X}}_i}\), one has
and
If \(m\in {\mathbb {I}}_{{\textbf{X}}_i}^{c}\) and \(\alpha >1/2\), one gets
and
Combining (22) and (23), we therefore get the closed-form expression of Part (ii). \(\square \)
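To make the structure of Part (ii) concrete, here is a minimal sketch restricted to the simplest situation: \(d=1\), the pure nonparametric case (\(p_d\equiv 1\)), and an observation treated as interior, so only the constants \(B_{ij}\) and \(C_{ij}\) are involved. Under these assumptions the posterior of \(h_i\) is a mixture of inverse gamma densities \(IG_{\alpha +1/2,C_{ij}}\) with weights proportional to \(B_{ij}\), and the mean \(b/(a-1)\) of each component gives the estimate below. The function name and this reduced formula are illustrative assumptions, not the general expression of the proposition.

```python
import math

def interior_bayes_bandwidth(i, data, alpha, beta):
    """Posterior-mean bandwidth for observation i (d = 1, interior case only).

    Assumed formula: sum_j B_ij * C_ij / ((alpha - 1/2) * sum_j B_ij), j != i,
    with C_ij = X_i log(X_i / X_j) + X_j - X_i + beta.
    """
    if alpha <= 0.5:
        raise ValueError("alpha must exceed 1/2 for the posterior mean to exist")
    x_i = data[i]
    numerator = 0.0
    denominator = 0.0
    for j, x_j in enumerate(data):
        if j == i:
            continue  # leave-one-out: observation i is excluded
        c_ij = x_i * math.log(x_i / x_j) + x_j - x_i + beta
        # B_ij computed on the log scale for numerical stability
        log_b_ij = (-math.log(x_j) + math.lgamma(alpha + 0.5)
                    + alpha * math.log(beta) + 0.5 * math.log(x_i)
                    - 0.5 * math.log(2.0 * math.pi)
                    - (alpha + 0.5) * math.log(c_ij))
        b_ij = math.exp(log_b_ij)
        numerator += b_ij * c_ij
        denominator += b_ij
    return numerator / ((alpha - 0.5) * denominator)
```

The result is a weighted average of the component means \(C_{ij}/(\alpha -1/2)\), so it always lies between the smallest and the largest of them.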
About this article
Cite this article
Somé, S.M., Kokonendji, C.C., Adjabi, S. et al. Multiple combined gamma kernel estimations for nonnegative data with Bayesian adaptive bandwidths. Comput Stat 39, 905–937 (2024). https://doi.org/10.1007/s00180-023-01327-7
DOI: https://doi.org/10.1007/s00180-023-01327-7