Abstract
We present an open-source R package (MESgenCov v 0.1.0) for temporally fitting multivariate precipitation chemistry data and extracting a covariance matrix for use in the maximum-entropy sampling problem (MESP). The package provides several tools for modeling and model assessment. It is tightly coupled with NADP/NTN (National Atmospheric Deposition Program/National Trends Network) data from their set of 379 monitoring sites, 1978–present. The user specifies the sites, chemicals, and time period desired and fits an appropriate user-specified univariate model for each selected site and chemical; the package then produces a covariance matrix for use by MESP algorithms.
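To make the role of the covariance matrix concrete, the following is a minimal sketch and not the MESgenCov API. It is written in Python purely for illustration, and the function names `sample_covariance`, `log_det`, and `mesp_brute_force` are hypothetical. It assumes the residuals left after the univariate fits are arranged as one row per time point and one column per site-chemical series, and it uses the usual statement of the MESP: choose s of the n series so that the log-determinant of the corresponding principal submatrix of the covariance matrix is as large as possible. Exhaustive enumeration stands in here for a real MESP algorithm and is only workable for very small n.

```python
import math
from itertools import combinations


def sample_covariance(residuals):
    """Unbiased sample covariance of the columns of `residuals`.

    `residuals` is a list of rows (time points); each row holds one
    value per series. This layout is an assumption of the sketch.
    """
    n = len(residuals)
    p = len(residuals[0])
    means = [sum(row[j] for row in residuals) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            s = sum((row[i] - means[i]) * (row[j] - means[j]) for row in residuals)
            cov[i][j] = cov[j][i] = s / (n - 1)
    return cov


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    k = len(matrix)
    chol = [[0.0] * k for _ in range(k)]
    total = 0.0
    for i in range(k):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][t] * chol[j][t] for t in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def mesp_brute_force(cov, s):
    """Return the size-s index subset maximizing log det of cov[S, S].

    Enumerates every subset, so it is only suitable for tiny instances.
    """
    best_subset, best_value = None, -math.inf
    for subset in combinations(range(len(cov)), s):
        sub = [[cov[i][j] for j in subset] for i in subset]
        value = log_det(sub)
        if value > best_value:
            best_subset, best_value = subset, value
    return best_subset, best_value
```

For example, calling `mesp_brute_force(cov, 2)` on a 3-by-3 covariance matrix compares the three possible pairs of series and returns the pair with the largest log-determinant together with that value.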
Notes
National Atmospheric Deposition Program (NRSP-3). 2019. NADP Program Office, Wisconsin State Laboratory of Hygiene, 465 Henry Mall, Madison, WI 53706, USA
Reprinted with the kind permission of the National Atmospheric Deposition Program (NRSP-3). 2019. NADP Program Office, Wisconsin State Laboratory of Hygiene, 465 Henry Mall, Madison, WI 53706, USA.
Acknowledgments
The authors are very grateful to Dr. Martin Shafer and Robert Larson for helping us gain access to the NADP/NTN data in a convenient form.
Funding
J. Lee was funded by the Air Force Office of Scientific Research (Complex Networks program), FA9550-19-1-0175. H. Al-Thani was funded by the Qatar National Research Fund (Graduate Sponsorship Research Award), GSRA4-2-0526-17114.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Appendices
A NADP/NTN Data Descriptions
B Geographic Split
C Internal Covariance Matrices Site Lists
Cite this article
Al-Thani, H., Lee, J. An R Package for Generating Covariance Matrices for Maximum-Entropy Sampling from Precipitation Chemistry Data. SN Oper. Res. Forum 1, 17 (2020). https://doi.org/10.1007/s43069-020-0011-z