Abstract
Contingency tables are often used to display the multivariate frequency distribution of variables of interest. Under the common multinomial assumption, the first step of contingency table analysis is to estimate cell probabilities. It is well known that the unconstrained maximum likelihood estimator (MLE) is given by cell counts divided by the total number of observations. However, in the presence of (complex) constraints on the unknown cell probabilities or their functions, the MLE or other types of estimators may often have no closed form and have to be obtained numerically. In this paper, we focus on finding the MLE of cell probabilities in contingency tables under two common types of constraints: known marginals and ordered marginals/conditionals, and propose a novel approach based on geometric programming. We present two important applications that illustrate the usefulness of our approach via comparison with existing methods. Further, we show that our GP-based approach is flexible, readily implementable, effort-saving and can provide a unified framework for various types of constrained estimation of cell probabilities in contingency tables.
Similar content being viewed by others
References
Alldredge JR, Armstrong DW (1974) Maximum likelihood estimation for multinomial distribution using geometric programming. Technometrics 16(4):585–587
Arrow KJ (1973) Information and economic behavior. Federation of Swedish Industries, Stockholm
Barlow RE, Bartholomew DJ, Bremner JM, Brunk HD (1972) Statistical inference under order restrictions. Wiley, New York
Belman D, Heywood JS (1991) Sheepskin effects in the returns to education: an examination of women and minorities. Rev Econ Stat 73(4):720–724
Bishop J, Fienberg S, Holland P (1975) Discrete multivariate analysis: theory and practice. MIT Press, Cambridge
Bishop YM, Fienberg SE (1969) Incomplete two-dimensional contingency tables. Biometrics 25(1):119–28
Blau PM, Duncan OD (1967) The American occupational structure. Wiley, New York
Boyd SP, Kim S-J, Patil DD, Horowitz MA (2005) Digital circuit optimization via geometric programming. Oper Res 53(6):899–932
Boyd SP, Kim S-J, Vandenberghe L, Hassibi A (2007) A tutorial on geometric programming. Optim Eng 8:67–127
Bricker DL, Kortanek KO, Xu L (1997) Maximum likelihood estimates with order restrictions on probabilities and odds ratios: a geometric programming approach. J Appl Math Decis Sci 1(1):53–65
Chen Z, Bai Z, Sinha BK (2006) Ranked set sampling: theory and applications. Springer, New York
Deming WE, Stephan FF (1940) On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Ann Math Stat 11(4):427–444
Duffin RJ, Peterson EL, Zener C (1967) Geometric programming—theory and applications. Wiley, New York
Frey J, Ozturk O (2011) Constrained estimation using judgment post-stratification. Ann Inst Stat Math 63:769–789
Frey J, Ozturk O, Deshpande JV (2007) Nonparametric tests for perfect judgment rankings. J Am Stat Assoc 102:708–717
Good I (1965) The estimation of probabilities: an essay on modern Bayesian methods. MIT Press, Cambridge
Good I (1967) A bayesian significance test for multinomial distributions. J R Stat Soc Ser B 29:399–431
Haider SJ (2001) Earnings instability and earnings inequality of males in the united states: 1967–1991. J Labor Econ 19:799–836
Ireland CT, Kullback S (1968) Contingency tables with given marginals. Biometrika 55:179–188
Jewell NP, Kalbfleisch JD (2004) Maximum likelihood estimation of ordered multinomial parameters. Biostatistics 5(2):291–306
Kullback S (1959) Information theory and statistics. Wiley, New York
Lim J, Kim SJ, Wang X (2009a) Estimation of stochastically ordered survival functions by geometric programming. J Comput Graph Stat 18:978–994
Lim J, Wang X, Choi W (2009b) Maximum likelihood estimation of ordered multinomial probabilities by geometric programming. Comput Stat Data Anal 53:889–893
Lim J, Won J-H (2012) Estimation of convex ROC curve: ROC convex hull, conditional, and unconditional maximum likelihood estimators. Mach Learn 88:433–444
Little RJA, Wu MM (1991) Models for contingency tables with known margins when target and sampled populations differ. J Am Stat Assoc 86:87–95
Lohr SL (1999) Sampling: design and analysis. Duxbury, New York
MacEachern SN, Stasny EA, Wolfe DA (2004) Judgment post-stratification with imprecise rankings. Biometrics 60:207–215
Mazumdar M, Jefferson TR (1983) Maximum likelihood estimates for multinomial probabilities via geometric programming. Biometrika 70(1):257–261
McIntyre GA (2005) A method for unbiased selective sampling, using ranked sets. Am Stat 59:230–232
Moreh J (1971) Human capital and economic growth: United Kingdom, 1951–1961. Econ Soc Rev 3(1):73–93
Nesterov Y, Nemirovsky A (1994) Interior-point polynomial methods in convex programming. In: Studies in applied mathematics, vol 13. SIAM, Philadelphia
Orazem PF, Vodopivec M (1995) Winners and losers in transition: returns to education, experience, and gender in slovenia. World Bank Econ Rev 9(2):201–230
Pelz W, Good IJ (1986) Estimating probabilities from contingency tables when the marginal probabilities are known, by using additive objective functions. The Statistician 35(1):45–50
Robertson T, Wright FT, Dykstra RL (1988) Order-restricted inferences. Wiley, New York
Stephan FF (1942) An iterative method of adjusting sample frequency tables when expected marginal totals are known. Ann Math Stat 13:166–178
Stokes SL, Wang X, Chen M (2007) Judgment post-stratification with multiple rankers. J Stat Theory Appl 6:344–359
Thompson JH (1981) Convergence properties of the iterative 1980 census estimator. In: Proceedings of the survey research methods section. The American Statistical AssociationWashington, DC, pp 182–185
Wang X, Stokes L, Lim J, Chen M (2006) Concomitants of multivariate order statistics with application to judgment poststratification. J Am Stat Assoc 101(476):1693–1704
Wang X, Wang K, Lim J (2012) Isotonized CDF estimation from judgment post-stratification data with empty strata. Biometrics 68:194–202
Acknowledgments
Johan Lim’s research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2011-0029104).
Author information
Authors and Affiliations
Corresponding author
Appendices
Appendix A: Proof of Theorem 1
Theorem 1 deals with three situations in three-way contingency tables, as described in Sect. 2.1.
Proof for situation (i):
The relaxed problem is a GP because (i) the objective is a monomial function that is a special case of posynomial functions; and (ii) all “\(\le \)” constraints are posynomial functions of \(p_{ijk}\). Let \(\{\bar{p}_{ijk}\}\) denote one optimal solution to the GP.
-
1.
If \(\{\bar{p}_{ijk}\}\) satisfies all the equality constraints in (2.2) , then \(\{\bar{p}_{ijk}\}\) provides an optimal solution to (2.2) , including both empty and nonempty cells. The proof is done in this case.
-
2.
Suppose there exists at least one constraint with “\(<\)” held at \(\{\bar{p}_{ijk}\}\) so that \(\{\bar{p}_{ijk}\}\) cannot provide a solution to the original optimization problem in (2.2). Without loss of generality, assume \(\sum _{j,k}\bar{p}_{i^{1}jk}<p_{i^{1}++}\). Then \(\sum _{i}\sum _{j,k}\bar{p}_{ijk}<\sum _{i}p_{i++}=1\), which gives \(\sum _{j}\sum _{i,k}\bar{p}_{ijk}<\sum _{j}p_{+j+}\) and \(\sum _{k}\sum _{i,j}\bar{p}_{ijk}<\sum _{k}p_{++k}\). So there exist \(j^{1}\) and \(k^{1}\) s.t. \(\sum _{i,k}\bar{p}_{ij^{1}k}<p_{+j^{1}+}\) and \(\sum _{i,j}\bar{p}_{ijk^{1}}<p_{++k^{1}}\). Then the cell \((i^{1},j^{1},k^{1})\) must be empty, i.e. \(n_{i^{1}j^{1}k^{1}}=0\). Otherwise, we can construct \(\{\tilde{p}_{ijk}\}\) by increasing \(\bar{p}_{i^{1}j^{1}k^{1}}\) by a very small amount while keeping all the other \(\bar{p}_{ijk}\) unchanged, so that the related three “\(\le \)” constraints still hold. Doing so would increase the value of \(\mathcal {L}\) if the cell \((i^{1},j^{1},k^{1})\) is nonempty, which contradicts that \(\{\bar{p}_{ijk}\}\) is an optimal solution to the GP.
-
3.
Since the cell \((i^{1},j^{1},k^{1})\) is empty, we can increase \(\bar{p}_{i^{1}j^{1}k^{1}}\), while keeping the probabilities of all the other cells unchanged (without increasing the value of \(\mathcal {L}\)), until at least one of the three equality constraints \(\sum _{j,k}p_{i^{1}jk}=p_{i^{1}++}\), \(\sum _{i,k}p_{ij^{1}k}=p_{+j^{1}+}\) and \(\sum _{i,j}p_{ijk^{1}}=p_{++k^{1}}\) holds (i.e., the one with the least slack). Denote the increased value by \(\bar{p}_{i^{1}j^{1}k^{1}}^{*}\). Now update \(\{\bar{p}_{ijk}\}\) by letting \(\bar{p}_{i^{1}j^{1}k^{1}}=\bar{p}_{i^{1}j^{1}k^{1}}^{*}\). Note that the updated \(\{\bar{p}_{ijk}\}\) not only provides an alternative optimal solution to the GP, but also changes at least one inequality constraint from the strict “\(<\)” sign to the “\(=\)” sign.
-
4.
Repeat steps 1 through 3 until all the “\(<\)” constraints are changed to the corresponding “\(=\)” constraints.
After the above steps, the adjusted \(\{\bar{p}_{ijk}\}\) provides an optimal solution to both the GP and the original optimization problem in (2.2). Note that all the adjustments are done to empty cells. Thus, the GP provides an optimal solution to all non-empty cells in (2.2).
Proof for situation (ii):
This proof is omitted for brevity because it is similar to the proof in situation (i) with very slight modifications.
Proof for situation (iii):
Let \(\{\bar{p}_{ijk}\}\) denote one optimal solution to the GP.
-
1.
If \(\{\bar{p}_{ijk}\}\) satisfies all the equality constraints in (2.4) , then \(\{\bar{p}_{ijk}\}\) provides an optimal solution to (2.4), including both empty and nonempty cells. This concludes the proof in this case.
-
2.
Suppose there exists at least one constraint in which “\(<\)” holds at \(\{\bar{p}_{ijk}\}\) so that \(\{\bar{p}_{ijk}\}\) cannot provide a solution to the original optimization problem in (2.4). Without loss of generality, assume \(\sum _{k}\bar{p}_{i^{*}j^{*}k}<p_{i^{*}j^{*}+}\). Then \(\sum _{j}\sum _{k}\bar{p}_{i^{*}jk}<\sum _{j}p_{i^{*}j+}=\sum _{k}p_{i^{*}+k}\), which gives \(\sum _{k}\sum _{j}\bar{p}_{i^{*}jk} <\sum _{k}p_{i^{*}+k}\). So there exists \(k^{*}\) s.t. \(\sum _{j}\bar{p}_{i^{*}jk^{*}}<p_{i^{*}+k^{*}}\). Then the cell \((i^{*},j^{*},k^{*})\) must be empty, i.e. \(n_{i^{*}j^{*}k^{*}}=0\). Otherwise, we can construct \(\{\tilde{p}_{ijk}\}\) by increasing \(\bar{p}_{i^{*}j^{*}k^{*}}\) by a very small amount while keeping all the other \(\bar{p}_{ijk}\) unchanged, so that the related three “\(\le \)” constraints still hold. Doing so would increase the value of \(\mathcal {L}\) if the cell \((i^{*},j^{*},k^{*})\) is nonempty, which contradicts that \(\{\bar{p}_{ijk}\}\) is an optimal solution to the GP.
-
3.
Since the cell \((i^{*},j^{*},k^{*})\) is empty, we can increase \(\bar{p}_{i^{*}j^{*}k^{*}}\), while keeping the probabilities of all the other cells unchanged (without increasing the value of \(\mathcal {L}\)), until at least one of the three equality constraints \(\sum _{j,k}p_{i^{*}jk}=p_{i^{*}++}\), \(\sum _{i,k}p_{ij^{*}k}=p_{+j^{*}+}\) and \(\sum _{i,j}p_{ijk^{*}}=p_{++k^{*}}\) holds (i.e., the one with the least slack). Denote the increased value as \(\bar{p}_{i^{*}j^{*}k^{*}}^{*}\). Now update \(\{\bar{p}_{ijk}\}\) by letting \(\bar{p}_{i^{*}j^{*}k^{*}}=\bar{p}_{i^{*}j^{*}k^{*}}^{*}\). Note that the updated \(\{\bar{p}_{ijk}\}\) not only provides an alternative optimal solution to the GP, but also changes at least one inequality constraint from the strict “\(<\)” sign to the “\(=\)” sign.
-
4.
Repeat steps 1 through 3 until all the “\(<\)” constraints are changed to the corresponding “\(=\)” constraints.
After the above steps, the adjusted \(\{\bar{p}_{ijk}\}\) provides an optimal solution to both the GP and the original optimization problem in (2.4). Note that all the adjustments are done to empty cells. Thus, the GP provides an optimal solution to all non-empty cells in (2.4).
Appendix B: Proof of Theorem 4
We need to show that the GP (3.5) achieves the optimal value only when (3.4) is satisfied. Let \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\) denote the optimal solution to the GP. Suppose there exists at least one of the terms among \(\sum _{jk}p_{jk|i}\) and \(\sum _{i}p_{i++}\) with “\(<\)1” held at \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\).
If for some \(i\), say \(i^{*},\, \sum _{jk}\bar{p}_{jk|i^{*}}<1\), then we can obtain \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) by setting \(\tilde{p}_{j^{*}k^{*}|i^{*}}=\bar{p}_{j^{*}k^{*}|i^{*}}+1-\sum _{jk}\bar{p}_{jk|i^{*}}\), \(\tilde{p}_{jk|i}=\bar{p}_{jk|i}\) for \(i\ne i^{*}\) or \(j\ne j^{*}\) or \(k\ne k^{*}\), and \(\tilde{p}_{i++}=\bar{p}_{i++}\) for all \(i\), where \((j^{*},k^{*})\) satisfies \(n_{i^{*}j^{*}k^{*}}>0\). Note that \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) satisfies all the inequality constraints in (3.5) with \(\sum _{jk}\tilde{p}_{jk|i^{*}}=1\). Since \(\tilde{p}_{j^{*}k^{*}|i^{*}}>\bar{p}_{j^{*}k^{*}|i^{*}}\), the objective function is smaller at \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\), indicating \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\) is not optimal. Hence, \(\sum _{jk}\bar{p}_{jk|i}=1\) must hold for all \(i\).
If \(\sum _{i}\bar{p}_{i++}<1\), then we can obtain \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) by setting \(\tilde{p}_{r++}=\bar{p}_{r++}+1-\sum _{i}\bar{p}_{i++}\), \(\tilde{p}_{i++}=\bar{p}_{i++}\) for \(i\ne r\), and \(\tilde{p}_{jk|i}=\bar{p}_{jk|i}\) for all \(i,j,k\). Note that
and \(\sum _{i}\tilde{p}_{i++}=1\) so that \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) satisfies all the inequality constraints in (3.6). Since \(\tilde{p}_{r++}>\bar{p}_{r++}\), the objective function is smaller at \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\), indicating \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\) is not optimal. Hence, \(\sum _{i}\bar{p}_{i++}=1\) must hold.
Appendix C: Proof of Theorem 5
Again, we need to show that the GP (3.6) achieves the optimal value only when (3.4) is satisfied. Since \(\mathcal {C}\subset \mathcal {J}\times \mathcal {K}\), the complement \(\bar{\mathcal {C}}\) is nonempty. Let \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\) denote the optimal solution to the GP. Suppose there exists at least one of the terms among \(\sum _{jk}p_{jk|i}\) and \(\sum _{i}p_{i++}\) with “\(<\)1” held at \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\).
If for some \(i\), say \(i^{*},\, \sum _{jk}\bar{p}_{jk|i^{*}}<1\), then we can obtain \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) by setting \(\tilde{p}_{j^{*}k^{*}|i^{*}}=\bar{p}_{j^{*}k^{*}|i^{*}}+1-\sum _{jk}\bar{p}_{jk|i^{*}}\), \(\tilde{p}_{jk|i}=\bar{p}_{jk|i}\) for \(i\ne i^{*}\) or \(j\ne j^{*}\) or \(k\ne k^{*}\), where \((j^{*},k^{*})\in \bar{\mathcal {C}}\) and \(n_{i^{*}j^{*}k^{*}}>0\), and \(\tilde{p}_{i++}=\bar{p}_{i++}\) for all \(i\). Note that \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) satisfies all the inequality constraints in (3.6) with \(\sum _{jk}\tilde{p}_{jk|i^{*}}=1\). Since \(\tilde{p}_{j^{*}k^{*}|i^{*}}>\bar{p}_{j^{*}k^{*}|i^{*}}\), the objective function is smaller at \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\), indicating \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\) is not optimal. Hence, \(\sum _{jk}\bar{p}_{jk|i}=1\) must hold for all \(i\).
If \(\sum _{i}\bar{p}_{i++}<1\), then a similar argument can be made as in the proof of Theorem 4 to show that \(\sum _{i}\bar{p}_{i++}=1\) must hold; that is, we can obtain \(\{\tilde{p}_{jk|i},\tilde{p}_{i++}\}\) by setting \(\tilde{p}_{1++}=\bar{p}_{1++}+1-\sum _{i}\bar{p}_{i++}\), \(\tilde{p}_{i++}=\bar{p}_{i++}\) for \(i\ne 1\), and \(\tilde{p}_{jk|i}=\bar{p}_{jk|i}\) for all \(i,j,k\), which gives a smaller value of the objective function than \(\{\bar{p}_{jk|i},\bar{p}_{i++}\}\).
Rights and permissions
About this article
Cite this article
Wang, X., Lim, J., Kim, SJ. et al. Estimating cell probabilities in contingency tables with constraints on marginals/conditionals by geometric programming with applications. Comput Stat 30, 107–129 (2015). https://doi.org/10.1007/s00180-014-0525-y
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00180-014-0525-y