Affinity Propagation and Uncapacitated Facility Location Problems

Michael J. Brusco¹ &
Douglas Steinley²

206 Accesses
5 Citations
Explore all metrics

Abstract

One of the most important distinctions that must be made in clustering research is the difference between models (or problems) and the methods for solving those problems. Nowhere is this more evident than with the evaluation of the popular affinity propagation algorithm (apcluster.m), which is a MATLAB implementation of a neural clustering method that has received significant attention in the biological sciences and other disciplines. Several authors have undertaken comparisons of apcluster.m with methods designed for models that fall within the class of uncapacitated facility location problems (UFLPs). These comparative models include the p-center (or K-center) model and, more importantly, the p-median (or K-median) model. The results across studies are conflicting and clouded by the fact that, although similar, the optimization model underlying apcluster.m is slightly different from the p-median model and appreciably different from the pcenter model. In this paper, we clarify that apcluster.m is actually a heuristic for a ‘maximization version’ of another model in the class of UFLPs, which is known as the simple plant location problem (SPLP). An exact method for the SPLP is described, and the apcluster.m program is compared to a fast heuristic procedure (sasplp.m) in both a simulation experiment and across numerous datasets from the literature. Although the exact method is the preferred approach when computationally feasible, both apcluster.m and sasplp.m are efficient and effective heuristic approaches, with the latter slightly outperforming the former in most instances.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Institutional subscriptions

References

AGMON, S. (1954), “The Relaxation Method for Linear Inequalities”, Canadian Journal of Mathematics, 6, 382–392.
ALBA, E., and DOMINGUEZ, E. (2006), “Comparative Analysis of Modern Optimization Tools for the p-median Problem”, Statistics and Computing, 16, 251–260.
Article MathSciNet Google Scholar
ALP, O., ERKUT, E., and DREZNER, Z. (2003), “An Efficient Genetic Algorithm for the p-Median Problem”, Annals of Operations Research, 122, 21–42.
Article MATH MathSciNet Google Scholar
ANDERSON, E. (1935), “The Irises of the Gaspé Peninsula”, Bulletin of the American Iris Society, 59, 2–5.
APELTSIN, L., MORRIS, J.H., BABBITT, P.C., and FERRIN, T.E. (2011), “Improving the Quality of Protein Similarity Network Clustering Algorithms Using the Network Edge Weight Distribution”, Bioinformatics, 27, 326–333.
Article Google Scholar
AVELLA, P., SASSANO, A., and VASIL’EV, I. (2007), “Computational Study of Large-Scale p-Median Problems”, Mathematical Programming A, 109, 89–114.
BALINSKI, M.L. (1965), “Integer Programming: Methods, Uses, Computation”, Management Science, 12, 253–313.
Article MATH MathSciNet Google Scholar
BELTRAN, C., TADONKI, C., and VIAL, J. (2006), “Solving the p-Median Problem with a Semi-Lagrangian Relaxation”, Computational Optimization and Applications, 35(2), 239–260.
Article MATH MathSciNet Google Scholar
BERROU, C., GLAVIEUX, A., and THITIMAJSHIMA, P. (1993), “Near Shannon Limit Error-Correcting Coding and Decoding: Turbo-Codes (1)”, in Proceedings of the IEEE International Conference on Communications, ICC 93, Vol. 2, pp. 1064–1070.
BILDE, O., and KRARUP, J. (1977), “Sharp Lower Bounds and Efficient Algorithms for the Simple Plant Location Problem”, Annals of Discrete Mathematics, 1, 79–88.
Article MathSciNet Google Scholar
BLANCHARD, S.J., ALOISE, D., and DESARBO, W.S. (2012), “The Heterogeneous pmedian Problem for Categorization Based Clustering”, Psychometrika, 77, 741–762.
Article MATH MathSciNet Google Scholar
BRUSCO, M.J., and CRADIT, J.D. (2001), “A Variable Selection Heuristic for K-means Clustering”, Psychometrika, 66, 249–270.
Article MATH MathSciNet Google Scholar
BRUSCO, M.J., and KÖHN, H.-F. (2008a), “Comment on ‘Clustering by Passing Messages Between Data Points’”, Science, 319, 726c.
BRUSCO, M.J., and KÖHN, H.-F. (2008b), “Optimal Partitioning of a Data Set Based on the p-median Model”, Psychometrika, 73, 89–105.
Article MATH MathSciNet Google Scholar
BRUSCO, M.J., and KÖHN, H.-F. (2009), “Exemplar-Based Clustering via Simulated Annealing”, Psychometrika, 74, 457–475.
Article MATH MathSciNet Google Scholar
BRUSCO, M.J., and STEINLEY, D. (2007), “A Comparison of Heuristic Procedures for Minimum Within-Cluster Sums of Squares Partitionin”, Psychometrika, 72, 583–600.
Article MATH MathSciNet Google Scholar
CHANG, J.T. (2012), “Deriving Transcriptional Programs and Functional Processes from Gene Expression Databases”, Bioinformatics, 28, 1122–1129.
Article Google Scholar
CHEN, L., CHAN, T.-H., CHOYKE, P.L., HILLMAN, E.M.C., CHI, C.-Y., BHUJWALLA, Z.M., WANG, G., WANG, S.S., SZABO, Z., and WANG, Y. (2011), “CAM-CM: A Signal Deconvolution Tool for in vivo Dynamic Contrast-Enhanced Imaging of Complex Tissues”, Bioinformatics, 27, 2607–2609.
Google Scholar
CHIYOSHI, F., and GALVÃO, R.D. (2000), “A Statistical Analysis of Simulated Annealing Applied to the p-Median Problem”, Annals of Operations Research, 96, 61–74.
Article MATH Google Scholar
CHRISTOFIDES, N., and BEASLEY, J. (1982), “A Tree Search Algorithm for the pmedian Problem”, European Journal of Operational Research, 10, 196–204.
Article MATH MathSciNet Google Scholar
CORNUEJOLS, G., FISHER, M.L., and NEMHAUSER, G.L. (1977), “Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms”, Management Science, 23, 789–810.
Article MATH MathSciNet Google Scholar
DUECK, D. (2009), “Affinity Propagation: Clustering Data by Passing Messages”, Unpublished Doctoral Dissertation, Graduate Department of Electrical and Computer Engineering, University of Toronto.
EFROYMSON, M.A., and RAY, T.L. (1966), “A Branch-and-Bound Algorithm for Plant Location”, Operations Research, 14, 361–375.
Article Google Scholar
EL-SHAIEB, A.M. (1973), “A New Algorithm for Locating Sources Among Destinations”, Management Science, 20, 221–231.
ERLENKOTTER, D. (1978), “A Dual Procedure for Uncapacitated Facility Location”, Operations Research, 26, 992–1009.
FISHER, R.A. (1936), “The Use of Multiple Measurements in Taxonomic Problems”, Annals of Eugenics, 7, 179–188.
Article Google Scholar
FREY, B., and DUECK, D. (2007), “Clustering by Passing Messages Between Data Points”, Science, 315, 972–976.
Article MATH MathSciNet Google Scholar
FREY, B., and DUECK, D. (2008), “Response to Comment on ‘Clustering by Passing Messages Between Data Points’”, Science, 319, 726d.
GALVÃO, R.D. (1980), “A Dual-Bounded Algorithm for the p-median Problem”, Operations Research, 28, 1112–1121.
Article MATH MathSciNet Google Scholar
GALVÃO, R.D. (2004), “Uncapacitated Facility Location Problems: Contributions”, Pesquisa Operacional, 24, 7–38.
Article Google Scholar
GALVÃO, R.D., and RAGGI, L.A. (1989), “A Method for Solving to Optimality Uncapacitated Location Problems”, Annals of Operations Research, 18, 225–244.
Article MATH MathSciNet Google Scholar
GRÖTSCHEL, M., and HOLLAND, O. (1991), “Solution of Large-Scale Symmetric Traveling Salesman Problems”, Mathematical Programming, 51, 141–202.
Article MATH MathSciNet Google Scholar
HAIR, J.F., ANDERSON, R.E., TATHAM, R.L., and BLACK, W.C. (1998), Multivariate Data Analysis (5th ed.), Upper Saddle River, NJ: Prentice Hall.
HAKIMI, S.L. (1964), “Optimum Locations of Switching Centers and the Absolute Centers and Medians of a Graph”, Operations Research, 12, 450–459.
Article MATH Google Scholar
HAKIMI, S.L. (1965), “Optimum Distribution of Switching Centers in a Communication Network and Some Related Graph Theory Problems”, Operations Research, 123, 462–475.
Article MathSciNet Google Scholar
HANJOUL, P., and PEETERS, D. (1985), “A Comparison of Two Dual-Based Procedures for Solving the p-Median Problem”, European Journal of Operational Research, 20, 387–396.
Article MATH MathSciNet Google Scholar
HANSEN, P., and JAUMARD, B. (1997), “Cluster Analysis and Mathematical Programming”, Mathematical Programming, 79, 191–215.
MATH MathSciNet Google Scholar
HANSEN, P., and MLADENOVIĆ, N. (1997), “Variable Neighborhood Search for the p-Median”, Location Science, 5, 207–226.
Article MATH Google Scholar
HANSEN, P., and MLADENOVIĆ, N. (2008), “Complement to a Comparative Analysis of Heuristics for the p-Median Problem”, Statistics and Computing, 18, 41–46.
Article MathSciNet Google Scholar
HANSEN, P., MLADENOVIĆ, N., and PEREZ-BRITO, D. (2001), “Variable Neighborhood Decomposition Search”, Journal of Heuristics, 7, 335–350.
Article MATH Google Scholar
HELD, M., and KARP, R.M. (1970), “The Traveling Salesman Problem and Minimum Spanning Trees”, Operations Research, 18, 1138–1162.
Article MATH MathSciNet Google Scholar
HELD, M., WOLFE, P., and CROWDER, H.P. (1974), “Validation of Subgradient Optimization”, Mathematical Programming, 6, 62–88.
Article MATH MathSciNet Google Scholar
HEINZ, G., PETERSON, L.J., JOHNSON, R.W., and KERK, C.J. (2003), “Exploring Relationships in Body Dimensions”, Journal of Statistical Education, 11, www.amstat.org/publications/jse/v11n2/datasets.heinz.html.
HUBERT, L., and ARABIE, P. (1985), “Comparing Partitions”, Journal of Classification, 2, 193–218.
Article Google Scholar
JARVINEN, P., RAJALA, J., and SINERVO, H. (1972), “A Branch-and-Bound Algorithm for Seeking the p-median”, Operations Research, 20, 173–178.
Article Google Scholar
KARALETSOS, T., STEGLE, O., DREYER, D., WINN, J., and BORGWARDT, K.M. (2012), “ShapePheno: Unsupervised Extraction of Shape Phenotypes from biological Image Collections”, Bioinformatics, 28, 1001–1008.
Article Google Scholar
KAUFMAN, L., and ROUSSEEUW, P.J. (2005), Finding Groups in Data: An Introduction to Cluster Analysis (2nd ed.), New York: Wiley.
KIDDLE, S.J., WINDRAM, O.P.F., MCHATTIE, S., MEAD, A., BEYNON, J., BUCHANAN-WOLLASTON, V., DENBY, K.J., and MUKHERJEE, S. (2010), “Temporal Clustering by Affinity Propagation Reveals Transcriptional Modules in Arabidopsis Thaliana”, Bioinformatics, 26, 355–362.
Article Google Scholar
KLASTORIN, T. (1985), “The p-median Problem for Cluster Analysis: A Comparative Test Using the Mixture Model Approach”, Management Science, 31, 84–95.
KÖHN, H.-F., STEINLEY, D., and BRUSCO, M.J. (2010), “The p-median Model as a Tool for Clustering Psychological Data”, Psychological Methods, 15, 87–95.
Article Google Scholar
KUEHN, A.A., and HAMBURGER, M.J. (1963), “A Heuristic Program for Locating Warehouses”, Management Science, 9, 643–666.
Article Google Scholar
LEVANOVA, T., and LORESH, M.A. (2004), “Algorithms of Ant System and Simulated Annealing for the p-median Problem”, Automation and Remote Control, 65, 431–438.
Article MATH Google Scholar
LIN, S., and KERNIGHAN, B.W. (1973), “An Effective Heuristic Algorithm for the Traveling Salesman Problem”, Operations Research, 21, 498–516.
Article MATH MathSciNet Google Scholar
MARANZANA, F.E. (1964), “On the Location of Supply Points to Minimize Transportation Costs”, Operations Research, 12, 138–139.
Google Scholar
MATHWORKS, INC. (2006), Using MATLAB (Version 7), Natick MA: The MathWorks, Inc.
MÉZARD, M., PARISI, G., and ZECCHINA, R. (2002), “Analytic and Algorithmic Solution of Random Satisfiability Problems”, Science, 297, 812–815.
Article Google Scholar
MILLIGAN, G.W. (1980), “An Examination of the Effects of Six Types of Error Perturbation on Fifteen Clustering Algorithms”, Psychometrika, 45, 325–342.
Article Google Scholar
MILLIGAN, G.W. (1996), “Clustering Validation: Results and Implications for Applied Analyses.” in Clustering and Classification, eds. P. Arabie, L.J. Hubert, and G. De Soete, River Edge NJ: World Scientific Publishing, pp. 321–375.
MILLIGAN, G.W., and COOPER, M.C. (1985), “An Examination of Procedures for Determining the Number of Clusters in a Data Set”, Psychometrika, 50, 159–179.
Article Google Scholar
MILLIGAN, G.W., and COOPER, M.C. (1988), “A Study of the Standardization of Variables in Cluster Analysis”, Journal of Classification, 5, 181–204.
Article MathSciNet Google Scholar
MLADENOVIĈ, N., BRIMBERG, J., HANSEN, P., and MORENO-PÉREZ, J.A. (2007), “The p-Median Problem: A Survey of Metaheuristic Approaches”, European Journal of Operational Research, 179, 927–939.
Article MATH MathSciNet Google Scholar
MOTZKIN, T., and SCHOENBERG, I.J. (1954), “The Relaxation Method for Linear Inequalities”, Canadian Journal of Mathematics, 6, 393–404.
Article MATH MathSciNet Google Scholar
MUKHERJEE, S., and HILL, S.M. (2011), “Network Clustering: Probing Biological Heterogeneity by Sparse Graphical Models”, Bioinformatics, 27, 994–1000.
Article Google Scholar
MULVEY, J.M., and CROWDER, H.P. (1979), “Cluster Analysis: An Application of Lagrangian Relaxation”, Management Science, 25, 329–340.
Article MATH Google Scholar
NARULA, S.C., OGBU, U.I., and SAMUELSON, H.M. (1977), “An Algorithm for the pmedian Problem”, Operations Research, 25, 709–713.
Article MATH MathSciNet Google Scholar
RAO, M. R. (1971), “Cluster Analysis and Mathematical Programming”, Journal of the American Statistical Association, 66, 622–626.
REINELT, G. (2001), TSPLIB, http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95.
RESENDE, M.G.C., and WERNECK, R.F. (2004), “A Hybrid Heuristic for the p-median Problem”, Journal of Heuristics, 10, 59–88.
Article MATH Google Scholar
REVELLE, C.S., and SWAIN, R. (1970), “Central Facilities Location”, Geographical Analysis, 2, 30–42.
Article Google Scholar
ROLLAND, E., SCHILLING, D.A. and CURRENT, J.R. (1996), “An Efficient Tabu Search Procedure for the p-Median Problem”, European Journal of Operational Research, 96, 329–342.
Article Google Scholar
SPÄTH, H. (1980), Cluster Analysis Algorithms for Data Reduction and Classification of Objects, New York: Wiley.
SPELLMAN, P.T., SHERLOK, G., ZHANG, M.Q., IYER, V.R., ANDERS, K., EISEN, M.B., BROWN, P.O., BOTSTEIN, D., and FUTCHER, B. (1998), “Comprehensive Identification of Cell Cycle-Regulated Genes of the Yeast Saccharomyces Cerevisiae by Microarray Hybridization”, Molecular Biology of the Cell, 9, 3273–3297.
Article Google Scholar
STEINLEY, D. (2003), “Local Optima in K-means Clustering: What You Don’t Know May Hurt You”, Psychological Methods, 8, 294–304.
STEINLEY, D. (2004), “Properties of the Hubert-Arabie Adjusted Rand Index”, Psychological Methods, 9, 386–396.
STEINLEY, D. (2006), “Profiling Local Optima in K-means Clustering: Developing a Diagnostic Technique”, Psychological Methods, 11, 178–192.
STEINLEY, D., and BRUSCO, M.J. (2007), “Initializing K-means Batch Clustering: A Critical Analysis of Several Techniques”, Journal of Classification, 24, 99–121.
Article MATH MathSciNet Google Scholar
STEINLEY, D., and BRUSCO, M.J. (2008a), “A New Variable Weighting and Selection Procedure for K-means Cluster Analysis”, Multivariate Behavioral Research, 43, 77–108.
Article Google Scholar
STEINLEY, D., and BRUSCO, M.J. (2008b), “Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures”, Psychometrika, 73, 125–144.
Article MATH MathSciNet Google Scholar
STEINLEY, D., and BRUSCO, M.J. (2011a), “Choosing the Number of Clusters in Kmeans Clustering”, Psychological Methods, 16, 285–297.
Article Google Scholar
STEINLEY, D., and BRUSCO, M.J. (2011b), “Evaluating Mixture-Modeling for Clustering: Recommendations and Cautions”, Psychological Methods, 16, 63–79.
Article Google Scholar
STEINLEY, D., and HENSON, R. (2005), “An Analytic Method for Generating Clusters with Known Overlap”, Journal of Classification, 22, 221–250.
Article MathSciNet Google Scholar
TANG, D., ZHU, Q., and YANG, F. (2010), “A Poisson-Based Adaptive Affinity Propagation Clustering for SAGE Data”, Computational Biology and Chemistry, 34, 63–70.
Article MathSciNet Google Scholar
TEITZ, M.B. and BART, P. (1968), “Heuristic Methods for Estimating the Generalized Vertex Median of a Weighted Graph”, Operations Research, 16, 955–961.
Article MATH Google Scholar
THIZY, J.-M., VAN WASSENHOVE, L., and KHUMAWALA, B. (1985), “Comparison of Exact and Approximate Methods of Solving the Uncapacitated Plant Location Problem”, Journal of Operations Management, 6, 23–34.
Article Google Scholar
VINOD, H. (1969), “Integer Programming and the Theory of Grouping”, Journal of the American Statistical Association, 64, 506–517.
VLASBLOM, J. and WODAK, S.J. (2009), “Markov Clustering versus Affinity Propagation for the Partitioning of Protein Interaction Graphs”, BMC Bioinformatics, 10, 99.
WHITAKER, R. (1983), “A Fast Algorithm for the Greedy Interchange of Large-Scale Clustering and Median Location Problems”, INFOR, 21, 95–108.
WOŹNIAK, M., TIURYN, J., and DUTKOWSKI, J. (2010), “MODEVO: Exploring Modularity and Evolution of Protein Interaction Networks”, Bioinformatics, 26, 1790–1791.
Article Google Scholar
ZHANG, J., LI, D., CHEN, H. and FANG, F. (2011), “Analysis of Activity in fMRI Data Using Affinity Propagation Clustering”, Computer Methods in Biomechanics and Biomedical Engineering, 14, 271–281.

Download references

Author information

Authors and Affiliations

College of Business, Florida State University, Tallahassee, Florida, USA
Michael J. Brusco
Department of Psychological Sciences, University of Missouri-Columbia, 210 McAlester Hall, Columbia, MO, 65203, USA
Douglas Steinley

Authors

Michael J. Brusco
View author publications
You can also search for this author in PubMed Google Scholar
Douglas Steinley
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Michael J. Brusco.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Brusco, M.J., Steinley, D. Affinity Propagation and Uncapacitated Facility Location Problems. J Classif 32, 443–480 (2015). https://doi.org/10.1007/s00357-015-9187-x

Download citation

Published: 07 October 2015
Issue Date: October 2015
DOI: https://doi.org/10.1007/s00357-015-9187-x

Affinity Propagation and Uncapacitated Facility Location Problems

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Discrete Facility Location in Machine Learning

Analysis of Clustering Algorithms for Facility Location Allocation Problems

k-Means, Ward and Probabilistic Distance-Based Clustering Methods with Contiguity Constraint

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Affinity Propagation and Uncapacitated Facility Location Problems

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Discrete Facility Location in Machine Learning

Analysis of Clustering Algorithms for Facility Location Allocation Problems

k-Means, Ward and Probabilistic Distance-Based Clustering Methods with Contiguity Constraint

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation