[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Solving large problems with heuristic search: general-purpose parallel external-memory search

Published: 01 May 2018 Publication History

Abstract

Classic best-first heuristic search algorithms, like A*, record every unique state they encounter in RAM, making them infeasible for solving large problems. In this paper, we demonstrate how best-first search can be scaled to solve much larger problems by exploiting disk storage and parallel processing and, in some cases, slightly relaxing the strict best-first node expansion order. Some previous disk-based search algorithms abandon best-first search order in an attempt to increase efficiency. We present two case studies showing that A*, when augmented with Delayed Duplicate Detection, can actually be more efficient than these non-best-first search orders. First, we present a straightforward external variant of A*, called PEDAL, that slightly relaxes best-first order in order to be I/O efficient in both theory and practice, even on problems featuring real-valued node costs. Because it is easy to parallelize, PEDAL can be faster than in-memory IDA* even on domains with few duplicate states, such as the sliding-tile puzzle. Second, we present a variant of PEDAL, called PE2A*, that uses partial expansion to handle problems that have large branching factors. When tested on the problem of Multiple Sequence Alignment, PE2A* is the first algorithm capable of solving the entire Reference Set 1 of the standard BAliBASE benchmark using a biologically accurate cost function. This work shows that classic best-first algorithms like A* can be applied to large real-world problems. We also provide a detailed implementation guide with source code both for generic parallel disk-based best-first search and for Multiple Sequence Alignment with a biologically accurate cost function. Given its effectiveness as a general-purpose problem-solving method, we hope that this makes parallel and disk-based search accessible to a wider audience.

References

[1]
Altschul, S. F. (1989). Gap costs for multiple sequence alignment. Journal of Theoretical Biology, 138 (3), 297-309.
[2]
Bonet, B. (2008). Efficient algorithms to rank and unrank permutations in lexicographic order. In AAAI-Workshop on Search in AI and Robotics.
[3]
Burns, E., Hatem, M., Leighton, M. J., & Ruml, W. (2012). Implementing fast heuristic search code. In Proceedings of the Symposium on Combinatorial Search (SoCS-12).
[4]
Burns, E., Lemons, S., Ruml, W., & Zhou, R. (2010). Best-first heuristic search for multicore machines. Journal of Artificial Intelligence Research, 39, 689-743.
[5]
Burns, E., & Ruml, W. (2013). Iterative-deepening search with on-line tree size prediction. Annals of Mathematics and Artificial Intelligence, S68, 1-23.
[6]
Carrillo, H., & Lipman, D. (1988). The multiple sequence alignment problem in biology. SIAM J. on Applied Mathematics, 48 (5), 1073-1082.
[7]
Dayhoff, M. O., Schwartz, R. M., & Orcutt, B. C. (1978). A model of evolutionary change in proteins. Atlas of protein sequence and structure, 5 (suppl 3), 345-351.
[8]
Dean, J., & Ghemawat, S. (2008). Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51 (1), 107-113.
[9]
Edelkamp, S., Jabbar, S., & Schrdl, S. (2004). External A*. In Advances in Artificial Intelligence, Vol. 3238, pp. 226-240. Springer Berlin / Heidelberg.
[10]
Edelkamp, S., & Kissmann, P. (2007). Externalizing the multiple sequence alignment problem with affine gap costs. In KI 2007: Advances in Artificial Intelligence, pp. 444-447. Springer Berlin / Heidelberg.
[11]
Felner, A., Goldenberg, M., Sharon, G., Stern, R., Beja, T., Sturtevant, N. R., Schaeffer, J., & Holte, R. (2012). Partial-Expansion A* with selective node generation. In Proceedings of AAAI-2012.
[12]
Ghallab, M., Nau, D., & Traverso, P. (2004). Automated Planning: Theory and Practice. Morgan Kaufmann Publishers.
[13]
Hart, P. E., Nilsson, N. J., & Raphael, B. (1968). A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions of Systems Science and Cybernetics, SSC-4 (2), 100-107.
[14]
Hatem, M., Burns, E., & Ruml, W. (2011). Heuristic search for large problems with real costs. In Proceedings of AAAI-2011.
[15]
Hatem, M., Burns, E., & Ruml, W. (2013). Faster problem solving in java with heuristic search.
[16]
Hatem, M., Burns, E., & Ruml, W. (2014). Research code for heuristic search. https://github.com/matthatem. Accessed June 28, 2014.
[17]
Hatem, M., Kiesel, S., & Ruml, W. (2015). Recursive best-first search with bounded overhead. In Proceedings of AAAI-2015.
[18]
Hatem, M., & Ruml, W. (2013). External memory best-first search for multiple sequence alignment. In Proceedings of AAAI-2013.
[19]
Ikeda, T., & Imai, H. (1999). Enhanced A* algorithms for multiple alignments: optimal alignments for several sequences and k-opt approximate alignments for large cases. Theoretical Computer Science, 210 (2), 341-374.
[20]
Kobayashi, H., & Imai, H. (1998). Improvement of the A* algorithm for multiple sequence alignment. In Proceedings of the 9th Workshop on Genome Informatics, pp. 120-130.
[21]
Korf, R. (2012). Research challenges in combinatorial search. In Proceedings of AAAI-2012.
[22]
Korf, R. E. (1985). Depth-first iterative-deepening: An optimal admissible tree search. Artificial Intelligence, 27 (1), 97-109.
[23]
Korf, R. E. (1993). Linear-space best-first search. Artificial Intelligence, 62 (1), 41-78.
[24]
Korf, R. E. (1999). Divide-and-conquer bidirectional search: First results. In Proceedings of the Sixteenth International Joint Conference on Articial Intelligence (IJCAI-99), Vol. 99, pp. 1184-1189.
[25]
Korf, R. E. (2003). Delayed duplicate detection. In Proceedings of the Eighteenth International Joint Conference on Articial Intelligence (IJCAI-03), pp. 1539-1541.
[26]
Korf, R. E. (2004). Best-first frontier search with delayed duplicate detection. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), pp. 650- 657. AAAI Press.
[27]
Korf, R. E. (2008a). Linear-time disk-based implicit graph search. Journal of the ACM, 55 (6).
[28]
Korf, R. E. (2008b). Minimizing disk I/O in two-bit breadth-first search. In Proceedings of AAAI-08, pp. 317-324.
[29]
Korf, R. E. (2016). Comparing search algorithms using sorting and hashing on disk and in memory. In Proceedings of the Twenty-Fifth International Joint Conference on Articial Intelligence (IJCAI-16), pp. 610-616.
[30]
Korf, R. E., Zhang, W., Thayer, I., & Hohwald, H. (2005). Frontier search. Journal of the ACM, 52 (5), 715-748.
[31]
Lermen, M., & Reinert, K. (2000). The practical use of the A* algorithm for exact multiple sequence alignment. Journal of Computational Biology, 7 (5), 655-671.
[32]
Needleman, S. B., & Wunsch, C. D. (1970). A general method applicable to the search for similarities in the amino acid sequences of two proteins. Journal of Molecular Biology, 48, 443-453.
[33]
Niewiadomski, R., Amaral, J. N., & Holte, R. C. (2006). Sequential and parallel algorithms for Frontier A* with delayed duplicate detection. In Proceedings of AAAI-2006, pp. 1039-1044. AAAI Press.
[34]
Reinefeld, A., & Schnecke, V. (1994). Aida*-asynchronous parallel ida*. In Proceedings of the Biennial Conference-Canadian Society for Computational Studies of Intelligence, pp. 295-302.
[35]
Reinefeld, A., & Schütt, T. (2010). Out-of-core parallel frontier search with mapreduce. In High Performance Computing Systems and Applications, pp. 323-336. Springer.
[36]
Russell, S. (1992). Efficient memory-bounded search methods.
[37]
Sarkar, U., Chakrabarti, P., Ghose, S., & Sarkar, S. D. (1991). Reducing reexpansions in iterative-deepening search by controlling cutoff bounds. Artificial Intelligence, 50, 207-221.
[38]
Schmidt, T., & Zhou, R. (2011). Succinct set-encoding for state-space search. In Proceedings of the Twenty-Fifth Conference on Artificial Intelligence (AAAI-2011).
[39]
Schroedl, S. (2005). An improved search algorithm for optimal multiple-sequence alignment. Journal of Artificial Intelligence Research, 23, 587-623.
[40]
Thompson, Plewniak, & Poch (1999). BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics, 15.
[41]
Yoshizumi, T., Miura, T., & Ishida, T. (2000). A* with partial expansion for large branching factor problems. In Proceedings of AAAI-2000, pp. 923-929.
[42]
Zhou, R., & Hansen, E. (2009). Dynamic state-space partitioning in external-memory graph search. In The 2009 International Symposium on Combinatorial Search (SOCS-09).
[43]
Zhou, R., & Hansen, E. A. (2003a). Sparse-memory graph search. In IJCAI, pp. 1259-1268.
[44]
Zhou, R., & Hansen, E. A. (2003b). Sweep A*: Space-efficient heuristic search in partially ordered graphs. In Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-03).
[45]
Zhou, R., & Hansen, E. A. (2004). Structured duplicate detection in external-memory graph search. In Proceedings of AAAI-2004.
[46]
Zhou, R., & Hansen, E. A. (2006). Breadth-first heuristic search. Artificial Intelligence, 170 (4-5), 385-408.

Cited By

View all
  • (2019)Iterative budgeted exponential searchProceedings of the 28th International Joint Conference on Artificial Intelligence10.5555/3367032.3367210(1249-1257)Online publication date: 10-Aug-2019
  1. Solving large problems with heuristic search: general-purpose parallel external-memory search

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Journal of Artificial Intelligence Research
    Journal of Artificial Intelligence Research  Volume 62, Issue 1
    May 2018
    871 pages
    ISSN:1076-9757
    Issue’s Table of Contents

    Publisher

    AI Access Foundation

    El Segundo, CA, United States

    Publication History

    Published: 01 May 2018
    Published in JAIR Volume 62, Issue 1

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 31 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2019)Iterative budgeted exponential searchProceedings of the 28th International Joint Conference on Artificial Intelligence10.5555/3367032.3367210(1249-1257)Online publication date: 10-Aug-2019

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media