performance: add circuit breakers in the ExclusiveCartesianSolver #157
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
this simple PR adds two 'circuit breakers' within the
ExclusiveCartesianSolver
to mitigate the effects of long-running complex queries.some (particularly long) queries can generate a large number of potential solutions, the main culprit is the
ExclusiveCartesianSolver
, which, as the name implies, produces a cross product of all non-overlapping classifications.the problem is that in some degenerate cases (such as the example below) the sheer number of potential solutions can stall the CPU and therefore cascade performance issues on to any codebase which includes this module.
an obvious solution here is to add some 'circuit breakers' which detect if some of these arrays of values grow beyond a set threshold and prevents them from growing any larger.
the good news is that we can actually set these constants fairly high and still achieve our goals, the degenerate 'slow queries' are usually an order of magnitude slower, so simply providing any ceiling seems to be sufficient to prevent long running queries.
note: there is naturally a tradeoff here, we are gaining performance guarantees at the expense of discarding potentially superior solutions.
The way I've tuned this, all of the existing test cases pass (including the longer/complex examples).
So in a real-world scenario I suspect we will see no measurable difference in quality but with a significant reduction in worst-case performance, these constants may be adjusted/increased in the future if a need arises.
node bin/cli.js 'c mk v ba 1994 2000 323 c v ba 1994 2000 323 f iv bg 1987 1994 323 f mk iv bg 1987 1994 323 f mk v ba 1994 1998 323 f v ba 1994 1998 323 f vi bj 1998 2004 323 f/p mk vi bj 1998 2004 323 iii bf 1985 1991 323 1998 323 iii седан bf 1985 1991 323 iv bg 1989 1994 323 iv, седан'
# master ================================================================ SOLUTIONS (13962ms) ----------------------------------------------------------------