haproxy-3.1.7 aborted with large map_sub #2949
Comments
Ouch, 2.2M lines??? map_sub() and map_reg() are the most expensive matches: they have no option but to try every single pattern one after the other. Usually 1000 lines already cause catastrophic performance and incur noticeable latency for the whole process, but 2.2M I've never seen yet :-/ It will definitely take a few seconds, during which all traffic stalls, which is precisely the purpose of the watchdog: detecting that the process is no longer making any progress. In 3.2 we've implemented yielding at the rule level, in order to minimize the latency caused by many rules, but we don't have anything to minimize the latency impact of a single rule. Maybe one dirty workaround could be to split your huge map_sub into smaller pieces evaluated in distinct rules, and set

There is very likely another solution. Algorithmically speaking, it makes no sense to look for 2.2M patterns at random places (which is what map_sub does). It looks like a heavier form of what anti-viruses do, and the risk of an unexpected match of one pattern inside another is huge if the patterns are not all the same size. Isn't it possible instead to extract the part that is supposed to be matched against, and compare it to fixed-size values? That way the search for the location would be done once, and then the match against known values would be performed in O(log N), no longer O(N*input_size).
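The complexity argument above can be illustrated with a toy Python sketch (the names and data are made up for illustration; this is not HAProxy code). A map_sub-style lookup must try every pattern as a substring of the input, while extracting the key first reduces the work to a single exact lookup:

```python
# Hypothetical pattern table: 32-char hex "UUIDs" mapped to backend names.
patterns = {f"{i:032x}": f"backend_{i % 4}" for i in range(10_000)}
path = "/vod/" + f"{1234:032x}" + "/bla.m3u8"

def lookup_sub(path, patterns):
    """map_sub-style: scan ALL patterns as substrings -- O(N * len(path))."""
    for uuid, backend in patterns.items():
        if uuid in path:
            return backend
    return None

def lookup_exact(path, patterns):
    """map-style: extract the key once, then a single lookup -- O(1) avg."""
    key = path.split("/")[2]  # analogous to path,lower,field(3,/)
    return patterns.get(key)

assert lookup_sub(path, patterns) == lookup_exact(path, patterns) == "backend_2"
```

Both return the same backend, but the first approach's cost grows linearly with the number of map entries, which is why 2.2M lines stall the process.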
Hi @wtarreau, thanks for the fast reply! Sure, that's what we did. When we changed to an exact match on the map file:
Everything went smoothly. You can put that on my Wall of Shame, since I didn't know at first the cost of map_sub. But hey, with fewer than 1.9M lines it was no biggie, so that's a plus ;-) I just thought a bug report here might shed some light on this. This was one of those cases where the map file started with 20 entries and was updated programmatically, and we only noticed today when it exploded :-) Feel free to mark it as a non-bug.
I would really love to emit a warning when loading maps/ACLs with too many lines, but the problem is that some users would consider that no warning = valid config. And even a single regex can be constructed to take multiple days to evaluate, so length is not everything. But in a case like this, it would at least warn you that the bot feeding the map is going out of control, so maybe that could be useful anyway. It's possible that the doc is not clear enough about the dangers. But yeah, at least the issue may serve to help someone else facing the same problem in the future. I'm glad you could work it out!
Detailed Description of the Problem
On some machines, we're using haproxy to do path-based routing to certain servers in order to serve video traffic. We map certain UUIDs to certain backends in a map file. Using
path,lower,map_sub
with a large enough map file (around 2.2M lines), haproxy ended up saturating all cores in %usr and eventually crashed.
Expected Behavior
Don't crash.
Steps to Reproduce the Behavior
Load a ~2.2M-line map file and process it with map_sub.
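A synthetic map file in HAProxy's "key value" map format can reproduce the load pattern. This is a hypothetical generator (output path and entry count are assumptions; scaled down here, the report used ~2.2M lines):

```python
# Generate a synthetic map file: one "uuid backend" pair per line,
# matching HAProxy's map file format. Scale N up toward 2.2M to
# reproduce the reported stall with map_sub.
N = 100_000
with open("/tmp/uuid_test.map", "w") as f:
    for i in range(N):
        f.write(f"00000000-0000-4000-8000-{i:012d} backend_{i % 4}\n")
```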
Do you have any idea what may have caused this?
Not really, but I can confirm changing it to an exact match of the path fixed the problem.
Our URL is usually:
/vod/<uuid>/bla.m3u8
At first we were matching with
path,lower,map_sub(/etc/haproxy/uuid.map)
but later changing to
path,lower,field(3,/),map(/etc/haproxy/uuid.map)
fixed the issue. We get around 5000 RPS on this service.
Do you have an idea how to solve the issue?
In our case, doing an exact-match on the map file.
What is your configuration?
Output of haproxy -vv
Last Outputs and Backtraces
Additional Information
Since we were using 3.0 branch in production, I tried compiling 3.1 in order to see if it helped. So I can confirm the issue exists on both 3.0 and 3.1 branches in our use case.
Crash message attached here.
haproxy-map-3.1-abort.txt