8000 feat: 933151 change from capture and double `pmf` to regex by TimDiam0nd · Pull Request #4139 · coreruleset/coreruleset · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat: 933151 change from capture and double pmf to regex #4139

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
May 30, 2025

Conversation

TimDiam0nd
Copy link
Contributor

same as #4138, except a different rule id.

Copy link
Contributor
github-actions bot commented May 20, 2025

📊 Quantitative test results for language: eng, year: 2023, size: 10K, paranoia level: 1:
🚀 Quantitative testing did not detect new false positives

Copy link
Contributor
@theseion theseion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of the regular expression is probably going to be an issue. At least in ModSecurity v2 on httpd the length of the rule is restricted. This is why we had to split the RCE rules into < 4 character words, >=4 character words, plus generic detection.

Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
@TimDiam0nd
Copy link
Contributor Author

@theseion thanks for the suggestion to the regex assembly, much cleaner than what i wrote.

In regards to the regex length issue, would it not be better to instead dynamically generate multiple rules based off the single rule and regex assembly?

e.g. for 933151, split the include file into e.g. 4 parts, and then duplicate the rule (and increment the id).

That way we arent "compromising" the core rule logic for all consumers due to a limitation in one place.

@theseion
Copy link
Contributor

In regards to the regex length issue, would it not be better to instead dynamically generate multiple rules based off the single rule and regex assembly?

Maybe. Generating rules is on our roadmap but we currently don't generate entire rules, only regular expressions.

What about performance? Would three rules instead of one not completely eat up the performance gain of this PR? If not, I suggest you split php-function-names.data into three parts (e.g., a-j, k-t, s-z) and create the three sibling rules from that manually.

@TimDiam0nd
Copy link
Contributor Author

I did some testing and splitting out the rule into 3 rules does indeed eat into the performance gains.
Im not sure if @fzipi has some thoughts on this, and how feasible it is to get the crs toolchain to do my suggestion which is:

  • check if a regex assembly has an include where the included file is over an arbitrary limit
  • if it is over the limit, split the file x times
  • for each part of the include, generate the entire regex
  • next, literally copy paste the rule x times, increment the rule id (a simple replace should work here)
  • finally, duplicate the tests for the specific rule id up to rule id + x
    Then, when it comes to testing, when testing modsecurity2 on httpd run the above. That way, we arent limiting ourselves performance wise (and just general rule tidiness and the annoyance of having duplicate rules) because of a limit in one specific setup.

@fzipi
Copy link
Member
fzipi commented May 26, 2025

Looks like a nice but for now complex addition to crs-toolchain :)

Also, dynamically generating X amount of rules seems dangerous if the IDs change 🤔 But I like the ideas here. Let's figure this out.

@TimDiam0nd
Copy link
Contributor Author

Understood. I will create a separate issue to track this, and in the meantime follow with @theseion's suggestion about manually splitting the regex for now.

@TimDiam0nd
Copy link
Contributor Author

Hey @theseion, just bumping this (and #4138) as i would like to have them merged prior to 4.15 being released.

TimDiam0nd and others added 10 commits May 29, 2025 09:40
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
@TimDiam0nd
Copy link
Contributor Author

@theseion all done, thanks for the suggestion.

Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
@theseion
Copy link
Contributor

@TimDiam0nd there are still a couple of unresolved comments.

Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
@TimDiam0nd
Copy link
Contributor Author

Should be done now i think

Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
@theseion theseion added this pull request to the merge queue May 30, 2025
@theseion
Copy link
Contributor

Thanks @TimDiam0nd.

Merged via the queue into coreruleset:main with commit 5943791 May 30, 2025
6 checks passed
@TimDiam0nd TimDiam0nd deleted the 933151 branch May 30, 2025 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0