8000 feat: update `java-errors.data` by Xhoenix · Pull Request #4113 · coreruleset/coreruleset · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat: update java-errors.data #4113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jun 24, 2025
Merged

Conversation

Xhoenix
Copy link
Member
@Xhoenix Xhoenix commented May 4, 2025

Closes #4073.

@Xhoenix Xhoenix added the release:new-detection In this PR we introduce a new detection label May 4, 2025
Copy link
Contributor
github-actions bot commented May 4, 2025

📊 Quantitative test results for language: eng, year: 2023, size: 10K, paranoia level: 1:
🚀 Quantitative testing did not detect new false positives

@Xhoenix Xhoenix requested a review from a team May 4, 2025 07:18
@Xhoenix
Copy link
Member Author
Xhoenix commented May 23, 2025

Ping!

@Xhoenix Xhoenix requested review from a team and removed request for a team May 29, 2025 12:40
@Xhoenix
Copy link
Member Author
Xhoenix commented May 30, 2025

Ping @fzipi

Copy link
Member
@fzipi fzipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Xhoenix What do we miss by just using "bare" class names instead of prefixing with additional words?

@Xhoenix
Copy link
Member Author
Xhoenix commented May 30, 2025

I don't see much difference, as we're including the "exception name".

@fzipi
Copy link
Member
fzipi commented May 30, 2025

So if you happen to have a site that shows java documentation, you surely need to disable this. By adding extra prefixes, wouldn't that prevent some basic use cases? E.g. previously having the class java.lang. this was more error oriented.

Can you add this concept to the prompt you used for generating this list, to see if we get additional strings to match?

@Xhoenix
Copy link
Member Author
Xhoenix commented May 30, 2025

If you check line 2 and 50, you'll see the errors are similar to the ones I've added. I don't see any reason for FPs, unless a site shows java documentation itself, in which case normal java code will also be suspect to FPs.

@fzipi
Copy link
Member
fzipi commented Jun 3, 2025

Using the same tool you used for generate this, I get this:

class\s+(java|javax|org|com)...Exception → Matches fully qualified class names like class java.lang.NullPointerException.

Exception in thread → Matches common Java error preamble.

at (java|javax|org|com)... → Matches stack trace elements.

I still think adding the class and at as prefixes would gives us better false positve treatment. What do you think @theseion?

@theseion
Copy link
Contributor
theseion commented Jun 3, 2025

I think it's a good idea. Adding prefixes is simple with regex-assembly.

@Xhoenix
Copy link
Member Author
Xhoenix commented Jun 4, 2025

Adding prefixes is simple with regex-assembly.

We're still waiting for the results on performance comparison between pmFromFile and regex.

@theseion
Copy link
Contributor
theseion commented Jun 4, 2025

For this case the difference will likely be negligible since the list is rather short.

@Xhoenix
Copy link
Member Author
Xhoenix commented Jun 4, 2025

Actually, there are a lot of exceptions: https://docs.oracle.com/en/java/javase/21/docs/api/search.html?q=exception and errors: https://docs.oracle.com/en/java/javase/21/docs/api/search.html?q=error, and I've only scratched the surface.

@theseion
Copy link
Contributor
theseion commented Jun 5, 2025

But you don't have to list every exception. Since you're using a regular expression, you can shorten, e.g., java\.lang\.[a-z]+Exception.

@Xhoenix
Copy link
Member Author
Xhoenix commented Jun 14, 2025

@fzipi @theseion , should we wait for performance comparison details between pmFromFile and regex, as discussed in meeting, or proceed with implementing regex over pmFromFile?

@theseion
Copy link
Contributor

I'd proceed. I don't see that analysis being done any time soon.

Xhoenix added 5 commits June 19, 2025 08:14
< 8000 input type="hidden" name="oid" value="9b248cb0285902f0060e11ebebbf2b6011bc3c30" autocomplete="off" data-targets="batch-deferred-content.inputs" />
@fzipi
Copy link
Member
fzipi commented Jun 19, 2025

@Xhoenix Let's switch to regex then?

@Xhoenix
Copy link
Member Author
Xhoenix commented Jun 19, 2025

Waiting for review.

@fzipi
Copy link
Member
fzipi commented Jun 19, 2025

Looks good, tests are failing.

Xhoenix and others added 2 commits June 20, 2025 10:59
Co-authored-by: Max Leske <250711+theseion@users.noreply.github.com>
@fzipi fzipi requested review from theseion and fzipi June 21, 2025 12:54
Copy link
Member
@fzipi fzipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way more succint. Will wait for @theseion approval, but LGTM.

@theseion theseion added this pull request to the merge queue Jun 24, 2025
@theseion
Copy link
Contributor

Thanks @Xhoenix!

Merged via the queue into coreruleset:main with commit cc1a72e Jun 24, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release:new-detection In this PR we introduce a new detection
Projects
None yet
Development

Successfully merging this pull request may close these issues.

update java-errors.data
3 participants
0