-
Notifications
You must be signed in to change notification settings - Fork 12
Performance improvements #225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Also, please let me know if you would prefer me to break this into multiple PRs. |
3fa359b
to
c9fc480
Compare
It looks great! Thanks for doing it 🙇 I noticed some minor formatting differences - could you please run I've raised #226 to have formatting check as part of CI pipeline, it'd be great if you'd check it |
private decodeString(offset: number, size: number) { | ||
return this.db.slice(offset, offset + size).toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Not sure why I did it that way in first place.
Woops, I missed the format script entirely. I'd be happy to run that on each commit. It probably makes sense for your PR to merged first as there are a couple of existing issues in |
Before this change, 1 million random GeoLite City lookups took about 12 seconds. After this change, they take about 8 seconds.
This does not make a noticeable difference with City lookups given that they only have one array that generally only has one element.
slice was deprected in Node 16. Some sites suggest that subarray is also faster, but I didn't test it as none of the official databases use this type.
Although there is probalby some performance gain from this, it is quite minimal. The primary benefit is the reduction in code.
From the spec: > When storing a signed integer, fields shorter than the maximum byte > length are always positive. When the field is the maximum length, e.g., > 4 bytes for 32-bit integers, the left-most bit is the sign. A 1 is > negative and a 0 is positive.
c9fc480
to
1218d66
Compare
I force-pushed with the formatting fixes. |
🎉 This PR is included in version 2.2.0 🎉 The release is available on: Your semantic-release bot 📦🚀 |
@oschwald FYI to trigger release, commit messages need to be formatted according to semantic release rules: https://github.com/semantic-release/semantic-release?tab=readme-ov-file#commit-message-format I had to do https://github.com/runk/mmdb-lib/actions/runs/14743180230 to release your change Not a big deal, but always nice when it's going out automatically |
Thanks. I'll try to remember that when committing to your repos in the future. What do you think about adding a full suite of decoder tests similar to these in the PHP reader? I suspect there may be a couple of other bugs lingering for some of the edge cases, e.g., long strings. There aren't test databases for all these cases as some of them would end up being rather large. |
More tests like those are always great 👍 |
This pull request contains several performance improvements. Most of the gains
come from the first commit.
While working on this, I also noticed a bug in decodeInt when size is greater
than 0 and less than 4. I fixed this and added some test. If you are open to it,
I'd be happy to add all of the tests seen in the official readers for the type
decoding. The original decodeUint also had some issues with sizes between 4 and 8,
for instance.
I also wasn't sure why the reader converts big ints to strings. I matched this in
my changes, but it seems like it would be better to return them directly.