Performance improvements #225

oschwald · 2025-04-29T21:11:02Z

This pull request contains several performance improvements. Most of the gains
come from the first commit.

While working on this, I also noticed a bug in decodeInt when size is greater
than 0 and less than 4. I fixed this and added some tests. If you are open to it,
I'd be happy to add all of the tests seen in the official readers for the type
decoding. The original decodeUint also had some issues with sizes between 4 and 8,
for instance.

I also wasn't sure why the reader converts big ints to strings. I matched this in
my changes, but it seems like it would be better to return them directly.

oschwald · 2025-04-29T21:11:29Z

Also, please let me know if you would prefer me to break this into multiple PRs.

runk · 2025-04-29T23:06:01Z

It looks great! Thanks for doing it 🙇

I noticed some minor formatting differences - could you please run npm run format?

I've raised #226 to add a formatting check to the CI pipeline; it'd be great if you could take a look.

runk · 2025-04-29T23:07:24Z

src/decoder.ts

  private decodeString(offset: number, size: number) {
-    return this.db.slice(offset, offset + size).toString();


Nice. Not sure why I did it that way in the first place.

oschwald · 2025-04-29T23:08:17Z

Whoops, I missed the format script entirely. I'd be happy to run that on each commit. It probably makes sense for your PR to be merged first, as there are a couple of existing issues in master.

From the spec:

> When storing a signed integer, fields shorter than the maximum byte
> length are always positive. When the field is the maximum length, e.g.,
> 4 bytes for 32-bit integers, the left-most bit is the sign. A 1 is
> negative and a 0 is positive.
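To make the quoted rule concrete, here is a minimal sketch of a signed 32-bit decode that follows it. This is not the code from the PR: the free function `decodeInt` and its parameters are hypothetical, and it assumes the database is held in a Node `Buffer`.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper, not the library's actual method.
// Fields shorter than 4 bytes are always positive per the spec, so they are
// read as unsigned; only a full 4-byte field carries a sign bit.
function decodeInt(buf: Buffer, offset: number, size: number): number {
  if (size === 0) {
    return 0;
  }
  if (size < 4) {
    // readUIntBE accepts a byte length of 1 to 6.
    return buf.readUIntBE(offset, size);
  }
  // Two's-complement, big-endian, left-most bit is the sign.
  return buf.readInt32BE(offset);
}

const oneByte = decodeInt(Buffer.from([0xff]), 0, 1);
const threeBytes = decodeInt(Buffer.from([0x80, 0x00, 0x00]), 0, 3);
const fourBytes = decodeInt(Buffer.from([0xff, 0xff, 0xff, 0xff]), 0, 4);
```

Under this sketch a lone `0xff` byte stays positive, while four `0xff` bytes decode as a negative number.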

oschwald · 2025-04-29T23:12:43Z

I force-pushed with the formatting fixes.

github-actions · 2025-04-29T23:18:24Z

🎉 This PR is included in version 2.2.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

runk · 2025-04-29T23:22:20Z

@oschwald FYI, to trigger a release, commit messages need to be formatted according to the semantic-release rules: https://github.com/semantic-release/semantic-release?tab=readme-ov-file#commit-message-format

I had to do https://github.com/runk/mmdb-lib/actions/runs/14743180230 to release your change

Not a big deal, but always nice when it's going out automatically

oschwald · 2025-04-29T23:30:30Z

Thanks. I'll try to remember that when committing to your repos in the future.

What do you think about adding a full suite of decoder tests similar to these in the PHP reader? I suspect there may be a couple of other bugs lingering for some of the edge cases, e.g., long strings. There aren't test databases for all these cases as some of them would end up being rather large.

runk · 2025-04-29T23:48:07Z

More tests like those are always great 👍

oschwald force-pushed the greg/perf-improvements branch from 3fa359b to c9fc480 Compare April 29, 2025 21:29

runk reviewed Apr 29, 2025

oschwald added 5 commits April 29, 2025 16:10

Do not create extra array when decoding string

dbab38f

Before this change, 1 million random GeoLite City lookups took about 12 seconds. After this change, they take about 8 seconds.
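As an illustration of what "not creating an extra array" can look like, the sketch below contrasts the original `slice(...).toString()` pattern with decoding the byte range directly. It assumes the database is a Node `Buffer`; the function names and the sample data are hypothetical, and the second function is one plausible replacement rather than a copy of the merged code.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical stand-in for the decoder's database buffer.
const db = Buffer.from("xxLondonyy", "utf8");

// Original pattern: build an intermediate Buffer view, then decode it.
function decodeStringViaSlice(offset: number, size: number): string {
  return db.slice(offset, offset + size).toString();
}

// Alternative: ask toString to decode the byte range itself, which avoids
// allocating the intermediate Buffer object on every call.
function decodeStringDirect(offset: number, size: number): string {
  return db.toString("utf8", offset, offset + size);
}

const viaSlice = decodeStringViaSlice(2, 6);
const direct = decodeStringDirect(2, 6);
```

Both functions return the same string; the difference is only in the per-call allocation, which matters when the function runs several times per lookup.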

Pre-allocate array rather than growing it

640bc73

This does not make a noticeable difference with City lookups given that they only have one array that generally only has one element.
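A rough sketch of the two approaches, for comparison. The names `decodeArrayPush`, `decodeArrayPrealloc`, and the `decodeItem` callback are hypothetical and stand in for whatever the decoder does per element.

```typescript
// Grow-as-you-go: the array is resized as elements are pushed.
function decodeArrayPush(
  size: number,
  decodeItem: (index: number) => unknown,
): unknown[] {
  const out: unknown[] = [];
  for (let i = 0; i < size; i++) {
    out.push(decodeItem(i));
  }
  return out;
}

// Pre-allocated: the element count is known up front from the control byte,
// so the array is created at its final length and filled by index.
function decodeArrayPrealloc(
  size: number,
  decodeItem: (index: number) => unknown,
): unknown[] {
  const out = new Array<unknown>(size);
  for (let i = 0; i < size; i++) {
    out[i] = decodeItem(i);
  }
  return out;
}

const pushed = decodeArrayPush(3, (i) => i * 2);
const prealloc = decodeArrayPrealloc(3, (i) => i * 2);
```

With one-element arrays, as in the City databases, the two are effectively equivalent, which matches the observation above.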

Replace deprecated slice with subarray

5386bf8

slice was deprecated in Node 16. Some sites suggest that subarray is also faster, but I didn't test it as none of the official databases use this type.
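For reference, a minimal sketch of reading a bytes field with `subarray`. The helper name `decodeBytes` and the sample buffer are hypothetical. On a Node `Buffer`, `subarray` returns a view over the same memory rather than a copy.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: return the raw bytes of a field as a view into the
// database buffer, without copying.
function decodeBytes(db: Buffer, offset: number, size: number): Buffer {
  return db.subarray(offset, offset + size);
}

const sample = Buffer.from([1, 2, 3, 4, 5]);
const view = decodeBytes(sample, 1, 3);

// The view shares memory with the source, so a write to the source is
// visible through the view.
sample[1] = 9;
const firstByteAfterWrite = view[0];
```

Because the result aliases the database buffer, a caller that needs an independent copy would have to make one explicitly, for example with `Buffer.from(view)`.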

Use readUInt*BE functions

1e8385f

Although there is probably some performance gain from this, it is quite minimal. The primary benefit is the reduction in code.
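To show the kind of code reduction meant here, the sketch below compares a hand-written big-endian accumulation loop with Node's built-in `readUIntBE`. Both function names are hypothetical, and the sketch only covers sizes up to 6 bytes, which is the limit of `readUIntBE`; larger fields would need separate handling.

```typescript
import { Buffer } from "node:buffer";

// Hand-rolled big-endian decode. Multiplication is used instead of bit
// shifts so that values above 2^31 do not wrap to negative numbers.
function decodeUintManual(buf: Buffer, offset: number, size: number): number {
  let value = 0;
  for (let i = 0; i < size; i++) {
    value = value * 256 + buf[offset + i];
  }
  return value;
}

// Same result using the built-in reader, for sizes it supports.
function decodeUintBuiltin(buf: Buffer, offset: number, size: number): number {
  if (size === 0) {
    return 0;
  }
  if (size > 6) {
    throw new RangeError("readUIntBE supports at most 6 bytes");
  }
  return buf.readUIntBE(offset, size);
}

const bytes = Buffer.from([0x01, 0x02, 0x03, 0xff, 0xff, 0xff]);
const manual3 = decodeUintManual(bytes, 0, 3);
const builtin3 = decodeUintBuiltin(bytes, 0, 3);
```

For fixed widths, the dedicated `readUInt8`, `readUInt16BE`, and `readUInt32BE` methods are alternatives to the variable-length `readUIntBE`.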

oschwald force-pushed the greg/perf-improvements branch from c9fc480 to 1218d66 Compare April 29, 2025 23:12

runk approved these changes Apr 29, 2025

runk merged commit d1858ee into runk:master Apr 29, 2025
4 checks passed

oschwald deleted the greg/perf-improvements branch April 29, 2025 23:15

runk added a commit that referenced this pull request Apr 29, 2025

feat: release performance improvements from #225

0a287f2

github-actions bot added the released label Apr 29, 2025
