-
Notifications
You must be signed in to change notification settings - Fork 372
Speed-up and cleanup objToJSON #615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What sort of speedup are we looking at? Please can you share some benchmark scripts to demonstrate it? |
@hugovk I ran tests/benkmark.py with and without the change. The results are noisy, so I ran both sides 5 times and took max for each result. The highlights are:
So we got about 20% speedup in these 3 cases. The other cases are unchanged, modulo noise. I think this is pretty good, given that the new code is simpler and shorter. |
* Use PyDict_Next() to iterate over dicts. * Use macros to access lists, tuples, bytes. * Avoid calling PyErr_Occurred() if not necessary. * Fix a memory leak when encoding very large ints. * Delete dead and duplicate code. Also, * Raise TypeError if toDict() returns a non-dict instead of silently converting it to null.
Do not create a list of tuples with (converted key, value) upfront. Instead, convert keys and fetch values during iteration. Also, if sorting fails, preserve the original exception instead of overwriting it with a less informative ValueError. This is the same behavior as the standard library's json module.
fba64dc
to
5b9fd1e
Compare
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #615 +/- ##
==========================================
+ Coverage 91.68% 92.34% +0.65%
==========================================
Files 6 6
Lines 1949 1908 -41
==========================================
- Hits 1787 1762 -25
+ Misses 162 146 -16 ☔ View full report in Codecov by Sentry. |
@hugovk I pushed another small commit that optimizes encoding with sorted keys. Now sorted output is also about 20% faster and uses less memory:
|
@hugovk I'm going to go hit the release button unless you object? |
Go for it! Do we want a major release because of #614? I don't mind too much either way. |
Actually, I don't think I do now. It seems too out of typical use to call a breaking change. |
@bwoodsend Thanks! I assumed you were going to make a new major version because of #614, so I included
in this PR, which is also technically a breaking change. We should probably at least highlight the change in release notes. |
Indeed, that change unexpectedly hit us and broke our CI pipeline. The ultimate error is ours, but it did take a bit of head-scratching to figure out what happened, as this is not currently called out in the release notes that I've been able to find. |
Ugh, sorry guys. I've added a breaking section to the release notes although that doesn't fix the fact that I broke semantic versioning. We could yank 5.9.0 from PyPI and rerelease as 6.0.0 although I'd definitely consider that to be too far. |
@jamadden for my information, were you using this as a feature, or was this just a bug that went unnoticed? I was considering to allow toDict() to return None. I only didn't do this because there were no tests or documentation for this behavior. |
@eltoder In our case, in the one scenario we've identified so far, it was basically accidental. A |
Also,