8000 Move NameRef and SymbolRef kind bits to end for better varint packing. by jvilk-stripe · Pull Request #4012 · sorbet/sorbet · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Move NameRef and SymbolRef kind bits to end for better varint packing. #4012

8000
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 19, 2021

Conversation

jvilk-stripe
Copy link
Collaborator
@jvilk-stripe jvilk-stripe commented Feb 18, 2021

Move NameRef and SymbolRef kind bits to end for better varint packing.

Motivation

We serialize u4s using variable-length integers (varints) to reduce the size of the payload baked into Sorbet and stored on disk in the cache. Varints are really nice for small numbers (numbers up to 2^7-1 fit in 1 byte!) but have overhead for large numbers (28+ bits require 5 bytes since it stores 7 bits of data per byte).

We store SymbolRef/NameRefs as u4s in the cache using their rawId, which encodes their kind and their index in that kind's table. Storing the kind bits in the upper bits of rawId ensures that ~all of these references are stored inefficiently in the cache (5 bytes instead of 1 for small IDs!). I hadn't thought of this downside when initially engineering them this way.

A secondary consequence of this inefficiency is that it's slower to deserialize and serialize an integer from 5 bytes rather than 1 or 2. I am very interested in optimizing deserialization for my compressed AST work, which is the primary motivation of this work.

Fortunately, the solution is simple: Move the kind bits to the end of the rawId. This solution shrinks payload by ~4% (0.1MB) and Stripe's cache directory (which contains ASTs) by ~2%. I do not see any noticeable changes in end-to-end runtime.

Test plan

See included automated tests.

Ensures that varint compression in cache is more effective since the top bits will now be 0 for small IDs.
Ensures better varint packing.
@jvilk-stripe
Copy link
Collaborator Author

We have a policy of testing changes to Sorbet against Stripe's codebase before
merging them. I've kicked off a test run for the current PR. When the build
finishes, I'll share with you whether or how it failed. Thanks!

Stripe employees can see the build results here:

https://go/builds/bui_IyHuV66Tgiv87k
https://go/builds/bui_IyHuTbGRmte7Gd

@jvilk-stripe jvilk-stripe changed the title WIP: Move NameRef and SymbolRef kind bits to end for better varint packing. Move NameRef and SymbolRef kind bits to end for better varint packing. Feb 19, 2021
@jvilk-stripe jvilk-stripe changed the title Move NameRef and SymbolRef kind bits to end for better varint packing. WIP: Move NameRef and SymbolRef kind bits to end for better varint packing. Feb 19, 2021
The order of the symbols changed probably because sort order changed.
@jvilk-stripe jvilk-stripe changed the title WIP: Move NameRef and SymbolRef kind bits to end for better varint packing. Move NameRef and SymbolRef kind bits to end for better varint packing. Feb 19, 2021
@jvilk-stripe jvilk-stripe marked this pull request as ready for review February 19, 2021 17:44
@jvilk-stripe jvilk-stripe requested a review from a team as a code owner February 19, 2021 17:44
@jvilk-stripe jvilk-stripe requested review from froydnj and removed request for a team February 19, 2021 17:44
Copy link
Contributor
@froydnj froydnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I would really like it if the test expectation changes were some combination of investigated/understood/fixed before this landed.

@jvilk-stripe jvilk-stripe merged commit b47f0c5 into master Feb 19, 2021
@jvilk-stripe jvilk-stripe deleted the jvilk/nameref-symbolref-bits-to-end branch February 19, 2021 20:00
KaanOzkan pushed a commit to Shopify/sorbet that referenced this pull request Nov 18, 2021
* Add some meager docs to sorbet#4012

Just hit this; filling out the error to help answer questions in the future.

* Update error-reference.md

* Fix prettier

* Improve example

Co-authored-by: Jake Zimmerman <zimmerman.jake@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0