u3: adds new, page-oriented memory allocator #812
Draft: joemfb wants to merge 145 commits into next/kelvin/409 from jb/palloc
pkova added a commit that referenced this pull request on May 23, 2025:
This PR is based on and targets #812. It also includes a copy of #825, as that fix is necessary for the relocated home road to not interfere with inner-road promotion. It implements "option 2" from #451 (comment). Moving the home road makes sense given #812, which requires that the entire first page be reserved from the heap. Once the home road is moved, there's no reason for the snapshot to consist of two separate memory-image segments, as only the bottom of the loom needs to be preserved. This simplifies the snapshot system by removing much of the complicated, direction-switching redundancy in its implementation, and simplifies the handling of snapshots themselves, as they are now a single file. Since these changes require a snapshot-system migration, I've taken the opportunity to add and verify a top-level checksum for all snapshot patch metadata.
pkova added a commit that referenced this pull request on May 23, 2025:
These changes were developed on top of #812, as part of a(nother) failed effort to develop leak-free unification of senior memory. But they stand alone, and deserve separate review. This PR fixes a bug wherein deeper addresses were not preferred when unifying two nouns on the same road. It also optimizes the unification implementation by using separate functions for normal and senior unification and removing branches. Finally, it enables home-road unification when the current road is the home road and there are no child roads, and adds basic unit tests for these scenarios.
This PR replaces the u3 allocator with a new, page-aware design (cribbed directly from phkmalloc). Opened as a draft for now pending a migration from the old allocator/snapshot.
The primary design consideration of phkmalloc is to minimize the number of pages that it touches. This aligns perfectly with the constraints of a persistent allocation arena, minimizing write-amplification when updating a snapshot. phkmalloc further ensures natural alignment at every allocation size up to the page size (all powers of two), and page alignment after that. These are stronger alignment guarantees than we require in any scenario, so all alignment is now implicit (other than the page-alignment of a new road), and all heap objects are aligned to at least 16 bytes (allowing an additional bit of pointer compression). All allocation metadata and freelists are stored out-of-band, allocated "as if" with the allocator itself. This shrinks our cells from 24 to 16 bytes and ensures that free pages are actually free, allowing us to store more data and further reducing churn in snapshot updates. Rounding small allocations up to power-of-two sizes reduces the overhead of searching free lists, allowing us to remove a performance hack from the old allocator that had significantly increased fragmentation. In practice, persistent state and snapshots have been observed to be roughly 20% smaller with this allocator.
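To make the size-class and alignment reasoning above concrete, here is a minimal sketch in C. It is not code from this PR: the page size, the constant names, and every function name are hypothetical, and the compression scheme only illustrates why a 16-byte alignment floor frees low-order bits; it is not the actual u3 reference encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES  16384u  /* hypothetical page size, for illustration only */
#define MIN_BYTES   16u     /* smallest chunk: the 16-byte alignment floor */
#define GRANULE_LOG 4u      /* log2(MIN_BYTES) */

/* Round a small request (at most one page) up to its power-of-two class. */
static size_t size_class(size_t len)
{
  assert(len > 0 && len <= PAGE_BYTES);
  size_t siz = MIN_BYTES;
  while (siz < len) {
    siz <<= 1;
  }
  return siz;
}

/* Requests larger than a page are served in whole, page-aligned pages. */
static size_t page_count(size_t len)
{
  return (len + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* A page dedicated to one size class is carved into equal chunks. Since the
** page is page-aligned and siz is a power of two, every chunk offset is a
** multiple of siz: alignment is implicit, with no padding or headers. */
static size_t chunk_offset(size_t siz, size_t idx)
{
  assert(idx < PAGE_BYTES / siz);
  return idx * siz;
}

/* With every object on a 16-byte boundary, the low four bits of a byte
** offset are always zero and can be dropped from a stored reference. */
static uint32_t compress(uint64_t off)
{
  assert((off & (MIN_BYTES - 1)) == 0);
  assert((off >> GRANULE_LOG) <= UINT32_MAX);
  return (uint32_t)(off >> GRANULE_LOG);
}

static uint64_t expand(uint32_t ref)
{
  return (uint64_t)ref << GRANULE_LOG;
}
```

Under these assumptions a 24-byte request lands in the 32-byte class, and the chunk at index 3 of a 32-byte page sits at offset 96, which is itself a multiple of 32. Because each page holds a single size class, the free state of its chunks can be tracked in a small side table rather than in the chunks themselves, which is what lets a fully free page stay untouched.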
The absence of extra space for inline metadata in our allocations (no more "boxes") dictates redesigns of our garbage collection mechanisms. Reference counts are now in "userspace": an ad-hoc part of the application data inside an allocation (conventionally, the first word) instead of allocator-level metadata. So our mark and sweep collector (actually two collectors, switched by `u3o_debug_ram`) requires that extra space be allocated on the side to track mark bits or recalculated refcounts. This has the advantageous effect that `|mass` no longer dirties clean, persisted pages on the home road, making it much faster and cheaper (and therefore practical to run automatically). Similarly, our mark/compact collector (`|pack`) requires out-of-band relocation state. To minimize overhead, those relocations are not stored, but calculated from compact bitmaps in a design cribbed from the Clozure Common Lisp compiler by way of Factor.
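The idea of calculating relocations from a mark bitmap, rather than storing a forwarding address per object, can be sketched as follows. This is a generic illustration of the technique and not the PR's implementation: the `mark_map` type, the function names, the one-bit-per-16-byte-granule layout, and the tiny fixed bitmap size are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS 4u  /* bitmap words of 64 granules each (hypothetical size) */

typedef struct {
  uint64_t bits[WORDS];  /* bit i set: granule i belongs to a live object */
  uint32_t pre[WORDS];   /* live granules in all earlier bitmap words */
} mark_map;

/* Portable population count (Kernighan's method). */
static unsigned popcount64(uint64_t x)
{
  unsigned n = 0;
  while (x) {
    x &= x - 1;
    n++;
  }
  return n;
}

/* Mark phase: record a live object of len granules starting at gran. */
static void mark(mark_map *m, size_t gran, size_t len)
{
  for (size_t i = gran; i < gran + len; i++) {
    assert(i < WORDS * 64);
    m->bits[i / 64] |= (uint64_t)1 << (i % 64);
  }
}

/* After marking: one running total per bitmap word. */
static void finish(mark_map *m)
{
  uint32_t sum = 0;
  for (size_t w = 0; w < WORDS; w++) {
    m->pre[w] = sum;
    sum += popcount64(m->bits[w]);
  }
}

/* Post-compaction granule index of the object starting at gran: the number
** of live granules below it. Nothing is stored per object. */
static size_t relocate(const mark_map *m, size_t gran)
{
  size_t w = gran / 64;
  size_t b = gran % 64;
  assert(w < WORDS);
  uint64_t below = m->bits[w] & (((uint64_t)1 << b) - 1);
  return m->pre[w] + popcount64(below);
}
```

In this sketch the only side state is one bit per granule plus one counter per 64 granules, and the collector never writes into the objects being measured. That is the property the paragraph above relies on when it says clean, persisted pages are no longer dirtied.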