u3: adds new, page-oriented memory allocator by joemfb · Pull Request #812 · urbit/vere · GitHub

u3: adds new, page-oriented memory allocator #812

Draft
wants to merge 145 commits into
base: next/kelvin/409

joemfb (Collaborator) commented May 6, 2025

This PR replaces the u3 allocator with a new, page-aware design (cribbed directly from phkmalloc). Opened as a draft for now pending a migration from the old allocator/snapshot.

The primary design consideration of phkmalloc is to minimize the number of pages that it touches. This aligns perfectly with the constraints of a persistent allocation arena, minimizing write-amplification when updating a snapshot. phkmalloc further ensures natural alignment at every allocation size up to the page size (all powers of two), and page alignment after that. These are stronger alignment guarantees than we require in any scenario, so all alignment is now implicit (other than the page-alignment of a new road), and all heap objects are aligned to at least 16 bytes (allowing an additional bit of pointer compression). All allocation metadata and freelists are stored out-of-band, allocated "as if" with the allocator itself. This shrinks our cells from 24 to 16 bytes and ensures that free pages are actually free, allowing us to store more data and further reducing churn in snapshot updates. Rounding small allocations up to power-of-two sizes reduces the overhead of searching free lists, allowing us to remove a performance hack from the old allocator that had significantly increased fragmentation. In practice, persistent state and snapshots have been observed to be roughly 20% smaller with this allocator.
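As a rough illustration of the size-class and alignment rules described above, here is a minimal sketch. It is not code from this PR: every `sketch_` name is hypothetical, the 16 KiB page size is an assumption, and the reference encoding (a byte offset shifted right by four bits) is only one way that 16-byte alignment could free up bits in a compressed pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MIN_SHIFT  4u   /* 16-byte minimum size and alignment (from the description) */
#define SKETCH_PAGE_SHIFT 14u  /* ASSUMED page size of 16 KiB; not stated in the description */

/* Round a request up to its size class: a power of two (at least 16 bytes)
   for anything up to one page, and a whole number of pages beyond that. */
static size_t sketch_size_class(size_t len)
{
  size_t page = (size_t)1 << SKETCH_PAGE_SHIFT;

  if (len > page) {
    return (len + page - 1) & ~(page - 1);
  }

  size_t siz = (size_t)1 << SKETCH_MIN_SHIFT;
  while (siz < len) {
    siz <<= 1;
  }
  return siz;
}

/* Hypothetical compressed reference: since every object is 16-byte aligned,
   the low four bits of its byte offset are always zero and can be dropped. */
static uint32_t sketch_compress(uint64_t off)
{
  assert((off & 15u) == 0);
  assert((off >> SKETCH_MIN_SHIFT) <= UINT32_MAX);
  return (uint32_t)(off >> SKETCH_MIN_SHIFT);
}

/* Inverse of sketch_compress(): recover the byte offset. */
static uint64_t sketch_expand(uint32_t ref)
{
  return (uint64_t)ref << SKETCH_MIN_SHIFT;
}
```

Under these assumptions a 24-byte request lands in the 32-byte class, while a 16-byte cell fits the smallest class exactly, which is consistent with the cell-size reduction described above.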

The absence of extra space for inline metadata in our allocations (no more "boxes") dictates redesigns of our garbage collection mechanisms. Reference counts are now in "userspace": an ad-hoc part of the application data inside an allocation (conventionally, the first word) instead of allocator-level metadata. So our mark and sweep collector (actually two collectors, switched by u3o_debug_ram) requires that extra space be allocated on the side to track mark bits or recalculated refcounts. This has the advantageous effect that |mass no longer dirties clean, persisted pages on the home road, making it much faster and cheaper (and therefore practical to run automatically). Similarly, our mark/compact collector (|pack) requires out-of-band relocation state. To minimize overhead, those relocations are not stored, but calculated from compact bitmaps in a design cribbed from the Clozure Common Lisp compiler by way of Factor.
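The idea of keeping mark bits out-of-band and deriving relocations from them, rather than storing a forwarding pointer per object, can be sketched as follows. This is a generic mark-bitmap compaction sketch, not the PR's implementation: the `sketch_` names, the one-bit-per-16-byte-granule layout, and the per-word prefix table are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in a 64-bit word (portable, no compiler builtins). */
static unsigned sketch_popcount(uint64_t x)
{
  unsigned n = 0;
  while (x) {
    x &= x - 1;  /* clear the lowest set bit */
    n++;
  }
  return n;
}

/* Mark granules [idx, idx + len) as live in a side bitmap.
   Nothing is written into the objects themselves. */
static void sketch_mark(uint64_t *bits, size_t idx, size_t len)
{
  for (size_t i = idx; i < idx + len; i++) {
    bits[i >> 6] |= (uint64_t)1 << (i & 63);
  }
}

static int sketch_is_marked(const uint64_t *bits, size_t idx)
{
  return (int)((bits[idx >> 6] >> (idx & 63)) & 1u);
}

/* pre[w] = number of live granules in bitmap words [0, w). */
static void sketch_prefix(const uint64_t *bits, size_t words, uint32_t *pre)
{
  uint32_t sum = 0;
  for (size_t w = 0; w < words; w++) {
    pre[w] = sum;
    sum += sketch_popcount(bits[w]);
  }
}

/* Destination granule of a live object after sliding compaction:
   the number of live granules that precede it. */
static size_t sketch_forward(const uint64_t *bits, const uint32_t *pre, size_t idx)
{
  assert(sketch_is_marked(bits, idx));
  uint64_t below = bits[idx >> 6] & (((uint64_t)1 << (idx & 63)) - 1);
  return pre[idx >> 6] + sketch_popcount(below);
}
```

With objects at granules 2-3, 10, and 70-72 marked, `sketch_forward` yields destinations 0, 2, and 3 respectively. The only side state is the bitmap plus one counter per 64 granules, and no heap page is written until objects actually move.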

joemfb added 30 commits January 16, 2025 09:28
pkova added a commit that referenced this pull request May 23, 2025
These issues came up while testing #812 under Asan and Ubsan. The +bex
jet had already been partially fixed in #764. There are many more of
these undefined left shifts in the codebase. Those should be
systematically rectified, maybe as part of #794.
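
For context on the class of bug mentioned above: in C11, `1 << 31` on a 32-bit `int` is undefined because the result is not representable, and any shift by the operand's width or more is undefined regardless of signedness. A minimal sketch of a defensive helper follows. The name `sketch_shl32` is hypothetical, and returning zero for an out-of-range shift is an assumption for illustration, not the behavior chosen in the referenced fix.

```c
#include <stdint.h>

/* Left-shift a 32-bit value without undefined behavior: the operand is
   unsigned, and shift counts of 32 or more are handled explicitly. */
static uint32_t sketch_shl32(uint32_t val, unsigned n)
{
  return (n < 32u) ? (val << n) : 0u;
}
```
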
This PR is based on and targets #812. It also includes a copy of #825,
as that fix is necessary for the relocated home road to not interfere
with inner-road promotion. It implements "option 2" from
#451 (comment).

Moving the home road makes sense given #812, as it requires that the
entire first page be reserved from the heap. Once the home road is
moved, there's no reason for the snapshot to consist of two separate
memory-image segments, as only the bottom of the loom needs to be
preserved. This simplifies the snapshot system by removing much of the
complicated, direction-switching redundancy in its implementation, and
simplifies the handling of snapshots themselves as they are now a single
file.

Since these changes require a snapshot-system migration, I've taken the
opportunity to add and verify a top-level checksum for all snapshot
patch metadata.
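
A top-level checksum over metadata can be sketched as below. This is not the PR's format: the `sketch_head` layout and field names are hypothetical, and FNV-1a is used purely as a simple stand-in for whatever hash the snapshot system actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata header; the checksum covers every field after it. */
typedef struct {
  uint32_t sum;  /* checksum of the fields below */
  uint32_t ver;  /* format version */
  uint32_t pgs;  /* number of pages described */
} sketch_head;

/* 32-bit FNV-1a over a byte buffer (stand-in hash, see lead-in). */
static uint32_t sketch_fnv1a(const void *buf, size_t len)
{
  const uint8_t *byt = (const uint8_t *)buf;
  uint32_t has = 2166136261u;
  for (size_t i = 0; i < len; i++) {
    has ^= byt[i];
    has *= 16777619u;
  }
  return has;
}

/* Hash everything in the header that follows the checksum field. */
static uint32_t sketch_head_sum(const sketch_head *hed)
{
  size_t off = offsetof(sketch_head, ver);
  return sketch_fnv1a((const uint8_t *)hed + off, sizeof(*hed) - off);
}

/* Record the checksum before the header is written out. */
static void sketch_seal(sketch_head *hed)
{
  hed->sum = sketch_head_sum(hed);
}

/* Returns nonzero if the header matches its recorded checksum. */
static int sketch_check(const sketch_head *hed)
{
  return hed->sum == sketch_head_sum(hed);
}
```

A loader following this pattern would call `sketch_check` before trusting any other field and refuse to apply a patch whose header fails verification.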
pkova added a commit that referenced this pull request May 23, 2025
These changes were developed on top of #812, as part of a(nother) failed
effort to develop leak-free unification of senior memory. But they stand
alone, and deserve separate review.

This PR fixes a bug wherein deeper addresses were not preferred when
unifying two nouns on the same road. It also optimizes the unification
implementation by using separate functions for normal and senior
unification and removing branches. Finally, it enables home-road
unification when the current road is the home road and there are no
child roads, and adds basic unit tests for these scenarios.
@joemfb joemfb changed the base branch from develop to next/kelvin/409 May 23, 2025 16:25