kernel/process_standard: ensure kernel_memory_break align & cleanup #3556

lschuermann · 2023-07-21T15:44:33Z

Pull Request Overview

This commit addresses issue #1739 by ensuring that the kernel memory break is always well-aligned to a word-boundary, regardless of the MPU's internal alignment constraints.

It further cleans up the memory allocation code. Tracing the previous slices and pointers through the previous code was rather difficult. In an effort to simply this logic, the returned memory is now continuously sub-sliced until all individual memory sub-sections have been isolated into their own slices. This avoids indexing into larger slices, encompassing multiple logically-distinct memory allocations.

It further renames the allocation returned by allocate_app_memory_region to no longer have "app" in its name, as this can be mistaken for app-accessible memory. Instead, it simply calls it allocated_memory, which is then sub-sliced into app_accessible_memory and allocated_kernel_memory slices.

Visually, the memory is now sliced as follows:

+-----------------------------------------------------------------
| remaining_memory
+----------------------------------------------------+------------
v                                                    v
+----------------------------------------------------+
| allocated_padded_memory                            |
+--+--
8000
-----------------------------------------------+
   v                                                 v
   +-------------------------------------------------+
   | allocated_memory                                |
   +-------------------------------------------------+
   v                                                 v
   +-----------------------+-------------------------+
   | app_accessible_memory | allocated_kernel_memory |
   +-----------------------+-------------------+-----+
                                               v
                             kernel memory break
                                                \---+/
                                                    v
                                      optional padding

Testing Strategy

Needs testing. How to best test this? Deliberately have the dummy MPU (()) return a non word-aligned kernel break?

TODO or Help Wanted

N/A

Documentation Updated

~~Updated the relevant files in /docs,~~ or no updates are required.

Formatting

Ran make prepush.

This commit addresses issue tock#1739 by ensuring that the kernel memory break is always well-aligned to a word-boundary, regardless of the MPU's internal alignment constraints. It further cleans up the memory allocation code. Tracing the previous slices and pointers through the previous code was rather difficult. In an effort to simply this logic, the returned memory is now continuously sub-sliced until all individual memory sub-sections have been isolated into their own slices. This avoids indexing into larger slices, encompassing multiple logically-distinct memory allocations. It further renames the allocation returned by `allocate_app_memory_region` to no longer have "app" in its name, as this can be mistaken for app-accessible memory. Instead, it simply calls it `allocated_memory`, which is then sub-sliced into `app_accessible_memory` and `allocated_kernel_memory` slices. Visually, the memory is now sliced as follows: +----------------------------------------------------------------- | remaining_memory +----------------------------------------------------+------------ v v +----------------------------------------------------+ | allocated_padded_memory | +--+-------------------------------------------------+ v v +-------------------------------------------------+ | allocated_memory | +-------------------------------------------------+ v v +-----------------------+-------------------------+ | app_accessible_memory | allocated_kernel_memory | +-----------------------+-------------------+-----+ v kernel memory break \---+/ v optional padding

bradjc

It seems like you should copy your diagram in to the code.

I assume you can run the process command on the pconsole with a null app before and after to ensure all this doesn't change anything?

bradjc · 2023-07-21T15:54:12Z

kernel/src/process_standard.rs

+        let initial_kernel_memory_size = grant_ptrs_offset
+            + Self::CALLBACKS_OFFSET
+            + Self::PROCESS_STRUCT_OFFSET
+            + core::mem::size_of::<usize>();


Why not just calculate the exact needed padding?

We don't know this ahead of time, it's up to the MPU where the end of the allocation is placed. We have three options:

Add an additional "allocation end" alignment constraint to allocate_app_memory_region, where we can then specify that we want the end of the region word-size aligned. This seems sub-optimal, as this is not even a kernel-invariant, but rather one of ProcessStandard.

Add core::mem::size_of::<usize>() - 1 to the initial kernel memory size, which is the maximum amount of padding we'll ever need. However, with our current MPU implementations, this will mean that we'll always end on an unaligned address, and always waste those 3 bytes.

Add core::mem::size_of::<usize>(). This is one byte more than required, but it would mean that an MPU implementation which naturally aligns regions to a word boundary places this padding also on a word boundary. We would have a slightly larger initial_kernel_memory_size than required, but wouldn't waste any of this memory, for instance when we do grant allocations.

We have precedent for the first option with grant: we request both the size and the alignment when creating a grant. I think it's reasonable to say that alignment is a fundamental operation for allocation.

But I'm not sure I feel strongly about changing the interface for this purpose.

I'm all for adding alignment to the interface, but what feels weird is to add an alignment requirement towards the end of a region. Usually the MPU performs alignment inserting padding in the front. In this case, we'd need the MPU to place the kernel region towards the end of the allocated section, then move it further towards the front to achieve alignment, while not colliding with the app's memory, and then passing back the allocated start address (which isn't even subject to that alignment) to the caller.

Apparently, even our current interface has sufficient complexity to implement it incorrectly (#3544), and adding all of the above constraints adds so many more edge-cases.

I don't like either, but for now I'd be leaning towards allocating another word of memory, which can be used by grants at runtime anyways.

Ok yes I agree. Given that we are allocating a larger region, it makes sense that the process standard impl just ensures it can both give the process enough accessible memory and make its grant bookkeeping aligned.

ppannuto

I think this is better than what we have currently, but doing a more thorough read through of the whole process creation I think TheRightWay(TM) would move more intelligence and work into the userspace_kernel_boundary.

Ultimately, alignment requirements derive from architecture specifications. To that end, all the little final padding, shifting, casting bits really should be in arch-specific implementations. As one example, initial_app_brk ultimately gets passed to userspace_kernel_boundary.app_brk where it's used as the initial process stack pointer. On ARM, this requires 8-byte alignment. Nothing enforces this*. On the flip side, technically ARM Cortex-M3—M7 actually allow for unaligned access (with degraded runtime performance as the HW silently does the alignment), but if some more practical architecture had looser requirements, we would bake padding we don't need into processes.

In the short term, I think we should merge this, including the current "extra word of padding for alignment if-needed" solution. In the longer run, I'd like to play with a slightly more compartmentalized and federated architecture for process creation—which, at minimum, might break the current 500-line monolith that is create() into more manageable, conceptual pieces.

I'm also skeptical that a "default implementation" for the MPU should exist, especially if the MPU is to be responsible for things like defining region sizes. Indeed, the real root cause IMO to the error reported on the original issue here is that the MPU region allocation did not follow rules for the currently executing architecture (which a default impl can't possibly do!). Maybe I'll see if it makes sense for that to go away as well.

*though it does happen to work out in our common case paths right now. It derives from min_process_memory_size, which comes UKB.initial_process_app_brk_size which is SVC_FRAME_SIZE which is 32 bytes. This is then added app_mpu_mem_start, which ultimately chains through a few MPU things to something aligned. Frankly, it's unlikely this wouldn't work out on real ARM implementations, but that isn't a case for hidden, cross-layer dependencies.

ppannuto · 2023-07-26T01:23:42Z

Also, possibly of interest, removing the clippy allows doesn't generate clippy warnings anymore (on master, or this branch).

bradjc · 2023-07-26T16:06:21Z

would move more intelligence and work into the userspace_kernel_boundary.

I'm skeptical, but, building on the idea that we need UKB.initial_process_app_brk_size might make this better.

ppannuto · 2023-07-31T20:51:41Z

Yeah, I'm not certain that UKB is the right way to do it either :/. I think we have a bunch of places where we assume alignment that in practice work out on all 'normal' systems, but host-based unit testing and folks exploring 64-bit arch's, etc are starting to expose / violate these assumptions.

I opened #3583 to flag/track the more substantial alignment issue.

I think we should merge this as-is as it fixes a real problem in the now and has some nice buffer management cleanups. I don't think this will be the final word here, but think the 'most right' fix is a much bigger undertaking.

bradjc

on master:

tock$ process bare_bones
𝐀𝐩𝐩: bare_bones   -   [Yielded]
 Events Queued: 0   Syscall Count: 1   Dropped Upcall Count: 0
 Restart Count: 0
 Last Syscall: Yield { which: 1, address: 0x20007000 }
 Completion Code: None
 Credential: None


 ╔═══════════╤══════════════════════════════════════════╗
 ║  Address  │ Region Name    Used | Allocated (bytes)  ║
 ╚0x20007800═╪══════════════════════════════════════════╝
             │ Grant Ptrs      120
             │ Upcalls         320
             │ Process         920
  0x200072B0 ┼───────────────────────────────────────────
             │ ▼ Grant           0
  0x200072B0 ┼───────────────────────────────────────────
             │ Unused
  0x20007020 ┼───────────────────────────────────────────
             │ ▲ Heap            ? |      ?               S
  ?????????? ┼─────────────────────────────────────────── R
             │ Data              ? |      ?               A
  ?????????? ┼─────────────────────────────────────────── M
             │ ▼ Stack           ? |      ?
  0x20007000 ┼───────────────────────────────────────────
             │ Unused
  0x20007000 ┴───────────────────────────────────────────
             .....
  0x00041000 ┬─────────────────────────────────────────── F
             │ App Flash      4016                        L
  0x00040050 ┼─────────────────────────────────────────── A
             │ Protected        80                        S
  0x00040000 ┴─────────────────────────────────────────── H

with this pr:

tock$ process bare_bones
𝐀𝐩𝐩: bare_bones   -   [Yielded]
 Events Queued: 0   Syscall Count: 1   Dropped Upcall Count: 0
 Restart Count: 0
 Last Syscall: Yield { which: 1, address: 0x20007000 }
 Completion Code: None
 Credential: None


 ╔═══════════╤══════════════════════════════════════════╗
 ║  Address  │ Region Name    Used | Allocated (bytes)  ║
 ╚0x20007800═╪══════════════════════════════════════════╝
             │ Grant Ptrs      120
             │ Upcalls         320
             │ Process         920
  0x200072B0 ┼───────────────────────────────────────────
             │ ▼ Grant           0
  0x200072B0 ┼───────────────────────────────────────────
             │ Unused
  0x20007020 ┼───────────────────────────────────────────
             │ ▲ Heap            ? |      ?               S
  ?????????? ┼─────────────────────────────────────────── R
             │ Data              ? |      ?               A
  ?????????? ┼─────────────────────────────────────────── M
             │ ▼ Stack           ? |      ?
  0x20007000 ┼───────────────────────────────────────────
             │ Unused
  0x20007000 ┴───────────────────────────────────────────
             .....
  0x00041000 ┬─────────────────────────────────────────── F
             │ App Flash      4016                        L
  0x00040050 ┼─────────────────────────────────────────── A
             │ Protected        80                        S
  0x00040000 ┴─────────────────────────────────────────── H
tock$

Looks good to me!

lschuermann · 2023-08-01T13:22:05Z

I just realized I never put the above diagram into the code. Sorry for invalidating your reviews (again) @bradjc @ppannuto.

lschuermann requested a review from ppannuto July 21, 2023 15:44

github-actions bot added the kernel label Jul 21, 2023

bradjc reviewed Jul 21, 2023

View reviewed changes

ppannuto previously approved these changes Jul 26, 2023

View reviewed changes

bradjc previously approved these changes Jul 31, 2023

View reviewed changes

bradjc added last-call Final review period for a pull request. P-Significant This is a substancial change that requires review from all core developers. labels Jul 31, 2023

kernel/process_standard: add diagram showing process memory slicing

a3fc554

lschuermann dismissed stale reviews from bradjc and ppannuto via a3fc554 August 1, 2023 13:21

bradjc approved these changes Aug 1, 2023

View reviewed changes

ppannuto approved these changes Aug 1, 2023

View reviewed changes

ppannuto added this pull request to the merge queue Aug 1, 2023

Merged via the queue into tock:master with commit f843d22 Aug 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

kernel/process_standard: ensure kernel_memory_break align & cleanup #3556

kernel/process_standard: ensure kernel_memory_break align & cleanup #3556

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kernel/process_standard: ensure kernel_memory_break align & cleanup #3556

kernel/process_standard: ensure kernel_memory_break align & cleanup #3556

Uh oh!

Conversation

Uh oh!

Pull Request Overview

Testing Strategy

TODO or Help Wanted

Documentation Updated

Formatting

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!