8000 Rework of Digest HIL by phil-levis · Pull Request #3041 · tock/tock · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Rework of Digest HIL #3041

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 47 commits into from
Jun 23, 2022
Merged

Rework of Digest HIL #3041

merged 47 commits into from
Jun 23, 2022

Conversation

phil-levis
Copy link
Contributor
@phil-levis phil-levis commented May 11, 2022

Pull Request Overview

This pull request reworks the Digest HIL, introducing four major changes:

  1. It allows a Digest to operate both on mutable and immutable data.This is so that one can calculate digests on read-only data (e.g., process binaries), as is needed for checking cryptographic credentials of processes without copying them into RAM first. Digest now has two methods to add data to a digest, add_data and add_mut_data. The callback client correspondingly has two callbacks add_data_done and add_mut_data_done.
  2. To support 1) above, it changes LeasableBuffer to take a &[u8] and adds a LeasableMutableBuffer which takes a &mut [u8].
  3. A call to add_data and add_mut_data is expected to add all of the data that is in the active slice of the corresponding LeasableBufffer or LeasableMutableBuffer. Accepting less is an error condition. Correspondingly, a successful call to one of these methods now returns Ok(()): the prior version returned Ok(usize) to return how many bytes would be added.
  4. It changes the calls and callbacks of Digest to use only LeasableBuffer and LeasableMutableBuffer rather than using both these types as well as slices. This allows an error condition to indicate how much data is remaining to be added.

Testing Strategy

This pull request has not been tested yet.

TODO or Help Wanted

This needs to be tested on OpenTitan.

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make prepush.

@github-actions github-actions bot added HIL This affects a Tock HIL interface. kernel sam4l Change pertains to the SAM4L MCU. WG-OpenTitan In the purview of the OpenTitan working group. labels May 11, 2022
@phil-levis phil-levis changed the title Digest imut Add Digest and LeasableBuffer traits that take non-mutable data May 11, 2022
@phil-levis phil-levis changed the title Add Digest and LeasableBuffer traits that take non-mutable data Add Digest traits and a LeasableBuffer that take non-mutable data May 11, 2022
@phil-levis
Copy link
Contributor Author

There was a lot of discussion on the core call today; there's general agreement that having two traits is undesirable, but there is not yet a better solution. So we agreed to discuss possible better approaches here on this PR before moving forward.

@hudson-ayers
Copy link
Contributor

I spent some time today thinking about a different design for asynchronously passing buffers to hardware that I think resolves some of the issues here. This design is partially inspired by the DMA design described here: https://docs.rust-embedded.org/embedonomicon/dma.html , but ours has the added challenge that we cannot rely on an assumption of futures / generators to simplify waiting on results, and instead have to support an event-driven state machine.

Step 1: Peripheral functions that currently take &'static mut [u8] despite not needing to actually mutate the underlying buffer are changed to be generic (just the function, not the entire peripheral) and instead take B: AsRef<[u8]> + 'static

//! kernel/src/hil/uart.rs
trait Transmit {
    fn transmit_buffer<B: AsRef<[u8]> + 'static>(
        &self,
        tx_buffer: B, // used to be &'static mut [u8]
        tx_len: usize,
    ) -> Result<BufferHandle<B>, (ErrorCode, B)>; // Notice different return type!
}

This is supported by a new type, BufferHandle, which knows the exact type of buffer that was passed to the underlying operation:

pub trait BufferHandleTrait {
    fn transfer_done(&self);
}

pub struct BufferHandle<B: AsRef<[u8]> + 'static> {
    buffer: B,
    finished: Cell<bool>,
}

impl<B: AsRef<[u8]> + 'static> BufferHandle {
    fn get_buffer(&self) -> Option<B> {
        if self.finished.get() {
            Some(B)
        } else {
            None
        }
    }
}

impl<B: AsRef<[u8]> + 'static> BufferHandleTrait for BufferHandle<B> {
    fn transfer_done(&self) {
        self.finished.set(true);
    }
}

The underlying code that actually passes the buffer to hardware (e.g. for a DMA operation) passes the address / length to the underlying hardware, then stashes the buffer (whatever its concrete type) in the buffer handle. The buffer handle ensures that the actual (Rust) buffer is not made available until its callback is called indicating the underlying hardware no longer will be accessing the buffer.

//! dma.rs
// sam4l transmit_buffer calls dma.prepare_transfer() to actually send the buffer, so this generic component is just forwarded
    pub fn prepare_transfer<B: AsRef<[u8]> + 'static>(
        &self,
        pid: DMAPeripheral,
        buf: B,
        mut len: usize,
    ) -> DMAHandle<B> {
        let maxlen = buf.as_ref().len()
            / match self.width.get() {
                DMAWidth::Width8Bit /*  DMA is acting on bytes     */ => 1,
                DMAWidth::Width16Bit /* DMA is acting on halfwords */ => 2,
                DMAWidth::Width32Bit /* DMA is acting on words     */ => 4,
            };
        len = cmp::min(len, maxlen);
        self.registers
            .mr
            .write(Mode::SIZE.val(self.width.get() as u32));

        self.registers.psr.set(pid);
        self.registers
            .marr
            .write(MemoryAddressReload::MARV.val(buf.as_ref()[0] as *const u8 as u32));
        self.registers
            .tcrr
            .write(TransferCounter::TCV.val(len as u32));

        self.registers.ier.write(Interrupt::TRC::SET);

        BufferHandle { buf, finished: Cell::new(false) }
    }

Now, we need a way to indicate to the BufferHandle that it can unlock once the transfer is done. Let's do this using the new BufferHandleTrait above, because we cannot allow our underlying peripheral to be generic over a parameter for which there will be multiple types used, so this access needs to occur via a vtable:

//! dma.rs
struct DMAChannel {
    // ... stuff
    // we used to store buf: &'static mut [u8] here, no more
    client: &'static dyn DMAClient, // same as before
    handle: &'static dyn BufferHandleTrait,
}

impl DMAChannel {
    pub fn handle_interrupt(&self) {
        self.registers
            .idr
            .write(Interrupt::TERR::SET + Interrupt::TRC::SET + Interrupt::RCZ::SET);
        let channel = self.registers.psr.get();

        self.client.map(|client| {
            client.transfer_done(channel); // normal client notification mechanism
        });
        self.handle.map(|handle| {
            handle.transfer_done(); // indicate to handle buffer can be safely retrieved
        });
    }
}

Finally, the capsule-level code that is calling the HIL is responsible for holding onto the handle, and retrieving its original buffer once the operation is complete:

//! console.rs (I haven't thought about virtualization yet, lets assume direct access)
struct Console<U:<
8000
/span> Uart + 'static> {
    uart: &'static U,
    rw_buffer: TakeCell<&'static mut [u8]>, // for printing from ram buf
    ro_buffer: TakeCell<&'static [u8]>, //for printing from flash buf
    rw_handle: OptionalCell<BufferHandle<&'static mut [u8]>>, 
    ro_handle: OptionalCell<BufferHandle<&'static [u8]>>,
}

fn command(cmd_num: u32) {
    match cmd_num {
        0 /* print from RAM buffer */ => {
            let rw_handle = self.uart.transmit_buffer(self.static_rw_buf.take());
            self.rw_handle.set(Some(rw_handle)); // store handle so we can retrieve buffer upon tx_complete
        }
        1 /* print from flash buffer */ => {
            let ro_handle = self.uart.transmit_buffer(self.static_ro_buf.take());
            self.ro_handle.set(Some(ro_handle)); // store handle so we can retrieve buffer upon tx_complete
        }
    }
}

impl UartClient for Console {
    transmit_complete() {
        if self.rw_handle.get().is_some() {
            self.rw_buffer.set(self.rw_handle.get().unwrap().get_buffer().unwrap());
        } else if self.ro_handle.get().is_some() {
            self.ro_buffer.set(self.ro_handle.get().unwrap().get_buffer().unwrap());
       }
    }
}

The beauty here is that only the top-level of code has to think about the fact that there will be multiple different types of buffers passed to the underlying peripheral, the underlying peripheral code just cares that whatever type it was passed can be used as a static read-only slice. Also, now the upper-layer code is responsible for holding onto buffers until DMA completes, but the interface of BufferHandle prevents any unsoundness resulting from the upper-layer code accessing the buffer before the dma completes, because the buffer is not actually made accessible until the DMA operation is finished.

Thoughts welcome!

@kupiakos @phil-levis

@lschuermann
Copy link
Member
lschuermann commented May 19, 2022

@hudson-ayers I think this sounds like a great design and might have the ability to solve many of the issues we face across these interfaces. Here's a few thoughts (I have not run this through an actual compiler, so might be wrong):

  1. I'm not sure whether we can create such a BufferHandle down in the driver. If I understand this proposal correctly, at its core is the fact that both the capsule and driver hold onto a shared ('static) and, in case of the driver, dyn reference of this wrapper type, such that each can keep hold of and use it across call stacks. However, if this is created in the call to start an operation with the driver we cannot then move the resulting BufferHandle into some cell (e.g. OptionalCell) and have the driver still be able to hold a 'static reference to it. More generally, once we have 'static dyn (and deliberately not mut!) reference to this BufferHandle struct stored in the driver once, storing it in a type such as OptionalCell will be impossible as a reference to its contents fundamentally cannot be 'static but is limited in scope, for instance through a closure.

    Instead, we should be able to just have a static number of &'static BufferHandles in the capsules. These BufferHandles can then be reused for operations with the respective peripherals.

  2. Given that the capsule will ultimately hold either ownership or a reference to a BufferHandle, there's nothing preventing it from calling transfer_done. This is dangerous, as it would allow it to get access to a (potentially mutable) buffer which might still be subject to a DMA operation. Thus we should protect this method using either unsafe or a capability.

  3. I understand this type is only to be used for DMA operations which do not mutate the memory in question, right? Otherwise, to solve the actual DMA unsoundness we have in our current approach (Violation of Rust's aliasing rules with DMA operations #2637), we would need to change the BufferHandle to not hold onto a slice, but the raw pointer and length instead.

Given the above thoughts (especially 1 + 2), we might actually get away without using monomorphism in the driver methods at all. A driver might really only need to have acess to a &'static dyn BufferHandleTrait, making it entirely analog to the way clients work currently. The capsule can then use its &'static BufferHandle<B> reference to exchange B at will as long as no operation is ongoing, whereas &'static dyn BufferHandleTrait can be used to get a &'a [u8] reference for synchronous operations, a (*const u8, usize) for asynchronous operations, and use it to signal the unsafe transfer_done().

Generally, I am really excited to see this! I will try to build a PoC of this slightly modified version, happy to collaborate on this @hudson-ayers!

@hudson-ayers
Copy link
Contributor
8000
  1. I'm not sure whether we can create such a BufferHandle down in the driver. If I understand this proposal correctly, at its core is the fact that both the capsule and driver hold onto a shared ('static) and, in case of the driver, dyn reference of this wrapper type, such that each can keep hold of and use it across call stacks. However, if this is created in the call to start an operation with the driver we cannot then move the resulting BufferHandle into some cell (e.g. OptionalCell) and have the driver still be able to hold a 'static reference to it. More generally, once we have 'static dyn (and deliberately not mut!) reference to this BufferHandle struct stored in the driver once, storing it in a type such as OptionalCell will be impossible as a reference to its contents fundamentally cannot be 'static but is limited in scope, for instance through a closure.
    Instead, we should be able to just have a static number of &'static BufferHandles in the capsules. These BufferHandles can then be reused for operations with the respective peripherals.

I have not fully tested an implementation of this in Tock yet, just played around with some toy code in the playground. I think I understand that the issue here is that the BufferHandle we are creating on the stack would not have a static lifetime, so the underlying dma code cannot store a static reference to it -- that is definitely a problem. I am not sure I understand your proposed solution though? Would an empty (static) BufferHandle have to also be passed into downcalls so that it could be filled with the actual buffer reference in the upcall?

  1. Given that the capsule will ultimately hold either ownership or a reference to a BufferHandle, there's nothing preventing it from calling transfer_done. This is dangerous, as it would allow it to get access to a (potentially mutable) buffer which might still be subject to a DMA operation. Thus we should protect this method using either unsafe or a capability.

Yeah I agree there should be a capability for calls to transfer_done()

  1. I understand this type is only to be used for DMA operations which do not mutate the memory in question, right? Otherwise, to solve the actual DMA unsoundness we have in our current approach (Violation of Rust's aliasing rules with DMA operations #2637), we would need to change the BufferHandle to not hold onto a slice, but the raw pointer and length instead.

Yes, this was just designed to address read-only DMA operations. For DMA operations that write into the passed buffer, there is already an implicit requirement that the DMA be operating on & 'static mut [u8].

Given the above thoughts (especially 1 + 2), we might actually get away without using monomorphism in the driver methods at all. A driver might really only need to have acess to a &'static dyn BufferHandleTrait, making it entirely analog to the way clients work currently. The capsule can then use its &'static BufferHandle<B> reference to exchange B at will as long as no operation is ongoing, whereas &'static dyn BufferHandleTrait can be used to get a &'a [u8] reference for synchronous operations, a (*const u8, usize) for asynchronous operations, and use it to signal the unsafe transfer_done().

I am not sure I understand the suggestion here, where in my design did I suggest monomorphization in the userspace driver code?

Generally, I am really excited to see this! I will try to build a PoC of this slightly modified version, happy to collaborate on this @hudson-ayers!

I would love to see a PoC of this! I am definitely happy to give feedback on what you come up with

@lschuermann
Copy link
Member

@hudson-ayers and I just talked about this. Indeed it appears to be an issue if we were to allocate the BufferHandle on the stack, or anywhere non-'static really. Thus we'd need to change it to be essentially a lockable optional cell: a capsule holds on to some &'static BufferHandles where it can swap out buffers freely. Once this is passed down to a peripheral driver, it will be locked, preventing the capsule from changing the buffer contents. A device driver must, after the respective operation completes, unlock this buffer before invoking the client callback, returning access to the buffer for the capsule.

I am not sure I understand the suggestion here, where in my design did I suggest monomorphization in the userspace driver code?

I was talking about monomorphization in the kernel's peripheral drivers. While the above restriction does make it possible for these drivers to accept a &'static dyn BufferHandle and avoid monomorphization, as @hudson-ayers pointed out, it may in many cases be beneficial to use monomorphization in order to allow inlining and compiler optimizations across call invocations (e.g. for buffer length checks). The good thing is that is can be decided on a per-driver basis, based on empirical measurments.

@bradjc
Copy link
Contributor
bradjc commented May 19, 2022

Do I have this right:

Create a new type HardwareBuffer that stores something that can provide a &[u8]. The HardwareBuffer then ensures that while the hardware has access to the buffer nothing else in rust can access the buffer.

?

@lschuermann
Copy link
Member

Create a new type HardwareBuffer that stores something that can provide a &[u8]. The HardwareBuffer then ensures that while the hardware has access to the buffer nothing else in rust can access the buffer.

Rather that the buffer cannot be swapped or mutated while the hardware is accessing it, and the peripheral is only handed a trait object reference to it, in order to abstract over multiple buffer types.

@bradjc
Copy link
Contributor
bradjc commented May 20, 2022

Create a new type HardwareBuffer that stores something that can provide a &[u8]. The HardwareBuffer then ensures that while the hardware has access to the buffer nothing else in rust can access the buffer.

Rather that the buffer cannot be swapped or mutated while the hardware is accessing it, and the peripheral is only handed a trait object reference to it, in order to abstract over multiple buffer types.

What prevents BufferHandle from mutating the buffer (while HW has access to it)? It feels like something here is still fundamentally unsafe, but Hudson's example isn't marking anything as unsafe.

@phil-levis
Copy link
Contributor Author
phil-levis commented May 20, 2022

What prevents BufferHandle from mutating the buffer (while HW has access to it)? It feels like something here is still fundamentally unsafe, but Hudson's example isn't marking anything as unsafe.

Hudson and I talked about this yesterday -- I agree, it seems problematic that both the creator and the holder of the handle can have a reference to the buffer. There's effectively a run-time check that you call transfer_done before you can get the buffer back, and that the creator (lower layer) doesn't hold on to the memory past them. One thought we discussed on making the borrow checker happy was using raw pointers, although this of course has its own issues. Plus I think it would preclude virtualization, which necessitates two handles.

I personally don't think Hudson's approach will work, but it's definitely closer/better than previous approaches.(this is relevant only for the early version of the idea, not the updated one written by @lschuermann).

One other point I brought up with Hudson is that this really is a narrow case where we need 2 variants of a trait. My personal take is that generalization for a bounded 2 cases isn't necessary. If we had >2 cases, or a not precisely bounded set, then we want to generalize (e.g., as with time::Ticks, ).

@hudson-ayers
Copy link
Contributor

What prevents BufferHandle from mutating the buffer (while HW has access to it)? It feels like something here is still fundamentally unsafe, but Hudson's example isn't marking anything as unsafe.

unlock() would need to be unsafe, such that it could only be called by the peripheral

@phil-levis
Copy link
Contributor Author
phil-levis commented May 20, 2022

The place where we ran into the mut/imut problem before was RSA keys:

#2839

The resolution there was to have two traits: it doesn't generalize in the same way as AsRef, but it possibly could...

@phil-levis
Copy link
Contributor Author

unlock() would need to be unsafe, such that it could only be called by the peripheral

We've re-invented mutexes! Transitioning from compile-time to run-time checks is definitely a drawback. It might be worth the code de-duplication, though.

That being said, I don't think we want to use unsafe for this purpose. Other code could unlock it as well, just not capsules. We want to be sure that only the locker can unlock it. Could we do this with a functional approach (lock and unlock change the type, stored in an enum)?

@phil-levis
Copy link
Contributor Author
phil-levis commented May 21, 2022

After going down the two traits path for a while, I decided it isn't workable. With RSA it was OK because the traits were pretty narrow. For digests, however, there are many variants; we have capsules like virtual_digest that is simultaneously an HMAC and SHA.

Unlike with RSA keys, it seems restrictive to require a client to choose mutable or immutable data. It's completely reasonable to compute a single hash over data in flash and data in RAM.

The current implementation therefore puts mutable and immutable in a single trait. There's an add_data and an add_mut_data, and two callbacks. To better support this with 8000 LeasableBuffer types, there's now a LeasableBufferDynamic, which behaves similarly to a MutImutBuffer: it isn't exposed in APIs but can be used internally for storing state. It can store either a LeasableBuffer or a LeasableMutableBuffer.

This implementation feels pretty clean to me; the only place where the logic required a refactoring was in the Hmac peripheral for lowrisc. At the bottom level, it borrows the type of leasable buffer as a &dyn Index<usize, Output = u8>, so it can be called with either type. Since the lowrisc implementation has a FIFO and not DMA, the borrow with an anonymous lifetime works.

@phil-levis phil-levis changed the title Add Digest traits and a LeasableBuffer that take non-mutable data Add support for Digest to accept data from RAM (mutable) or Flash (immutable) May 21, 2022
hudson-ayers
hudson-ayers previously approved these changes Jun 14, 2022
@phil-levis phil-levis added the last-call Final review period for a pull request. label Jun 14, 2022
@lschuermann
Copy link
Member

For what its worth, the mutable/immutable buffer passing approach has been a topic of the last core team call. In essence, Hudson, Alyssa and I agree that this is a good solution to solve the particular issue of passing mutable/immutable buffers down the Digest HIL, and unblocks the signature verification work, while we should continue to explore more integrated buffer passing APIs in the near future:

## HMAC and Digest HIL - Mutable/Immutable Buffer Approach
- Hudson: Phil wanted to mention that he has published an updated HMAC HIL and
within that PR a new TRD for Digest. This represents Phil's approach to
calculating digests over that located in flash, memory, or partially in both
regions. This is important to check the integrity of processes which have
their code stored in flash, to avoid copying the entire process.
This might be at odds with the approach Leon and I have been pursuing. Phil's
reasoning is likely that agreeing on his approach would unblock the Digest and
HMAC HILs. But perhaps Leon's codebase is in a state where we can also talk
about using that instead.
- Leon: Been working on Miri tests on the new approach, testing it in a way
which resembles typical usage in Tock. Have not noticed any unsoundness
yet. Needs more work.
In a previous call, at the very end, we talked about a good compromise: Phil's
approach is really good at resolving this single issue in the Digest / HMAC
scenario. Hudson's and my approach seeks to solve more issues, such as the
current DMA buffer unsoundness, LeasableBuffer integration, etc., all in a
single solution and type infrastructure.
Our approach still seems viable and nice, but it needs more time. It seems
trivial to migrate Phil's approach to ours at the given time. I'd be happy
with merging Phil's approach first and then migrating later.
- Hudson: I agree. I still prefer our approach, but I am sympathetic that Phil
is blocked on this. It does not seem like merging his version now is going to
cause trouble when we want to migrate it to our approach.
- Leon: Perfect is the enemy of good. Once Phil's approach is in, the first step
to supporting mutable & immutable buffers in a single API is done. Our
approach will be an improvement on that interface, if it works. Just having
Phil's approach in, though, I going to provide justification to continue
working on this, and extending to existing subsystems.
- Alyssa: Also comes with an example on how the existing API is used in
practice, which is helpful.

lschuermann
lschuermann previously approved these changes Jun 14, 2022
Copy link
Member
@lschuermann lschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a careful pass through the TRD and skimmed over the rest, looks good generally, thanks @phil-levis! Just some minor notes on the TRD and the comment regarding LeasableBuffer.

@phil-levis
Copy link
Contributor Author

For what its worth, the mutable/immutable buffer passing approach has been a topic of the last core team call. In essence, Hudson, Alyssa and I agree that this is a good solution to solve the particular issue of passing mutable/immutable buffers down the Digest HIL, and unblocks the signature verification work, while we should continue to explore more integrated buffer passing APIs in the near future:

Sounds good to me. I do think the mutable/immutable tension has pretty limited scope, basically for digests and public keys. But the DMA issue is a deeper and separate one. Having a single simple approach that solves both is of course better than two different ones. The Digest HIL in this PR is definitely intended to be in-flux, and I'm open to improvements.

@phil-levis phil-levis dismissed stale reviews from lschuermann and hudson-ayers via e32d9b0 June 14, 2022 22:10
hudson-ayers
hudson-ayers previously approved these changes Jun 14, 2022
bradjc
bradjc previously approved these changes Jun 17, 2022
@phil-levis
Copy link
Contributor Author

I've tested the OpenTitan HMAC peripheral, it now passes tests. @alistair23

@phil-levis
Copy link
Contributor Author

@alistair23 Can you re-review and clear your changes requested?

@phil-levis phil-levis dismissed alistair23’s stale review June 23, 2022 17:25

Tested on OpenTitan using Verilator; passes tests.

@phil-levis
Copy link
Contributor Author

bors r+

@bors
Copy link
Contributor
bors bot commented Jun 23, 2022

@bors bors bot merged commit 1782d7e into master Jun 23, 2022
@bors bors bot deleted the digest_imut branch June 23, 2022 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
HIL This affects a Tock HIL interface. kernel last-call Final review period for a pull request. sam4l Change pertains to the SAM4L MCU. WG-OpenTitan In the purview of the OpenTitan working group.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0