10000 capsules: virtual_uart: avoid double receive, check rx len, support custom buffer by bradjc · Pull Request #1757 · tock/tock · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

capsules: virtual_uart: avoid double receive, check rx len, support custom buffer #1757

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 24, 2020

Conversation

bradjc
Copy link
Contributor
@bradjc bradjc commented Apr 13, 2020

Pull Request Overview

Unexpected behavior: Running the process console causes uart.receive_abort to be called after every character.

Cause: start_receive gets called twice, once with a rx_buffer, and the second time without since it is being used after the first call. The first call happens because process console starts a new receive after handling its current read, and the second call happens because uart mux calls start_receive on its client's behalf after the callback.

Fix: Make sure that the buffer used by virtual_uart for underlying UART receives isn't replaced before client callbacks are completed. This causes all receives in callbacks to be queued until they have all completed and the virtual_uart receive handler can determine the shortest read to issue.

Other Changes

  • Do not assume the buffer used for virtual_uart receive is the default one, and instead use the actual buffer length for the maximum receive length.
  • Check for ESIZE when a client issues a read.

This pull request adds/changes/fixes...

Testing Strategy

Running process console on hail and verifying it works.

Added a debug statement to sam4l::usart::abort and it no longer is printed.

Tried the imix/src/test/virtual_uart_rx_test on hail and it works as expected.

TODO or Help Wanted

n/a

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make formatall.

@bradjc bradjc changed the title capsules: virt_uart: avoid double receive capsules: virtual_uart: avoid double receive Apr 13, 2020
@ppannuto ppannuto requested a review from phil-levis April 13, 2020 20:03
alistair23
alistair23 previously approved these changes Apr 13, 2020
@bradjc bradjc mentioned this pull request Apr 17, 2020
69 tasks
@phil-levis
Copy link
Contributor

Can you document when this happens? Amit noted that it happens on some boards and not others. So I want to understand that's causing that discrepancy. My read-through of the change and code suggests that it is OK, but this is something that needs a lot of testing. The virtual UART receive path is pretty tricky logic with edge cases.

Copy link
Contributor
@phil-levis phil-levis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide a test case when this happens. I want to understand why it happens in some cases and not in others. The virtual receive logic is tricky and has to handle a bunch of edge cases. I'm worried that this patch will cause it to miss one.

@phil-levis
Copy link
Contributor

One concern -- it's important that the read not happen directly in the read_done callback. I.e., a client issuing a read in its callback is not serviced immediately. The reason is that might be an outstanding shorter read.

I.e.,

client A requests read of 10
client B requests read of 12
10 bytes are read
A signaled read_done
client A requests read of 10
2 bytes are read
B signaled read_done
8 bytes area read
A signaled read_done

If the second A call stops the 2 byte read this is a bug.

@pftbest
Copy link
pftbest commented Apr 17, 2020

I posted this example code on slack:

https://gist.github.com/pftbest/c5d89849e22d103278a104704c4624cb

Compile it for desktop, no device required.

It has only one reader, but every call to receive is followed by an abort call.

@bradjc
Copy link
Contributor Author
bradjc commented Apr 20, 2020

One test case is just to run the process_console with no apps on master. To see the abort calls, I added this patch:

diff --git a/chips/sam4l/src/usart.rs b/chips/sam4l/src/usart.rs
index e7b7fd4f4..cb82195da 100644
--- a/chips/sam4l/src/usart.rs
+++ b/chips/sam4l/src/usart.rs
@@ -12,6 +12,7 @@ use kernel::hil;
 use kernel::hil::spi;
 use kernel::hil::uart;
 use kernel::ReturnCode;
+use kernel::debug_gpio;

 use crate::dma;
 use crate::pm;
@@ -865,6 +866,9 @@ impl uart::Receive<'a> for USART<'a> {

     fn receive_abort(&self) -> ReturnCode {
         let usart = &USARTRegManager::new(&self);
+
+        debug_gpio!(0, toggle);
+
         self.disable_rx_timeout(usart);
         self.abort_rx(usart, ReturnCode::ECANCEL, uart::Error::Aborted);
         ReturnCode::EBUSY

@phil-levis
Copy link
Contributor
phil-levis commented Apr 23, 2020

I forgot to bring an imix home when Stanford shut down. I do have a launchxl, but the instructions to build it do not work (make flash-ccfg fails). So I have not tested this. But...

I think this bug fix is incorrect. I'd appreciate if the "Testing" section of the pull request reported running the UART virtualization tests.

Let's call the read that comes in during the callback the new read and any that are already occurring the pending reads.

The problem is that start_receive is being called twice. This should not happen. In particular, in the case that @bradjc describes, the new read should not result in an underlying read being issued. It is important that the virtualizer looks at all of the reads -- new new and pending -- be considered for the shortest one (the edge case I described above). Instead, the new read should just be added to the queue, then later, its length should be considered when figuring out how long a read to issue.

Why is it being called twice? We set the receive_pending flag -- why can the new read start an underlying read?

The problem is that whether you can start an underlying read is controlled by the presence of buffer. This is replaced before we issue the callback (line 118), and so when the new read comes in, it's there, and so the new read starts immediately (line 118). This replacement of buffer should occur when completing_read is set to false (line 155).

I think there's another lingering bug. If the new read is the shortest read, its length is not considered in the length of the next read to perform. So there also needs to be a line added to the if clause in lines 132-134, which sets next_read_len to min(next_read_len, device.rx_len.get()).

So, in short, I think:

  • If you apply the patch as is, then reads issued in callbacks can cause other clients to miss bytes. If another client has a short read, and a read issued in a callback makes a long read, the underlying UART will issue a long read. The client with a short read will miss all of the bytes in the long read after its own. E.g., if the short read is 5 bytes and the long one is 120, the short read will have a receive of 5 bytes signaled after 120, and miss bytes 5-119.
  • If you apply the changes I suggest, then this means that when a read is issued in a callback, its enqueued but not executed. When the loop across all devices completes, the virtualizer will issue a read whose length is the shortest outstanding read length.

I'll try to figure out a way to get an imix.

@bradjc
Copy link
Contributor Author
bradjc commented Apr 23, 2020

If another client has a short read, and a read issued in a callback makes a long read, the underlying UART will issue a long read. The client with a short read will miss all of the bytes in the long read after its own. E.g., if the short read is 5 bytes and the long one is 120, the short read will have a receive of 5 bytes signaled after 120, and miss bytes 5-119.

Ok yes that makes sense. This still needs to be addressed.

However, I don't think receive.receive_buffer can only enqueue, since if the receive call is the first and only call to receive, there won't be another call to actually start the receive. A deferred call could provide the callback needed to actually start the receive, but this would be redundant if the receive.receive_buffer() stems from a callback.

So it seems like either:

  • Having receive.receive_buffer() enqueue a receive and then setup a deferred call to check all of the enqueued receives, and then issue the smallest one. I think there could be a state variable which skips the deferred call if the callback iter is happening, but this seems a little difficult to reason about.
  • Enhancing the MuxUart.start_receive() function. It would still actually issue the first receive to the hardware, but on subsequent reads it would check if the new read is smaller, and only then call abort(), and otherwise keep the existing logic plus what is currently in this PR.

would address the long/short receive issue.

@phil-levis
Copy link
Contributor

However, I don't think receive.receive_buffer can only enqueue, since if the receive call is the first and only call to receive, there won't be another call to actually start the receive.

I think it can? The end of received_buffer starts the next receive:

self.start_receive(next_read_len);

read_pending is set here:

read_pending = true;

bradjc added 3 commits April 24, 2020 10:08
Unexpected behavior: Running the process console causes
uart.receive_abort to be called after every character.

Cause: start_receive gets called twice, once with a rx_buffer, and the
second time without since it is being used after the first call. The
first call happens because process console starts a new receive after
handling its current read, and the second call happens because uart mux
calls start_receive on its client's behalf after the callback.

Fix: Remove the mux calling start_receive on its client's behalf.
This fixes the issue where receives issued from callbacks were always
resulting in abort() calls by not replacing the underlying UART receive
buffer until after all of the client callbacks ha
8000
ve finished. This
"flag", in conjunction with the `completing_read` bool are what
start_receive uses to determine whether to start or abort a receive.

This also fixes determining the minimum length of a receive by
calculating it in all cases where another receive is needed.
Without this check, the virtual uart will continue issuing reads to
receive rx_len bytes, even if the buffer is not big enough to store
them.
@bradjc bradjc force-pushed the virtual-uart-no-double-read branch from 2ad8e51 to 6334749 Compare April 24, 2020 14:46
@bradjc bradjc changed the title capsules: virtual_uart: avoid double receive capsules: virtual_uart: avoid double receive, check rx len, support custom buffer Apr 24, 2020
@bradjc
Copy link
Contributor Author
bradjc commented Apr 24, 2020

Ok, discussed the fix with @phil-levis, and updated now. At a high level, the fix is to properly enqueue all receive requests issued during a receive callback until all of the callbacks have finished. Before, the first receive issued from a callback was improperly getting issued to the hardware. Receive requests issued at other times when there is not an active underlying receive are not enqueued, and issue immediately, as expected.

@phil-levis
Copy link
Contributor

Can you run the virtual UART test?

@bradjc
Copy link
Contributor Author
bradjc commented Apr 24, 2020

I ran the virtual uart test on hail and it works just as expected.

@bradjc bradjc merged commit 6ec37b2 into master Apr 24, 2020
@bors bors bot deleted the virtual-uart-no-double-read branch April 24, 2020 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0