8000 Update OpenTitan to `Earlgrey-M2.5.2-RC0` by cfrantz · Pull Request #3586 · tock/tock · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Update OpenTitan to Earlgrey-M2.5.2-RC0 #3586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 8, 2023

Conversation

cfrantz
Copy link
Contributor
@cfrantz cfrantz commented Aug 1, 2023

Pull Request Overview

This pull request updates OpenTitan to refer to Earlgrey-M2.5.2-RC0. This tag represents the version of the OpenTitan codebase that will go into the first OpenTitan chip.

Testing Strategy

This pull request was tested by booting the tock kernel on a CW310 FPGA board with the Earlgrey-M2.5.2-RC0 bitstream.

TODO or Help Wanted

Documentation Updated

  • This PR deletes most of boards/opentitan/README.md and leaves TODOs to write new documentation.

Formatting

  • Ran make prepush.

@github-actions github-actions bot added the WG-OpenTitan In the purview of the OpenTitan working group. label Aug 1, 2023
lschuermann
lschuermann previously approved these changes Aug 1, 2023
Copy link
Member
@lschuermann lschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially none of the getting started / set up instructions previously contained in the opentitan/earlgrey-cw310 crate's README work on the tape-out revision of the upstream OpenTitan repository (or on any recent development revision).

This PR is a net-positive, as it gets us closer to having upstream Tock work on the "frozen" OpenTitan chip. From talking to @cfrantz, it is the first in a series of PRs to align with this fixed hardware revision, and it makes sense to me to finalize the usage instructions once Tock is confirmed to work on that fixed target.

We should eventually resurrect & update some of the instructions around app loading & OTBN support, once those are confirmed to be working as well.

jrvanwhy
jrvanwhy previously approved these changes Aug 1, 2023
ppannuto
ppannuto previously approved these changes Aug 1, 2023
Copy link
Member
@ppannuto ppannuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a quick glance, it doesn't look like the linked OT guide has any directions for running Tock (yet?), but in the long term, if it helps avoid duplication and makes sure things don't fall out of date here, it's perfectly fine for this readme to just link to specific sections of OT resources as-appropriate.

Copy link
Contributor
@alistair23 alistair23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you are just deleting all of the documentation to help users get started. We should leave these in as they explain to people how to run Tock on OpenTitan. If any changes are required let's just make the changes directly instead of deleting everything

Comment on lines 38 to 63
### Setup OpenTitan

```shell
git clone https://github.com/lowRISC/opentitan.git
cd opentitan

# Use the OpenTitan_SHA currently supported by Tock
git checkout <OpenTitan_SHA>
pip3 install --user -r python-requirements.txt
```
Make sure to follow [OpenTitan getting started instructions](https://docs.opentitan.org/doc/getting_started/) to setup required dependencies/toolchains.

### **Fedora dependencies quick install**

Note: the OpenTitan documentation provides an easy installation for packages for Ubuntu based distributions. This is an equivalent command to install the (mostly) same packages for Fedora users.

```shell
sudo dnf install autoconf bison make automake gcc gcc-c++ kernel-devel \
clang-tools-extra clang cmake curl \
doxygen flex g++ git golang lcov elfutils-libelf \
libftdi libftdi-devel ncurses-compat-libs openssl-devel \
systemd-devel libusb redhat-lsb-core \
make ninja-build perl pkgconf python3 python3-pip python3-setuptools \
python3-urllib3 python3-wheel srecord tree xsltproc zlib-devel xz clang-tools-extra \
clang11-libs clang-devel elfutils-libelf-devel
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should leave this in, it's currently tested and works

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covered by OpenTitan's getting started guide.

With the recent adoption of PEP-668 by Debian distributions, the pip3 install line is out-of-date (to be fair, it's also out-of-date in the OT docs, but we should really be referring to a single source-of-truth concerning system setup).

I'd prefer to see the Fedora blurb get added to OT's documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, we can drop the Setup OpenTitan section and point to the OT docs.

I agree the Fedora steps being in OT official docs would be nice. Let's leave them here until that happens

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 65 to 94
ChipWhisper CW310
-----------------

To use `make flash` you first need to clone the OpenTitan repo and ensure that
the Python dependencies are installed.

Then you need to build the OpenTitan tools:

```shell
./bazelisk.sh build //sw/host/opentitantool
```

You might need to run these commands to get it to work

```shell
ln -s /usr/bin/ld.lld /usr/sbin/ld.lld
ln -s /usr/bin/gcc /usr/sbin/gcc
```

Next connect to the board's serial with a second terminal:

```shell
screen /dev/ttyACM1 115200,cs8,-ixon,-ixoff
```

Then you need to flash the bitstream with:


```shell
./bazel-bin/sw/host/opentitantool/opentitantool.runfiles/lowrisc_opentitan/sw/host/opentitantool/opentitantool --interface=cw310 fpga load-bitstream lowrisc_systems_chip_earlgrey_cw310_0.1.bit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we should leave these steps in, why would we want to remove them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the documentation changes seem too controversial, I've reverted them. We can create a new PR or issue to discuss changing the docs.

@alistair23
Copy link
Contributor

This PR is a net-positive, as it gets us closer to having upstream Tock work on the "frozen" OpenTitan chip. From talking to @cfrantz, it is the first in a series of PRs to align with this fixed hardware revision, and it makes sense to me to finalize the usage instructions once Tock is confirmed to work on that fixed target.

Considering the only change to get Tock running on the latest hardware is a linker script change we should just make the linker script change and SHA change and start from there.

Also, can you run the unit tests. Tock has an extensive range of tests for OpenTitan, we don't want to regress those.

@twilfredo
Copy link
Contributor
twilfredo commented Aug 3, 2023
  1. It looks like we need a few more changes to catchup with this release. Running this build on Verilator triggers a StoreFault. It looks like were accessing out of region RAM. @alistair23 This isn't particular to the AES test either, any ideas?
I00001 test_rom.c:243] Test ROM complete, jumping to flash (addr: 20000400)!
Unable to find otbn-rsa, disabling RSA support
OpenTitan initialisation complete. Entering main loop
check run AES128 CBC... 
aes_test passed (CBC Enc Src/Dst)
aes_test passed (CBC Enc In-place)
aes_test passed (CBC Dec Src/Dst)
aes_test passed (CBC Dec In-place)
    [ok]
check run AES128 CCM... 
AES CCM* encryption/decryption tests


panicked at 'fatal exception: StoreFault: 0x1000062c', chips/earlgrey/src/chip.rs:309:13
        Kernel version release-2.1-1396-g74e3967c2

---| No debug queue found. You can set it with the DebugQueue component.

---| OpenTitan Earlgrey configuration for sim_verilator |---
---| RISC-V Machine State |---
Last cause (mcause): Store/AMO access fault (interrupt=0, exception code=0x00000007)
Last value (mtval):  0x1000062C

System register dump:
 mepc:    0x20005AF4    mstatus:     0x00001880
 mcycle:  0x0014040F    minstret:    0x0009FB52
 mtvec:   0x20000501
 mstatus: 0x00001880
  uie:    false  upie:   false
  sie:    false  spie:   false
  mie:    false  mpie:   true
  spp:    false
 mie:   0x00000808   mip:   0x00000000
  usoft:  false               false 
  ssoft:  false               false 
  msoft:  true                false 
  utimer: false               false 
  stimer: false               false 
  mtimer: false               false 
  uext:   false               false 
  sext:   false               false 
  mext:   true                false 
 ePMP regions:
  [0]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [1]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [2]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [3]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [4]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [5]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [6]: addr=0x00000000, end=0x40000000, cfg=0x0 (OFF) (----)
  [7]: addr=0x40000000, end=0x49000000, cfg=0x8B (TOR) (lrw-)
  [8]: addr=0x49000000, end=0x100092A4, cfg=0x0 (OFF) (----)
  [9]: addr=0x100092A4, end=0x10010000, cfg=0x8B (TOR) (lrw-)
  [10]: addr=0x10010000, end=0x20060000, cfg=0x0 (OFF) (----)
  [11]: addr=0x20060000, end=0x20090000, cfg=0x8B (TOR) (lrw-)
  [12]: addr=0x20090000, end=0x20000000, cfg=0x0 (OFF) (----)
  [13]: addr=0x20000000, end=0x2002BCA8, cfg=0x8D (TOR) (lr-x)
  [14]: addr=0x2002BCA8, end=0x10000650, cfg=0x0 (OFF) (----)
  [15]: addr=0x10000650, end=0x100092A4, cfg=0x8B (TOR) (lrw-)

---| App Status |---
panicked at 'fatal exception: StoreFault: 0x1000062c', chips/earlgrey/src/chip.rs:309:13
  1. This needs a few updates to the Makefile, updating the static_critical section etc. I have made some patches here, still WIP. https://github.com/twilfredo/tock/commits/wilfred/bump-ot

@alistair23
Copy link
Contributor

It seems to be outside a PMP region, we probably need to change this region to start a little earlier. The best bet is to have a look at the elf symbols and see what we should use instead

[15]: addr=0x10000650, end=0x100092A4, cfg=0x8B (TOR) (lrw-)

@cfrantz
Copy link
Contributor Author
cfrantz commented Aug 3, 2023

panicked at 'fatal exception: StoreFault: 0x1000062c', chips/earlgrey/src/chip.rs:309:13
Kernel version release-2.1-1396-g74e3967c2
[15]: addr=0x10000650, end=0x100092A4, cfg=0x8B (TOR) (lrw-)

The store fault location is beyond the end of the stack (ie: stack overflow). I added a bit more stack, after which all tests passed except the AES test (different failure) and the flash_ctrl test.

Earlgrey-M2.5.2-RC0 is the version of OpenTitan that will be in the
first Earlgrey chip.

Signed-off-by: Chris Frantz <cfrantz@google.com>
@cfrantz cfrantz force-pushed the ot-earlgrey-es-manifest branch 2 times, most recently from 75684eb to 5e273bb Compare August 3, 2023 22:27
@cfrantz
Copy link
Contributor Author
cfrantz commented Aug 3, 2023
  1. This needs a few updates to the Makefile, updating the static_critical section etc. I have made some patches here, still WIP. https://github.com/twilfredo/tock/commits/wilfred/bump-ot

Your Makefile changes look good to me and I've pulled them into this PR.

The changes to the RAM layout to preserve the static_critical section aren't necessarily needed. That section is inteded to be used so that the ROM can hand off execution to the ROM_EXT and the ROM_EXT can verify assertions made by the ROM. It is possible for the owner-level code (the OS) to again validate those assertions, but considering the size of the static_critical section (nearly 8K), the most likely scenario would be to validate the assertions in some pre-kernel init code and either (a) panic if the assertions are wrong or (b) discard the section and make the RAM available to the OS.

I'd suggest we defer dealing with the static_critical section until after the ROM_EXT is in development and there is a formal spec stating how the static_critical section is intended to be used in the hand-off from ROM_EXT to owner code.

@twilfredo
Copy link
Contributor
  1. This needs a few updates to the Makefile, updating the static_critical section etc. I have made some patches here, still WIP. https://github.com/twilfredo/tock/commits/wilfred/bump-ot

Your Makefile changes look good to me and I've pulled them into this PR.

The changes to the RAM layout to preserve the static_critical section aren't necessarily needed. That section is inteded to be used so that the ROM can hand off execution to the ROM_EXT and the ROM_EXT can verify assertions made by the ROM. It is possible for the owner-level code (the OS) to again validate those assertions, but considering the size of the static_critical section (nearly 8K), the most likely scenario would be to validate the assertions in some pre-kernel init code and either (a) panic if the assertions are wrong or (b) discard the section and make the RAM available to the OS.

I'd suggest we defer dealing with the static_critical section until after the ROM_EXT is in development and there is a formal spec stating how the static_critical section is intended to be used in the hand-off from ROM_EXT to owner code.

Ah did not know that, sounds good then.

@twilfredo
Copy link
Contributor

panicked at 'fatal exception: StoreFault: 0x1000062c', chips/earlgrey/src/chip.rs:309:13
Kernel version release-2.1-1396-g74e3967c2
[15]: addr=0x10000650, end=0x100092A4, cfg=0x8B (TOR) (lrw-)

The store fault location is beyond the end of the stack (ie: stack overflow). I added a bit more stack, after which all tests passed except the AES test (different failure) and the flash_ctrl test.

Running the newer changes on Verilator (with the stack size increase), I'm still seeing the overflow. Were your tests on HW?

I00000 test_rom.c:160] kChipInfo: scm_revision=61abdf5c
I00001 test_rom.c:243] Test ROM complete, jumping to flash (addr: 20000400)!
Unable to find otbn-rsa, disabling RSA support
OpenTitan initialisation complete. Entering main loop
check run AES128 CBC... 
aes_test passed (CBC Enc Src/Dst)
aes_test passed (CBC Enc In-place)
aes_test passed (CBC Dec Src/Dst)
aes_test passed (CBC Dec In-place)
    [ok]
check run AES128 CCM... 
AES CCM* encryption/decryption tests


panicked at 'fatal exception: StoreFault: 0x1000062c', chips/earlgrey/src/chip.rs:309:13
	Kernel version release-2.1-1397-ge5b09c8c8

---| No debug queue found. You can set it with the DebugQueue component.

---| OpenTitan Earlgrey configuration for sim_verilator |---
---| RISC-V Machine State |---
Last cause (mcause): Store/AMO access fault (interrupt=0, exception code=0x00000007)
Last value (mtval):  0x1000062C

System register dump:
 mepc:    0x20005AF4    mstatus:     0x00001880
 mcycle:  0x0014040F    minstret:    0x0009FB52
 mtvec:   0x20000501
 mstatus: 0x00001880
  uie:    false  upie:   false
  sie:    false  spie:   false
  mie:    false  mpie:   true
  spp:    false
 mie:   0x00000808   mip:   0x00000000
  usoft:  false               false 
  ssoft:  false               false 
  msoft:  true                false 
  utimer: false               false 
  stimer: false               false 
  mtimer: false               false 
  uext:   false               false 
  sext:   false               false 
  mext:   true                false 
 ePMP regions:
  [0]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [1]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [2]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [3]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [4]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [5]: addr=0x00000000, end=0x00000000, cfg=0x0 (OFF) (----)
  [6]: addr=0x00000000, end=0x40000000, cfg=0x0 (OFF) (----)
  [7]: addr=0x40000000, end=0x49000000, cfg=0x8B (TOR) (lrw-)
  [8]: addr=0x49000000, end=0x100092A4, cfg=0x0 (OFF) (----)
  [9]: addr=0x100092A4, end=0x10020000, cfg=0x8B (TOR) (lrw-)
  [10]: addr=0x10020000, end=0x20060000, cfg=0x0 (OFF) (----)
  [11]: addr=0x20060000, end=0x20090000, cfg=0x8B (TOR) (lrw-)
  [12]: addr=0x20090000, end=0x20000000, cfg=0x0 (OFF) (----)
  [13]: addr=0x20000000, end=0x2002BCA8, cfg=0x8D (TOR) (lr-x)
  [14]: addr=0x2002BCA8, end=0x10000650, cfg=0x0 (OFF) (----)
  [15]: addr=0x10000650, end=0x100092A4, cfg=0x8B (TOR) (lrw-)

---| App Status |---
panicked at 'fatal exception: StoreFault: 0x1000062c', chips/earlgrey/src/chip.rs:309:13

@twilfredo
Copy link
Contributor
twilfredo commented Aug 4, 2023

It seems to be outside a PMP region, we probably need to change this region to start a little earlier. The best bet is to have a look at the elf symbols and see what we should use instead

[15]: addr=0x10000650, end=0x100092A4, cfg=0x8B (TOR) (lrw-)

Yeah, in main.rs just experimenting with this change

diff --git a/boards/opentitan/src/main.rs b/boards/opentitan/src/main.rs
index cb7b83296..46727251b 100644
--- a/boards/opentitan/src/main.rs
+++ b/boards/opentitan/src/main.rs
@@ -696,9 +696,12 @@ unsafe fn setup() -> (
     let mut mpu_config = rv32i::epmp::PMPConfig::kernel_default();
 
     // The kernel stack, BSS and relocation data
+    let s_start: *const u8 = ((&_sstack as *const u8 as usize) - 0x64 as usize) as *const u8;
     chip.pmp
         .allocate_kernel_region(
-            &_sstack as *const u8,
+            s_start,
             &_ezero as *const u8 as usize - &_sstack as *const u8 as usize,
             mpu::Permissions::ReadWriteOnly,
             &mut mpu_config,

The storefault occurs just outside of the newly allocated stack region at fatal exception: StoreFault: 0x100005e8. So somethings causing an overflow I suppose.

@cfrantz
Copy link
Contributor Author
cfrantz commented Aug 4, 2023

The storefault occurs just outside of the newly allocated stack region at fatal exception: StoreFault: 0x100005e8. So somethings causing an overflow I suppose.

I'm sorry: I forgot to push the stack size change. Could you try giving one more kilobyte of stack space in main.rs (change from 0x1000 bytes to 0x1400 bytes) and see if that helps?

@twilfredo
Copy link
Contributor

The storefault occurs just outside of the newly allocated stack region at fatal exception: StoreFault: 0x100005e8. So somethings causing an overflow I suppose.

I'm sorry: I forgot to push the stack size change. Could you try giving one more kilobyte of stack space in main.rs (change from 0x1000 bytes to 0x1400 bytes) and see if that helps?

Ah right! cool! that seems to have gotten around the overflow.

@twilfredo
Copy link
Contributor

#3590 is a fix for the flash failures (unrelated to the OT bump). These tests previously used to run in the order listed (rust update?). Which is required, since we lock some regions for testing access permissions.

cfrantz and others added 4 commits August 4, 2023 17:16
- Update the manifest layout in the linker scripts to match the
  expectations of the OpenTitan `test_rom`.
- Fix missing _manifest symbol board initialization code.
- Add the manifest extension table.
- Add the `manifest_version` field.
- Update the RAM size: OpenTitan has 128K of SRAM.
- Eliminate the referense to the non-existent
  `.riscv.open_titan_trap_vectored` section.

Co-authored-by: Michał Mazurek <maz@semihalf.com>
Signed-off-by: Chris Frantz <cfrantz@google.com>
Signed-off-by: Michał Mazurek <maz@semihalf.com>
Adds required changes to build against opentitan Earlgrey-M2.5.2-RC0.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
The tests overflow the stack; give some more space so they can run.

Signed-off-by: Chris Frantz <cfrantz@google.com>
These tests need to run in a particular order, because we are locking
certain flash regions (sticky) for negative testing. The test-runner
seems to invoke them alphabetically by fn name.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
The AES tests are order-sensitive.

The CCM and GCM tests depend on the VirtualAES128 peripheral
configured during kernel initialization in `main.rs`.  In order
to operate correctly, the virtual peripheral depends on the
`set_client` calls to the low-level AES driver to configure the
virtual peripheral correctly.

The ECB, CBC and CTR tess, on the other hand, call `set_client`
on the AES driver, thus breaking the VirtualAES128 configuration.

Run the CCM and GCM tests first, then run the ECB, CBC and CTR tests.

Signed-off-by: Chris Frantz <cfrantz@google.com>
@cfrantz cfrantz force-pushed the ot-earlgrey-es-manifest branch from ec92e08 to 5a1668d Compare August 5, 2023 00:17
@cfrantz
Copy link
Contributor Author
cfrantz commented Aug 5, 2023

#3590 is a fix for the flash failures (unrelated to the OT bump). These tests previously used to run in the order listed (rust update?). Which is required, since we lock some regions for testing access permissions.

I've pulled the commit from #3590 into this PR.

It appears the AES test failures are also related to test ordering and I've added a commit that sets their ordering so the work. See my commit message to understand the ordering.

I can successfully run all of the tests on a CW310 FPGA board with the Earlgrey-M2.5.2-RC0 bitstream.

@twilfredo
Copy link
Contributor
twilfredo commented Aug 7, 2023

#3590 is a fix for the flash failures (unrelated to the OT bump). These tests previously used to run in the order listed (rust update?). Which is required, since we lock some regions for testing access permissions.

I've pulled the commit from #3590 into this PR.

It appears the AES test failures are also related to test ordering and I've added a commit that sets their ordering so the work. See my commit message to understand the ordering.

I can successfully run all of the tests on a CW310 FPGA board with the Earlgrey-M2.5.2-RC0 bitstream.

Nice! I haven't tested loading an OTBN binary (needs Vivado 2020.2) installed to generate a loadable elf. Did your tests include otbn/rsa testing aswell? If the binary was not loaded, the test suite will skip these tests with "Unable to find otbn-rsa, disabling RSA support".

@cfrantz
Copy link
Contributor Author
cfrantz commented Aug 7, 2023

Nice! I haven't tested loading an OTBN binary (needs Vivado 2020.2) installed to generate a loadable elf. Did your tests include otbn/rsa testing aswell? If the binary was not loaded, the test suite will skip these tests with "Unable to find otbn-rsa, disabling RSA support".

The instructions in the the README for creating the rsa.tbf are incorrect and do not work for me.

  1. You don't need Vivado. You need the free "Vivado Lab Tools".
  2. You don't even need "Vivado Lab Tools". The listed target //sw/device/tests:otbn_rsa_test is not the label of the rule that creates the then later listed .../bin/sw/otbn/crypto/rsa.elf. That binary is created by the rule //sw/otbn/crypto:rsa.
  3. When creating the TBF file, elf2tab (version 0.11.0) errors with:
    Creating "./bazel-out/k8-fastbuild-ST-2cc462681f62/bin/sw/otbn/crypto/rsa.tbf"
    Min RAM size from segments in ELF: 1600 bytes
    Number of writeable flash regions: 0
    Kernel version: 2.0
    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidInput, error: "protected_region_size = 0 is too small for the TBF headers. Header size: 88" }', /home/cfrantz/.cargo/registry/src/github.com-1ecc6299db9ec823/elf2tab-0.11.0/src/main.rs:150:10
    
    This error can be fixed by adding --protected-region-size=88.
  4. If you load the created TBF (make APP=<file> test-hardware), the test panics with Err(SIZE). The calculation of the size in otbn.rs causes an underflow, resulting in a calculated app size of -104 bytes (which is expressed as a u32, so its a rather huge value).
    It appears that the use of dmem_length in the calculation is the problem. I don't think the dmem length has any presence in the created TBF file - it's .bss after all (ie: objdump the generated rsa.elf).
  5. After correcting the length calculation, the test fails in rsa.rs.

I'm no expert in the design and execution of this particular test. I don't even know if the interface between the //sw/otbn/crypto:rsa program is the same now as when this test was originally written.

I would prefer to mark this test as broken for now and get Tock updated to the Earlgrey_ES hardware revision and fix this test later (or more precisely: have someone familiar with the proper operation of this test investigate the failure and fix it).

Copy link
Contributor
@bradjc bradjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI passes, what is the state of merging this?

@cfrantz
Copy link
Contributor Author
cfrantz commented Aug 8, 2023

CI passes, what is the state of merging this?

IMHO, I think this PR is ready to go.

  • The otbn test fails, but the nature of the problem is unclear to me and I think someone familiar with that test should investigate and send a fix as a follow-up PR.
  • The AES and flash tests are now working, but they probably could use some work. Currently, those tests rely on a specific test ordering to pass. That should be fixed in a follow-up PR.

Getting this PR submitted will clear the way for #3587 and #3597 as well as some additional auto-generated hardware constants work I'd like to submit.

@alistair23 @twilfredo WDYT?

@alistair23
Copy link
Contributor
alistair23 commented Aug 8, 2023

I really don't like merging something that breaks the unit tests. They are really useful for testing changes and catching regressions in Tock.

I think in this case it's probably worth going ahead and merging this though. We can update the OTBN/RSA unit tests next. Catching up to the tapeout release is pretty cool!

bradjc
bradjc approved these changes Aug 8, 2023
@bradjc bradjc added this pull request to the merge queue Aug 8, 2023
Merged via the queue into tock:master with commit 4e04bfd Aug 8, 2023
@twilfredo
Copy link
Contributor

Nice! I haven't tested loading an OTBN binary (needs Vivado 2020.2) installed to generate a loadable elf. Did your tests include otbn/rsa testing aswell? If the binary was not loaded, the test suite will skip these tests with "Unable to find otbn-rsa, disabling RSA support".

The instructions in the the README for creating the rsa.tbf are incorrect and do not work for me.

1. You don't need Vivado.  You need the free "Vivado Lab Tools".

2. You don't even need "Vivado Lab Tools".  The listed target `//sw/device/tests:otbn_rsa_test` is not the label of the rule that creates the then later listed `.../bin/sw/otbn/crypto/rsa.elf`.  That binary is created by the rule `//sw/otbn/crypto:rsa`.

Ah right didn't know that, the tests probably just invoked this then. Cheers.

3. When creating the TBF file, `elf2tab` (version 0.11.0) errors with:
   ```
   Creating "./bazel-out/k8-fastbuild-ST-2cc462681f62/bin/sw/otbn/crypto/rsa.tbf"
   Min RAM size from segments in ELF: 1600 bytes
   Number of writeable flash regions: 0
   Kernel version: 2.0
   thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidInput, error: "protected_region_size = 0 is too small for the TBF headers. Header size: 88" }', /home/cfrantz/.cargo/registry/src/github.com-1ecc6299db9ec823/elf2tab-0.11.0/src/main.rs:150:10
   ```
   
   
       
         
       
   
         
       
   
       
     
   This error can be fixed by adding `--protected-region-size=88`.

4. If you load the created TBF (`make APP=<file> test-hardware`), the test panics with `Err(SIZE)`.  The calculation of the size in [otbn.rs](https://github.com/tock/tock/blob/master/boards/opentitan/src/otbn.rs#L166) causes an underflow, resulting in a calculated app size of `-104` bytes (which is expressed as a u32, so its a rather huge value).
   It _appears_ that the use of `dmem_length` in the calculation is the problem.  I don't think the dmem length has any presence in the created TBF file - it's `.bss` after all (ie: `objdump` the generated `rsa.elf`).

5. After correcting the length calculation, the test fails in [rsa.rs](https://github.com/tock/tock/blob/master/boards/opentitan/src/otbn.rs#L166).

I'm no expert in the design and execution of this particular test. I don't even know if the interface between the //sw/otbn/crypto:rsa program is the same now as when this test was originally written.

I would prefer to mark this test as broken for now and get Tock updated to the Earlgrey_ES hardware revision and fix this test later (or more precisely: have someone familiar with the proper operation of this test investigate the failure and fix it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WG-OpenTitan In the purview of the OpenTitan working group.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants
0