8000 IDL gen improvements by ivmarkov · Pull Request #233 · project-chip/rs-matter · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

IDL gen improvements #233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

ivmarkov
Copy link
Contributor
@ivmarkov ivmarkov commented Apr 20, 2025

This PR is finishing the work @andy31415 initiated ~ 1 year ago.
NOTE: It is on-top of PR #232 which needs to be merged first.

(
PR still in draft, because
(a) unit tests are not updated and fail Done
(b) idl.rs needs to be split as the code inside it grew considerably Done
)

The TL;DR is: we are now able to generate - from the IDL - artefacts for everything contained in a server cluster IDL definition!

Enums and (bit)sets

This were already available

Simple enums for attribute and command IDs (AttrributeId, CommandId and CommandResponseId)

Work was almost done, I just completed it

(NEW) "Thin" Structures

The generation of structures was incomplete as it couldn't handle nested structs, arrays, nullables and optionals.
The generation of structures was totally revamped though, as we now generate different types of structs than the original idea.
These two types of structs are generated regardless whether the struct is a general one, or a request, or a response one (so that we can support matter-controller/client scenarios in addition to matter-server, as we do now).

Structs for incoming data

Assuming a

FooStruct { 
    f1: u32,
    f2: BarStruct,
}

...struct in the IDL, a

#[repr(transparent)] #[derive(Debug, FromTLV, ToTLV)] FooStruct<'a>(TLVElement<'a>);

impl<'a> FooStruct<'a> {
    pub const fn new(element: TLVElement) -> Self { Self(element) }

    pub fn f1(&self) -> u32 {
        FromTLV::from_tlv(self.0.struct()?.ctx(0)?)
    }

    pub fn f2(&self) -> BarStruct {
        FromTLV::from_tlv(self.0.struct()?.ctx(1)?)
    }
}

... type is generated in Rust.

This type of struct has a minimal footprint (the size of a wide pointer in Rust) in that it does all parsing of its field members "on the fly", when the method of the struct modeling the relevant field member is called by the user.

In other words, we trade extra computation for less memory footprint.
In future - and if deemed necessary - these generated structs might get an extra expand method that returns an "expanded" traditional FromTLV struct which contains actual fields (but has a much bigger memory footprint).

Structs for outgoing data (builders)

Assuming the IDL FooStruct from above, the proc-macro will generate a FooStructBuilder as well.
This type is just a builder pattern for the corresponding struct, where the struct can be created incrementally, by writing its members into a TLVWrite trait instance in a "type-safe" way.

This is also done as a way to consume minimal memory, as the Builder is just a wide pointer as well (a mut writer reference) and the writer writes directly into the outgoing TX packet.

This and the previous pattern can be observed "in action" in the cluster_basic_information, general_commissioning, cluster_on_off and a few others which were only partially converted to IDL-generated code, but are now completely based on IDL-generated code.

(NEW) CLUSTER cluster Meta-Data

The static CLUSTER: Cluster<'static> = Cluster {} meta-data describing the cluster attributes and commands is now completely auto-generated as well. Used by the framework for path expansion, permissions checking and others.

(NEW) Cluster-specific Handlers

While we have general sync and async cluster handlers in the form:

/// A version of the `AsyncHandler` trait that never awaits any operation.
///
/// Prefer this trait when implementing handlers that are known to be non-blocking.
pub trait Handler {
    fn read(
        &self,
        exchange: &Exchange,
        attr: &AttrDetails,
        encoder: AttrDataEncoder,
    ) -> Result<(), Error>;

    fn write(
        &self,
        _exchange: &Exchange,
        _attr: &AttrDetails,
        _data: AttrData,
    ) -> Result<(), Error> {
        Err(ErrorCode::AttributeNotFound.into())
    }

    fn invoke(
        &self,
        _exchange: &Exchange,
        _cmd: &CmdDetails,
        _data: &TLVElement,
        _encoder: CmdDataEncoder,
    ) -> Result<(), Error> {
        Err(ErrorCode::CommandNotFound.into())
    }
}

...these are obviously very generic in that their layout is not tailored to the concrete cluster which the user needs to implement, with its concrete attributes and commands.

So now, the proc-macro generates cluster-specific handlers which are much more straight-forward in terms of implementation.

Here's the implementation of the General Commissioning cluster, using the auto-generated GeneralCommissioningHandler:

#[derive(Clone)]
pub struct GenCommCluster<'a> {
    dataver: Dataver,
    commissioning_policy: &'a dyn CommissioningPolicy,
}

impl<'a> GenCommCluster<'a> {
    pub const fn new(dataver: Dataver, commissioning_policy: &'a dyn CommissioningPolicy) -> Self {
        Self {
            dataver,
            commissioning_policy,
        }
    }
}

impl GeneralCommissioningHandler for GenCommCluster<'_> {
    fn dataver(&self) -> u32 {
        self.dataver.get()
    }

    fn dataver_changed(&self) {
        self.dataver.changed();
    }

    fn breadcrumb(&self, _exchange: &Exchange<'_>) -> Result<u64, Error> {
        Ok(0) // TODO
    }

    fn set_breadcrumb(&self, _exchange: &Exchange<'_>, _value: u64) -> Result<(), Error> {
        Ok(()) // TODO
    }

    fn basic_commissioning_info<P: TLVBuilderParent>(
        &self,
        _exchange: &Exchange<'_>,
        builder: BasicCommissioningInfoBuilder<P>,
    ) -> Result<P, Error> {
        builder
            .fail_safe_expiry_length_seconds(self.commissioning_policy.failsafe_expiry_len_secs())?
            .max_cumulative_failsafe_seconds(self.commissioning_policy.failsafe_max_cml_secs())?
            .finish()
    }

    fn regulatory_config(
        &self,
        _exchange: &Exchange<'_>,
    ) -> Result<RegulatoryLocationTypeEnum, Error> {
        Ok(RegulatoryLocationTypeEnum::IndoorOutdoor)
    }

    fn location_capability(
        &self,
        _exchange: &Exchange<'_>,
    ) -> Result<RegulatoryLocationTypeEnum, Error> {
        Ok(self.commissioning_policy.location_cap())
    }

    fn supports_concurrent_connection(&self, _exchange: &Exchange<'_>) -> Result<bool, Error> {
        Ok(self.commissioning_policy.concurrent_connection_supported())
    }

    fn handle_arm_fail_safe<P: TLVBuilderParent>(
        &self,
        exchange: &Exchange<'_>,
        request: ArmFailSafeRequest,
        response: ArmFailSafeResponseBuilder<P>,
    ) -> Result<P, Error> {
        let status = CommissioningErrorEnum::map(exchange.with_session(|sess| {
            exchange
                .matter()
                .failsafe
                .borrow_mut()
                .arm(request.expiry_length_seconds()?, sess.get_session_mode())
        }))?;

        response.error_code(status)?.debug_text("")?.finish()
    }

    fn handle_set_regulatory_config<P: TLVBuilderParent>(
        &self,
        _exchange: &Exchange<'_>,
        _request: SetRegulatoryConfigRequest,
        response: SetRegulatoryConfigResponseBuilder<P>,
    ) -> Result<P, Error> {
        // TODO
        response
            .error_code(CommissioningErrorEnum::OK)?
            .debug_text("")?
            .finish()
    }

    fn handle_commissioning_complete<P: TLVBuilderParent>(
        &self,
        exchange: &Exchange<'_>,
        response: CommissioningCompleteResponseBuilder<P>,
    ) -> Result<P, Error> {
        let status = CommissioningErrorEnum::map(exchange.with_session(|sess| {
            exchange
                .matter()
                .failsafe
                .borrow_mut()
                .disarm(sess.get_session_mode())
        }))?;

        if matches!(status, CommissioningErrorEnum::OK) {
            // As per section 5.5 of the Matter Core Spec V1.3 we have to teriminate the PASE session
            // upon completion of commissioning
            exchange
                .matter()
                .pase_mgr
                .borrow_mut()
                .disable_pase_session(&exchange.matter().transport_mgr.mdns)?;
        }

        response.error_code(status)?.debug_text("")?.finish()
    }
}

Also note how the methods corresponding to commands and attributes take our minimal TLVElement-wrapping structs as input, and expect *Builder structs as output.

Additionally, the proc-macro generates an adaptor struct, that adapts e.g. a GeneralCommissioningHandler trait implementation to the generic Handler implementation from above, that the rs-matter IM/DM layer understands.

Why is the changeset so big?

The changes outside of rs-matter-macros-impl are actually small and relatively incremental. They are everywhere though, as I had to touch several cross-cutting aspects:

Most info! logging lowered to debug!

Outputting rs-matter packets should not be with level info!, as info! should be the default mode of operation of rs-matter. Moreover, we now log even more stuff - as in, the proc-macro-generated cluster-adaptors are logging their attribute and command input data, as well as the output data, which is very handy, but makes the logging even more verbose. This is all now debug! logging, as it should be.

Macro-renames: cluster_attrs and supported_attributes combined and renamed to attributes!, supported_commands! / generated_commands! combined and renamed to just commands!

The layout of the Cluster meta-data struct was changed slightly to accommodate the new Command type (similar to the existing Attribute. We need it so that we can describe the required access for each command, which was a long-standing TODO.

As a result, it made sense to simplify and unify the existing macros.

Handler / AsyncHandler traits' methods simplified with ReadContext, WriteContext and InvokeContext

The method signatures of our handlers (this is valid for the new proc-macro-generated handlers too) were becoming a bit unwieldy with too many parameters.

The above *Context structures combine what used to be separate method parameters into a single structure.

Notifying subscribes on an attribute change from within the cluster handler now works

Turns out, this wasn't possible before, because our handlers simply did not have access to the Subscriptions struct. Now possible and one more reason for the *Context variables from above.

Examples streamlined; speaker example resurrected

The speaker example (MediaPlayback cluster impl) was sitting commented out for > 1 year, as I was not so sure we have to "manually" implement "helper" clusters for this and that, as the example did.

With the new proc-macro this is now not necessary. The speaker example showcases how users can use the import! proc-macro in their own code, outside of rs-matter so as to get all definitions for a cluster, and implement the cluster-specific handler.

Also, the examples are streamlined in their own examples project now. They are also simplified in that we do not need to talk/document the no_std use cases, as these are taken over by rs-matter-embassy.

@ivmarkov
Copy link
Contributor Author

Ouch. Trying to parse the UnitTesting cluster revealed that the IDL parser is broken logically with regards to parsing whitespace, as it does not really - at least at a few places I found - distinguish between significant and insignificant whitespace.

Whitespace is significant after identifier and numeric tokens in that if the following token is not a punctuation / operation token (i.e. a /*, /** or // comment or a ",;<>[]="), it needs to have at least one whitespace character before proceeding to the next token, which might itself be an identifier or a numeric token.

The net result is that the parser currently parses

attribute NullablesAndOptionalsStruct listNullablesAndOptionalsStruct[] = 35;

as

attribute nullable sAndOptionalsStruct listNullablesAndOptionalsStruct[] = 35;

Usually, having a separate lexer/tokenizer that works on the character level and emits a higher-level Token enum helps handle that problem, and also hides the notion of "whitespace" from the actual parser completely. Which in turns helps simplifying the actual parser, as it does not need to "think" anymore where whitespace is optional (insignificant) versus significant. As it operates on lexer tokens rather than on raw character level strings.

Alas, our parser does not have a formal lexer, so whitespace0 / whitespace1 is everywhere, and is easy to get wrong.
I'll try to fix it as-is but yeah.

@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch from 045300f to c017b37 Compare April 22, 2025 07:42
@ivmarkov
Copy link
Contributor Author
ivmarkov commented Apr 22, 2025

OK.

The whitespace bug is now fixed in commit 6c46c6a by replacing all tag_no_case calls which were so far used to grab a keyword, with a dedicated keyword parser that operates in a greedy manner:

  • It first calls parse_id so as to (potentially) grab the complete following alphanumeric string (if any)
  • Then it compares the grabbed alphanumeric string with the expected keyword, and fails (and returns the original span) if the grabbed alphanumeric does not match the expected keyword

The change is small but looks large as I had to regexp almost all usages of the non-greedy tag_no_case with keyword.

I'm however now having second thoughts (sorry Andrey!) if using the nom thing in the first pace was a good idea.

Anyway, I've captured my thoughts here: #234

Still, to be also checked if parsing the IDL is the thing dominating the proc-macro. It might well be that the code-generation is not very efficient as well, given how big/comprehensive it had become, or it might be that rustc compiling the generated code is not very efficient either.

@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch 3 times, most recently from 6ca5baa to ee00ba4 Compare April 24, 2025 08:42
@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch from 2599b39 to 03daa3a Compare April 27, 2025 12:58
@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch from 03daa3a to 6178f15 Compare April 27, 2025 15:08
@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch from 624820b to ffedf69 Compare April 28, 2025 11:02
@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch from ffedf69 to 4dbc9bd Compare April 28, 2025 11:18
@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch 2 times, most recently from bb36543 to b14adbb Compare May 2, 2025 07:03
@ivmarkov ivmarkov force-pushed the idl-gen-improvements branch from b14adbb to 88bf6d8 Compare May 2, 2025 07:10
@ivmarkov ivmarkov marked this pull request as ready for review May 3, 2025 11:22
@ivmarkov ivmarkov requested a review from kedars May 3, 2025 11:23
@ivmarkov
Copy link
Contributor Author
ivmarkov commented May 3, 2025

@kedars This is now ready for review.

I'm sorry for the relatively large changeset outside of rs-matter-macros-impl and inside rs-matter, but it was necessary if we wanted to have ergonomic auto-generated cluster handlers. The rs-matter changeset is also described in the PR summary.

Net-net I'm quite happy though.

Look at the resurrected speaker example. Its handler is:

  • Memory efficient (input data is lazily deserialized from TLV and output data is generated incrementally using strongly-typed builders). Minimal stack usage which is crucial for MCUs
  • Strongly typed - a separate strongly-typed method for each attribute read, attribute write and command invocation
  • Has built-in logging of input/output data which can be seen with RUST_LOG=debug

Note that there will be three additional PRs upcoming (as this one grew quite large anyway):

  • PR 1 - convert ALL of the rs-matter system clusters to import! rather than manually implementing cluster handlers and cluster metadata. In the current PR I only converted the "BasicInformation" cluster, and finished the half-converted Gen comm and Eth diag clusters
  • PR 2 - fold rs-matter-macros-impl and rs-matter-data-model back into rs-matter-macros. Originally these were split as separate crates as Andrey complained unit testing does not work for proc-macro crates (rs-matter-macros) but I think I found a way to make it work. This way we'll publish on crates.io just two crates, which is simpler for us and the users
  • PR 3 - refactor-rename a bunch modules which addresses Come up with and then apply a more consistent naming convention #173

... and then of course resuming the work on C++ Python / XML tests, which was the reason why this work was initiated originally. As in, implementing the UnitTesting cluster manually is very difficult and not worth it.

@ivmarkov
Copy link
Contributor Author
ivmarkov commented May 3, 2025

@kedars One more note:

if you look at the speaker example, the import!ed cluster-handlers have method signatures for each command defined in the cluster.

Since the commands (unlike attributes) are not marked as "optional" in the IDL, we cannot really provide default "Unimplemented command" impl for "optional" commands which the user might not want to implement in its cluster handler. Therefore currently methods for ALL commands are generated and the user is expected to throw an "Unimplemented command" error in the command handlers which are not enumerated as supported in the cluster meta-data.

An alternative to the above would be to provide default method impls throwing "Unimplemented command" for ALL cluster commands, and then the user can override only those methods, which are for commands its cluster implements. This would cut down the verbosity of the cluster impl a bit, but I'm unsure about that. What do you think?

An even third option would be to enhance the import! macro a bit so that the user can provide a list of to-be implemented attributes / commands (and a cluster revision) and then the cluster handler and its cluster meta-data would be generated to contain only the user-specified attributes/commands. I'm unsure about that though as it would complicate the proc-macro generator even more. Perhaps, we can introduce this at a later stage, if deemed necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant
0