Add Eq, PartialEq, Serialize, Deserialize to Target #823

joshwlewis · 2024-05-16T14:50:17Z

Buildpacks (like heroku/buildpacks-go) may wish to store Target in layer metadata, so that caches may be invalidated when Target os, arch, distro, etc. changes. To store and retrieve Target data in toml, we need Deserialize and Serialize. To detect changes between two targets, PartialEq and Eq are also implemented.

Signed-off-by: Josh W Lewis <josh.lewis@salesforce.com>

Malax · 2024-05-16T17:09:08Z

If we want to implement Serialize and Deserialize, we probably should at least add some extra serde attributes for the optional field to avoid serialising them when they're None like we do for libcnb-data types. My gut feeling is that adding both implementations is a little weird, but I see the practical use for them.

What are your thoughts about the additional attributes @joshwlewis?

edmorley · 2024-05-16T17:14:34Z

I chose not to use Target directly in the layer metadata for Python, since:
(a) leaking the internal libcnb types into the layer metadata means having to handle forwards-backwards compat for layer metadata migration if the libcnb type were ever to change
(b) it felt more explicit to include the precise fields needed for invalidation into the layer metadata rather than putting everything in there by default. for example, some layers might not be distro or arch-specific

edmorley · 2024-05-16T17:18:41Z

Fwiw I'm fine with making Target serialisable if we think it's reasonable to use in some cases - I just wondered if there's a potential "what is the best practice here and should libcnb be opinionated about it or not?" discussion to be had? :-)

schneems · 2024-05-16T18:19:48Z

Short: I'm leaning towards not wanting Serialize and Deserialize implemented. I think Clone, Debug, Eq and PartialEq make sense as mechanical properties. My worry with allowing these to be serialized directly is that maintainers must be aware that changes to that struct can cause cache misses and users would get no compile errors or warnings and would have to check the changelog. We even recently changed the format of Target #821.

Long:
Originally, I made a smaller alias of Target called TargetId, which I also made Serialize and Deserialize. After some discussion with Rune I ended up moving to the model that Ed used with Python. I support both of Ed's reasons why, but wanted to add another:

Ownership over the changed message i.e. "CPU architecture changed from X to Y" instead of "Target changed: X-Y-Z-Q to X2-Y2-Z2-Q2" is less nice.

I've also experimented a bit with auto-generating the "changed" warning messages based on the serialized key https://github.com/heroku/buildpacks-ruby/pull/203/files#diff-1cedf1bfbafd5ca2b85e8261de347a2318ab3ea87abe49eeff117fb3f380faa9R150-R154. I think that approach works best when the desired key/value is stored at the top level instead of nested within a struct.

Another benefit to storing the data needed directly at the top level is migrating between metadata versions is fairly easy https://github.com/schneems/magic_migrate. However, if you need to start supporting multiple sub-migrations of associated structs, the picture gets much murkier.

Those are all from the buildpack maintainer angle. From the libcnb maintainer angle, I don't feel like Target is fully "baked" and will likely see some iteration/movement in the future. It feels like an action at a distance to have modifications to this struct trigger an unknown amount of cache clearing in libcnb user's apps.

joshwlewis · 2024-05-16T19:24:49Z

we probably should at least add some extra serde attributes for the optional field to avoid serialising them when they're None like we do for libcnb-data types.

Seems fine. Will do!

My gut feeling is that adding both implementations is a little weird, but I see the practical use for them.

I'm not following what's weird about both implementations. Care to elaborate?

leaking the internal libcnb types into the layer metadata means having to handle forwards-backwards compat for layer metadata migration if the libcnb type were ever to change

This is a strong argument. There's a fair chance the data format changes in the future, which would potentially invalidate otherwise valid caches. In the go use case, I don't care a whole lot though, since it's all intermediate compilation artifacts which are invalidated frequently anyway (go patch version changes, n number of uses).

it felt more explicit to include the precise fields needed for invalidation into the layer metadata rather than putting everything in there by default. for example, some layers might not be distro or arch-specific

I see the value of being explicit here, but does every layer in every buildpack need to reimplement this logic?

Ownership over the changed message i.e. "CPU architecture changed from X to Y" instead of "Target changed: X-Y-Z-Q to X2-Y2-Z2-Q2" is less nice.

The go buildpack doesn't explain why the build cache expired. It's supposed to be optional and ephemeral, so not a big deal.

In short, I think this makes sense for Go, and possibly other buildpacks that need wholesale invalidation on a target change. Do we want to force buildpack authors to write this themselves? My opinion is that this is a pretty low-cost convenience in some scenarios, and totally optional - buildpack authors can still choose their own cache expiration.

joshwlewis · 2024-05-16T21:46:02Z

we probably should at least add some extra serde attributes for the optional field to avoid serialising them when they're None like we do for libcnb-data types.

It looks like we really only do that for Vec<_> types elsewhere in libcnb-data. As it is in this PR arch_variant (which is Option<String>) doesn't get serialized when None, as expected.

joshwlewis · 2024-05-16T21:48:10Z

Given the resistance I'm seeing, I'll close this. It's not a big deal to handle target invalidation manually for one buildpack.

schneems · 2024-05-17T16:47:47Z

I see the value of being explicit here, but does every layer in every buildpack need to reimplement this logic?

Brainstorming another path forward:

Introduce a smaller struct with a more stable interface for instance DistroVersionArch where that is Serialize and Deserialize and can easily be built from a Target. In this example, we promise that this struct has 3 properties and will they will not change. Then if it needs to change, we introduce another struct and deprecate that one so people get a warning or compile error. This buys us the ability to change Target and makes it a little easier for buildpack maintainers who want to invalidate when the OS changes with as little code as possible.

I picked those three values because they're what Ruby and Python both needed.

If you really want it to be invalidated on ANY change to Target, the struct is Debug so you could make your metadata take a String and then store the Debug string of the Target. I wouldn't personally recommend it, but hey...it's technically possible.

Add Eq, PartialEq, Serialize, Deserialize to Target

a15a19b

Signed-off-by: Josh W Lewis <josh.lewis@salesforce.com>

joshwlewis force-pushed the jwl/serialize-deserialize-target branch from 78d788f to a15a19b Compare May 16, 2024 14:52

joshwlewis marked this pull request as ready for review May 16, 2024 14:54

joshwlewis requested a review from a team as a code owner May 16, 2024 14:54

joshwlewis closed this May 16, 2024

edmorley added enhancement New feature or request libcnb labels May 16, 2024

edmorley removed the request for review from a team May 17, 2024 06:56

edmorley deleted the jwl/serialize-deserialize-target branch June 18, 2024 11:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Eq, PartialEq, Serialize, Deserialize to Target #823

Add Eq, PartialEq, Serialize, Deserialize to Target #823

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Add Eq, PartialEq, Serialize, Deserialize to Target #823

Add Eq, PartialEq, Serialize, Deserialize to Target #823

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!