doc: proposal for metadata discovery #4761

tzaeschke · 2025-05-02T11:23:05Z

This design document proposes an approach to improve the availability of path metadata.
Initial discussion (very short): #4742

jiceatscion · 2025-05-02T11:23:12Z

jiceatscion

Reviewed 1 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @tzaeschke)

doc/dev/design/metadata-discovery.rst line 60 at r2 (raw file):

The ``staticInfoConfig.json`` may remain the primary way to store and exchange
metadata info. However, even if it would be replaced by APIs (e.g. gRPC calls),
auto-reading the file is useful when the file is edited manually.

I would tend to react with "here comes trouble". If an API can update the data, then do these updates also result in writing to the file? What if someone had just modified it? Are the changes lost in the collision? I think it might be best to choose whether the file is for persistence, transmission, or input.

Code quote:

auto-reading the file is useful when the file is edited manually.

doc/dev/design/metadata-discovery.rst line 75 at r2 (raw file):

  ``"<ISD-AS> : All data autogenerated by Software ABC v1.42"``.
* Trigger collection of metadata or directly collect it.
* Store update the ``staticInfoConfig.json`` file with recent metadata.

"Store updates to the"?

Code quote:

Store update the

doc/dev/design/metadata-discovery.rst line 81 at r2 (raw file):

  ``staticInfoConfig.json`` file or maybe additionally via an API.
* Detect changes to the topology (new links, border routers, ...). If a change is
  detected, administrators could be notified and/or the metadata could be

At the scale of a Tier 1 operator, that might not be realistic. Unless that's for data that changes extremely rarely.

Code quote:

administrators could be notified

doc/dev/design/metadata-discovery.rst line 82 at r2 (raw file):

* Detect changes to the topology (new links, border routers, ...). If a change is
  detected, administrators could be notified and/or the metadata could be
  adapted automatically (add detected data or remove obsolete data).

adopted?

Code quote:

adapted

doc/dev/design/metadata-discovery.rst line 121 at r2 (raw file):

by administrators to monitor metadata and edit non-measurable metadata
(notes, addresses, more accurate geolocation, ...). However, this is optional
and can be done by monitoring or manually editing the ``staticInfoConfig.json`` file.

If that file is used to convey manual overrides, you would have to define (possibly in another file) which items are manually overridden and so must not be updated from the sensor values.

Code quote:

``staticInfoConfig.json``

doc/dev/design/metadata-discovery.rst line 143 at r2 (raw file):

location address, may be difficult or impossible to detect automatically.
We would need to find sensible default values that ideally indicate that the
data was auto generated.

Not sure I understand what you mean: if these defaults are found, does it not mean that the data was not generated at all?

Code quote:

data was auto generated.

doc/dev/design/metadata-discovery.rst line 162 at r2 (raw file):

  what metadata the control service is using. If the metadata service
  is separate, it could only report what was communicated to the CS, not
  what the CS is actually using.

I think that distinction is somewhat fictitious. The two things can live in the same server and still fail to cooperate. So, monitoring what the CS is using would always be needed, independently of what the MS is doing.

Code quote:

* When a remote monitoring API is implemented, it can monitor directly
  what metadata the control service is using. If the metadata service
  is separate, it could only report what was communicated to the CS, not
  what the CS is actually using.

tzaeschke

Reviewable status: 1 of 2 files reviewed, 7 unresolved discussions (waiting on @jiceatscion)

doc/dev/design/metadata-discovery.rst line 60 at r2 (raw file):

Previously, jiceatscion wrote…

I would tend to react with "here comes trouble". If an API can update the data, then do these updates also result in writing to the file? What if someone had just modified it? Are the changes lost in the collision? I think it might be best to choose whether the file is for persistence, transmission, or input.

Yes, that may cause problems. However, they could also use the API in parallel, or they could work on the config file in parallel (one editing it, another copying over it shortly afterwards with a locally edited copy).

Not sure how to protect against uncoordinated admins.

However, this is just a proposal, we could also do away with reparsing an updated file? It really depends on whether we think admins prefer editing a file or using an API.

Thoughts?

doc/dev/design/metadata-discovery.rst line 75 at r2 (raw file):

Previously, jiceatscion wrote…

"Store updates to the"?

Thanks, fixed.

doc/dev/design/metadata-discovery.rst line 81 at r2 (raw file):

Previously, jiceatscion wrote…

At the scale of a Tier 1 operator, that might not be realistic. Unless that's for data that changes extremely rarely.

I was primarily thinking about detecting conflicts in the topology file vs the staticInfoConfig file. I clarified this.
I assume changes should be rare, not sure how often an AS adds links or border routers?
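
A minimal sketch of such a conflict check, assuming simplified inputs: interface IDs are read from the `border_routers` / `interfaces` entries of a topology document, and the metadata document keys its per-interface sections by interface ID. The function name `find_conflicts` and the exact layout of both documents are assumptions for illustration, not the actual schema.

```python
def find_conflicts(topology: dict, static_info: dict) -> dict:
    """Compare interface IDs known to the topology with those used in metadata.

    Both input layouts are assumed for this sketch.
    """
    # Interface IDs declared by any border router in the topology.
    topo_ifs = {
        if_id
        for br in topology.get("border_routers", {}).values()
        for if_id in br.get("interfaces", {})
    }
    # Interface IDs referenced by any per-interface metadata section.
    meta_ifs = set()
    for section in static_info.values():
        if isinstance(section, dict):  # skip scalar entries such as a note
            meta_ifs.update(section.keys())
    return {
        "missing_metadata": sorted(topo_ifs - meta_ifs),   # new links
        "obsolete_metadata": sorted(meta_ifs - topo_ifs),  # removed links
    }


topology = {"border_routers": {"br-1": {"interfaces": {"1": {}, "2": {}}}}}
static_info = {
    "Latency": {"1": {"Inter": "30ms"}, "3": {"Inter": "20ms"}},
    "Note": "example",
}
conflicts = find_conflicts(topology, static_info)
print(conflicts)
```

Whether the result is used to notify an administrator or to adapt the metadata automatically is a separate policy decision that this sketch leaves open.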

doc/dev/design/metadata-discovery.rst line 82 at r2 (raw file):

Previously, jiceatscion wrote…

adopted?

I think I mean adapted, the metadata is adapted to fit the modified topology...

doc/dev/design/metadata-discovery.rst line 121 at r2 (raw file):

Previously, jiceatscion wrote…

If that file is used to convey manual overrides, you would have to define (possibly in another file) which items are manually overridden and so must not be updated from the sensor values.

My idea was as follows, but I realize now it is broken, so just for reference: Overriding was not the intention. With "non-measurable data" I meant data that the control server cannot get anywhere else but from what the admin provides. I see that this is a bit problematic because for some data the source of truth is the control server (measured data: latency, hop count), while for other data (addresses, notes) it is whatever the admin says.

It is broken because some data needs to be overridden. For example, the geolocation (long/lat) can be derived from a geolocation service but should obviously be overridable. It may also be desirable to override other measurements, e.g. latency.

Request for discussion: How should we handle this?

We could, as you suggest, specify which attributes have manual values and which may contain measurements.
- At which granularity? Is it sufficient to have a section that specifies overridability per data type? E.g. override_geolocation = true; override_latency = false?
- Should we add an optional flag to each value, e.g. override=true?
- As you suggested, should we put override info into a separate file? I assume you mean the actual values, not a flag that indicates overrides?

Suggestions?

doc/dev/design/metadata-discovery.rst line 143 at r2 (raw file):

Previously, jiceatscion wrote…

Not sure I understand what you mean: if these defaults are found, does it not mean that the data was not generated at all?

I'm not sure I understand. I mean that if the default values are found it means that the data was (auto) generated.
The default values are all values that are unlikely to occur due to measurements. For non-measurable text values,
the text could just contain "(auto generated)".

Maybe I misunderstand the question?

doc/dev/design/metadata-discovery.rst line 162 at r2 (raw file):

Previously, jiceatscion wrote…

I think that distinction is somewhat fictitious. The two things can live in the same server and still fail to cooperate. So, monitoring what the CS is using would always be needed, independently of what the MS is doing.

Yes, but if the metadata service is in the control service, we wouldn't need a separate API because they are the same.
Maybe the misunderstanding is that with "separate" I mean "separate process"?

tzaeschke

Thanks for the feedback! :)

Reviewable status: 1 of 2 files reviewed, 7 unresolved discussions (waiting on @jiceatscion)

jiceatscion

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @tzaeschke)

doc/dev/design/metadata-discovery.rst line 60 at r2 (raw file):

Previously, tzaeschke (Tilmann) wrote…

Yes, that may cause problems. However, they could also use the API in parallel, or they could work on the config file in parallel (one editing it, another copying over it shortly afterwards with a locally edited copy).

Not sure how to protect against uncoordinated admins.

However, this is just a proposal, we could also do away with reparsing an updated file? It really depends on whether we think admins prefer editing a file or using an API.

Thoughts?

Ok. It makes sense to persist the configuration (after API-driven changes) into a human-readable file. That being the case, there's no legitimate way to prevent someone from editing it while the server is down. So, I guess we shouldn't go out of our way to prevent that. What I would do, however, is make it clear (through the docs, and in a heading comment in the file) that the file is only read at startup and is liable to be overwritten as long as the config API is active.

doc/dev/design/metadata-discovery.rst line 121 at r2 (raw file):

Previously, tzaeschke (Tilmann) wrote…

My idea was as follows, but I realize now it is broken, so just for reference: Overriding was not the intention. With "non-measurable data" I meant data that the control server cannot get anywhere else but from what the admin provides. I see that this is a bit problematic because for some data the source of truth is the control server (measured data: latency, hop count), while for other data (addresses, notes) it is whatever the admin says.

It is broken because some data needs to be overridden. For example, the geolocation (long/lat) can be derived from a geolocation service but should obviously be overridable. It may also be desirable to override other measurements, e.g. latency.

Request for discussion: How should we handle this?

We could, as you suggest, specify which attributes have manual values and which may contain measurements.

At which granularity? Is it sufficient to have a section that specifies overridability per data type? E.g. override_geolocation = true; override_latency = false?

Should we add an optional flag to each value, e.g. override=true?

As you suggested, should we put override info into a separate file? I assume you mean the actual values, not a flag that indicates overrides?

Suggestions?

What we could do is have configuration keys dedicated to overrides. For example, geolocation = and geolocation_override = . If the latter is present, then it applies and it cannot be auto-modified. We could define those keys for cases that make sense, and not for those that are only editable defaults, i.e. those that must be modifiable at run-time. If we find some third case (I don't know, something like an override that can be ignored in some cases), we can come up with some other suffix like "_preferred" for example. ...Aaanyway, it's minor stuff. We can figure it out as we go.
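
A minimal sketch of how such dedicated override keys could be resolved, assuming a flat key/value configuration. The helper names `effective_value` and `apply_measurement`, the flat layout, and the sample coordinates are hypothetical and only serve the illustration.

```python
OVERRIDE_SUFFIX = "_override"  # suffix taken from the key naming suggested above


def effective_value(config: dict, key: str):
    """Return the manual override if one is present, else the detected value."""
    override_key = key + OVERRIDE_SUFFIX
    if override_key in config:
        return config[override_key]
    return config.get(key)


def apply_measurement(config: dict, key: str, measured) -> None:
    """Store a new auto-detected value without ever touching an override key."""
    if key.endswith(OVERRIDE_SUFFIX):
        raise ValueError("measurements must not write override keys")
    config[key] = measured


config = {
    "geolocation": "47.4,8.5",                 # auto-detected (hypothetical value)
    "geolocation_override": "47.3769,8.5417",  # set by an admin (hypothetical value)
}
apply_measurement(config, "geolocation", "47.5,8.6")
print(effective_value(config, "geolocation"))  # prints 47.3769,8.5417
```

With this shape, the measured value keeps being refreshed in the background, so removing the override key later falls back to the most recent measurement.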

doc/dev/design/metadata-discovery.rst line 143 at r2 (raw file):

Previously, tzaeschke (Tilmann) wrote…

I'm not sure I understand. I mean that if the default values are found it means that the data was (auto) generated.
The default values are all values that are unlikely to occur due to measurements. For non-measurable text values,
the text could just contain "(auto generated)".

Maybe I misunderstand the question?

Well, it's hard for me to point out what is wrong with the language (or with me). I just do not understand what you're trying to say here. It seems contradictory. How could some data have been autogenerated if it cannot be calculated? I think what you are trying to say is that the field should say something to the effect that the data is meant to be autogenerated but has not been determined yet. Is that what you meant? At least that is what your example reflects; that, I understand.

doc/dev/design/metadata-discovery.rst line 162 at r2 (raw file):

Previously, tzaeschke (Tilmann) wrote…

Yes, but if the metadata service is in the control service, we wouldn't need a separate API because they are the same.
Maybe the misunderstanding is that with "separate" I mean "separate process"?

Well, I guess the API separation or not is the real concern. It seems to make a great deal more sense for a segment metadata API to be part of the segment API, rather than separate. That would be the main reason to put it all in the same bag. Adding more functionality to the CS is sort of a downside. It's already big. But hey, if it's for a good reason; so be it.

doc/dev/design/metadata-discovery.rst line 18 at r3 (raw file):

We believe that this is an important next step because metadata is essential
We believe that this is an important next step because metadata is essential

echo

tzaeschke

Reviewable status: 1 of 2 files reviewed, 4 unresolved discussions (waiting on @jiceatscion)

doc/dev/design/metadata-discovery.rst line 60 at r2 (raw file):

Previously, jiceatscion wrote…

Ok. It makes sense to persist the configuration (after API-driven changes) into a human-readable file. That being the case, there's no legitimate way to prevent someone from editing it while the server is down. So, I guess we shouldn't go out of our way to prevent that. What I would do, however, is make it clear (through the docs, and in a heading comment in the file) that the file is only read at startup and is liable to be overwritten as long as the config API is active.

I added a sentence to that effect in the section Management API:
If we decide to have a remote monitoring API, in order to avoid concurrency issues we should probably remove the runtime reparsing of the file. Reparsing of the file would thus be an interim solution until the management API is available. At that point, the file should only be parsed at startup of the metadata service.
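
A minimal sketch of the "parse only at startup, persist after API changes" behaviour, assuming a plain JSON file. The class `MetadataStore` and its methods are hypothetical stand-ins for whatever the metadata service would expose; writing through a temporary file followed by `os.replace` is one common way to avoid leaving a half-written file behind, not something the proposal prescribes.

```python
import json
import os
import tempfile


class MetadataStore:
    """Holds metadata in memory; the file is read once and rewritten on updates."""

    def __init__(self, path: str):
        self._path = path
        with open(path, encoding="utf-8") as f:
            self._data = json.load(f)  # the only time the file is parsed

    def get(self, key):
        return self._data.get(key)

    def update(self, key, value) -> None:
        """Entry point for API-driven changes: update memory, then persist."""
        self._data[key] = value
        self._persist()

    def _persist(self) -> None:
        directory = os.path.dirname(os.path.abspath(self._path))
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(self._data, f, indent=2)
            os.replace(tmp, self._path)  # swap the new file into place
        except BaseException:
            os.unlink(tmp)  # do not leave the temporary file behind
            raise


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "staticInfoConfig.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"Note": "initial"}, f)
    store = MetadataStore(path)
    # A manual edit while the service runs is not picked up ...
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"Note": "manual edit"}, f)
    # ... and is overwritten by the next API-driven change.
    store.update("Linktype", "direct")
    with open(path, encoding="utf-8") as f:
        on_disk = json.load(f)
print(on_disk)
```

The demo at the end shows the consequence discussed above: an edit made to the file while the service is running is silently lost on the next persisted update, which is why the behaviour needs to be documented.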

doc/dev/design/metadata-discovery.rst line 121 at r2 (raw file):

Previously, jiceatscion wrote…

What we could do is have configuration keys dedicated to overrides. For example, geolocation = and geolocation_override = . If the latter is present, then it applies and it cannot be auto-modified. We could define those keys for cases that make sense, and not for those that are only editable defaults, i.e. those that must be modifiable at run-time. If we find some third case (I don't know, something like an override that can be ignored in some cases), we can come up with some other suffix like "_preferred" for example. ...Aaanyway, it's minor stuff. We can figure it out as we go.

I added a section File Format that proposes an override attribute for all fields. However, I am currently not sure what the exact use case would be; it seems most cases can be covered without such an attribute...?

doc/dev/design/metadata-discovery.rst line 143 at r2 (raw file):

Previously, jiceatscion wrote…

Well, it's hard for me to point out what is wrong with the language (or with me). I just do not understand what you're trying to say here. It seems contradictory. How could some data have been autogenerated if it cannot be calculated? I think what you are trying to say is that the field should say something to the effect that the data is meant to be autogenerated but has not been determined yet. Is that what you meant? At least that is what your example reflects; that, I understand.

I removed that block; I'm no longer sure why I thought Completeness was desirable.
If there is no value then that is just that.

doc/dev/design/metadata-discovery.rst line 162 at r2 (raw file):

Adding more functionality to the CS is sort of a downside.

Indeed :)
I summarized that loosely under Disadvantages: Feature overload of the control service

doc/dev/design/metadata-discovery.rst line 18 at r3 (raw file):

Previously, jiceatscion wrote…

echo

Fixed.

Tilmann Zäschke added 2 commits May 2, 2025 13:17

metadata discovery

ce9fce6

metadata discovery

80db0fc

Tilmann Zäschke added 4 commits May 2, 2025 13:25

metadata discovery

6898d82

metadata discovery

92006d9

metadata discovery

e8c6f17

metadata discovery

399f767

tzaeschke mentioned this pull request May 2, 2025

control: automate detection of segment metadata #4742

Closed

tzaeschke added the i/proposal label (A new idea requiring additional input and discussion) May 2, 2025

jiceatscion requested changes May 2, 2025

feedback

5c986a4

tzaeschke commented May 5, 2025

jiceatscion requested changes May 19, 2025

Tilmann Zäschke added 3 commits May 20, 2025 17:16

Feedback from JC

f93ee03

Feedback from JC

5e24cdf

Feedback from JC

7936ceb

tzaeschke commented May 20, 2025
