-
Notifications
You must be signed in to change notification settings - Fork 1.7k
fix(config): prohibit multiline interpolation #22995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
B 8000 y clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Datadog ReportBranch report: ✅ 0 Failed, 7 Passed, 0 Skipped, 25.45s Total Time |
@@ -0,0 +1,297 @@ | |||
use std::{collections::HashMap, sync::LazyLock}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved vars.rs
to this location. But kept the same use/export.
deserialize_table(table) | ||
} | ||
|
||
pub fn resolve_environment_variables(table: Table) -> Result<Table, Vec<String>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was extracted from prepare_input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I'm glad to see us finally addressing this rough edge.
I think this works for string fields, but it doesn't seem to interpolate non-string fields correctly. For example:
sources:
source0:
type: demo_logs
count: ${COUNT}
format: json
decoding:
codec: json
framing:
method: bytes
interval: 1.0
sinks:
sink0:
inputs:
- source0
type: console
encoding:
codec: json
With COUNT=1
gives:
2025-05-06T21:40:58.830829Z INFO vector::app: Log level is enabled. level="info"
2025-05-06T21:40:58.831986Z INFO vector::app: Loading configs. paths=["/tmp/tmp.yaml"]
2025-05-06T21:40:58.838663Z ERROR vector::cli: Configuration error. error=invalid type: string "1", expected usize
in `sources.source0`
@@ -0,0 +1,20 @@ | |||
The env var (and secrets) resolution now happens after the config string is parsed into a TOML table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the config always parsed into a TOML table? This explains the weird behavior I've seen before where I get TOML errors for YAML configs. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, internally we call format::deserialize
which supports all formats and then we always convert to a TOML table after this step. I would expect this to translate to a config builder (out of scope for this PR).
|
||
pub fn resolve_environment_variables(table: Table) -> Result<Table, Vec<String>> { | ||
let mut vars = std::env::vars().collect::<HashMap<_, _>>(); | ||
if !vars.contains_key("HOSTNAME") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL this special case was here.
@@ -88,11 +89,10 @@ impl SecretBackendLoader { | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Currently (on master) we don't support multiple types for secrets.
2025-05-07T14:38:00.776983Z ERROR vector::cli: Configuration error. error=Error while retrieving secret from backend "backend_1": invalid type: integer `1`, expected a string at line 1 column 10.
However, interpolate_secrets
supports multiple types. I consider this out of scope for this PR but we could go further and allow e.g. integer secrets.
Fixed and added tests to cover this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks. This fixes my first example. I think the optimistic parsing as integers, though, isn't quite right. It doesn't seem to handle quoted strings where the value is parsable as an integer. For example, this doesn't work:
sources:
source0:
type: demo_logs
count: ${COUNT}
format: shuffle
lines: ["${LINE}"]
framing:
method: bytes
interval: 1.0
sinks:
sink0:
inputs:
- source0
type: console
encoding:
codec: json
It gives:
COUNT=1 LINE=1234 ./target/debug/vector -c /tmp/tmp.yaml
2025-05-07T15:31:36.925469Z INFO vector::app: Log level is enabled. level="info"
2025-05-07T15:31:36.926534Z INFO vector::app: Loading configs. paths=["/tmp/tmp.yaml"]
2025-05-07T15:31:36.932941Z ERROR vector::cli: Configuration error. error=invalid type: integer `1234`, expected a string
in `sources.source0`
Taking a look.This all goes back to my previous comment that this intermediate TOML table representation is problematic because it doesn't use type information from the config types. |
Summary
The envinronment variable (and the secrets) resolution now happens after the config string is parsed into a TOML table. The interpolation now happens on the table keys and values. As a side effect, this fixes a bug where comments referring to env vars (or secrets) that don't exist caused a config build error.
Change Type
Is this a breaking change?
Injecting whole blocks now results in error e.g.
${SOURCES_BLOCK}
The config above will fail to load.
How did you test this PR?
Vector config:
Output:
Does this PR include user facing changes?
Notes
@vectordotdev/vector
to reach out to us regarding this PR.pre-push
hook, please see this template.cargo fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo nextest run --workspace
(alternatively, you can runcargo test --all
)./scripts/check_changelog_fragments.sh
git merge origin master
andgit push
.Cargo.lock
), pleaserun
cargo vdev build licenses
to regenerate the license inventory and commit the changes (if any). More details here.References