8000 fix(config): prohibit multiline interpolation [WIP] by pront · Pull Request #22995 · vectordotdev/vector · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix(config): prohibit multiline interpolation [WIP] #22995

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

pront
Copy link
Member
@pront pront commented May 5, 2025

Summary

Warning

This PR has a very wide surface area. I will think if it is possible to split it.
Also, it is not actively worked on.
YAML config loading seems to be working fine, but TOML configs have a few bugs.

The envinronment variable (and the secrets) resolution now happens after the config string is parsed into a TOML table. The interpolation now happens on the table keys and values. As a side effect, this fixes a bug where comments referring to env vars (or secrets) that don't exist caused a config build error.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Injecting whole blocks now results in error e.g.

export SOURCES_BLOCK="sources:\"
  demo:
    type: demo_logs
    format: json
    interval: 1
${SOURCES_BLOCK}

The config above will fail to load.

How did you test this PR?

Vector config:

secret:
  backend_1:
    type: file
    path: /Users/pavlos.rontidis/CLionProjects/vector/pront/data/top_secret.json

# ${IN_COMMENT_BUT_DOES_NOT_EXIST}

sources:
  demo_logs_1:
    type: demo_logs
    format: json
    interval: 60
    # noop: "SECRET[i_dont_exist]"

transforms:
  t0:
    inputs:
      - demo_logs_1
    type: "remap"
    source: |
      .host = "${HOSTNAME}"
      .environment = "${ENV:?emv must be supplied}"
      .tenant = "${TENANT:-undefined}"
      .day = "SECRET[backend_1.day]"


sinks:
  s0:
    type: "SECRET[backend_1.type]"
    inputs: [ "t0" ]
    encoding:
      codec: json
      json:
        pretty: true

Output:

{
  "day": "monday",
  "environment": "${ENV:?emv must be supplied}",
  "host": "${HOSTNAME}",
  "message": "{\"host\":...",
  "service": "vector",
  "source_type": "demo_logs",
  "tenant": "${TENANT:-undefined}",
  "timestamp": "2025-05-05T20:52:42.024064Z"
}

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • The CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
      • ./scripts/check_changelog_fragments.sh
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run cargo vdev build licenses to regenerate the license inventory and commit the changes (if any). More details here.

References

@datadog-vectordotdev
Copy link
datadog-vectordotdev bot commented May 5, 2025

Datadog Report

Branch report: pront/intepolation-after-parsing
Commit report: 6204310
Test service: vector

✅ 0 Failed, 3 Passed, 0 Skipped, 2.13s Total Time

@pront pront changed the title fix(config): disallow multiline interpolation fix(config): prohibit multiline interpolation May 6, 2025
@@ -0,0 +1,297 @@
use std::{collections::HashMap, sync::LazyLock};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved vars.rs to this location. But kept the same use/export.

deserialize_table(table)
}

pub fn resolve_environment_variables(table: Table) -> Result<Table, Vec<String>> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was extracted from prepare_input

@pront pront marked this pull request as ready for review May 6, 2025 21:19
@pront pront requested a review from a team as a code owner May 6, 2025 21:19
@pront pront requested a review from jszwedko May 6, 2025 21:20
jszwedko

This comment was marked as outdated.

@@ -88,11 +89,10 @@ impl SecretBackendLoader {
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Currently (on master) we don't support multiple types for secrets.

2025-05-07T14:38:00.776983Z ERROR vector::cli: Configuration error. error=Error while retrieving secret from backend "backend_1": invalid type: integer `1`, expected a string at line 1 column 10.

However, interpolate_secrets supports multiple types. I consider this out of scope for this PR but we could go further and allow e.g. integer secrets.

@pront
Copy link
Member Author
pront commented May 7, 2025

With COUNT=1 gives:

2025-05-06T21:40:58.830829Z  INFO vector::app: Log level is enabled. level="info"
2025-05-06T21:40:58.831986Z  INFO vector::app: Loading configs. paths=["/tmp/tmp.yaml"]
2025-05-06T21:40:58.838663Z ERROR vector::cli: Configuration error. error=invalid type: string "1", expected usize

Fixed and added tests to cover this.

jszwedko

This comment was marked as outdated.

@pront

This comment was marked as outdated.

@pront pront changed the title fix(config): prohibit multiline interpolation fix(config): prohibit multiline interpolation [WIP] May 12, 2025
@pront
Copy link
Member Author
pront commented May 13, 2025

So there are some tests failures:

  1. existing e2e regression
  2. this PR broke some tests (no_proxy is set to a string but schema specifies array of strings)

Will fix (1) in a separate PR
Will fix (2) in this PR (after 1 is fixed)

@pront
Copy link
Member Author
pront commented Jun 18, 2025

Still testing this. I found at least one bug for TOML configs. YAML configs seems to be working fine.

@pront pront marked this pull request as draft June 19, 2025 15:15
@pront pront removed the request for review from a team June 20, 2025 13:49
@pront pront force-pushed the pront/intepolation-after-parsing branch from 62d735b to 2cfe741 Compare June 20, 2025 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Secrets in commented-out lines in vector.toml cause config validation to fail Commented environment variables are still attempted to be interpolated
2 participants
0