Store errored.tfstate in TF_DATA_DIR rather than working directory · Issue #37251 · hashicorp/terraform · GitHub
Store errored.tfstate in TF_DATA_DIR rather than working directory #37251
Open
@l3rady

Terraform Version

1.11.3

Use Cases

When using Terraform to apply changes to multiple workspaces in parallel, especially in automation or scripting scenarios, it's important to maintain workspace-specific state isolation. If an apply operation fails (e.g., due to expired credentials or runtime errors), Terraform writes the errored state to the working directory by default, rather than to the directory defined by the TF_DATA_DIR environment variable.

This behavior can lead to conflicts and state loss in parallelized environments where multiple applies are running simultaneously, each expecting to operate in isolated directories. Losing this state complicates recovery and troubleshooting, especially when applying to many workspaces concurrently.

My end goal is to ensure that each failed apply operation writes its errored state to a unique and predictable location that matches the rest of Terraform's directory structure and honors the workspace's TF_DATA_DIR.

Attempted Solutions

To parallelize Terraform applies across multiple workspaces, I used a custom Bash function that sets a temporary TF_DATA_DIR for each apply:

# Give this apply its own Terraform data directory.
TEMP_DIR=$(mktemp -d)
export TF_DATA_DIR="$TEMP_DIR"

# Reuse the initialized backend config, modules, and providers from WORKDIR.
cp "$WORKDIR/.terraform/terraform.tfstate" "$TEMP_DIR/terraform.tfstate"
ln -s "$WORKDIR/.terraform/modules" "$TEMP_DIR/modules"
ln -s "$WORKDIR/.terraform/providers" "$TEMP_DIR/providers"

terraform workspace select "$WS"
terraform apply -auto-approve "$PLAN_FILE"

This approach gives each apply operation its own data directory, so the workspace selection and cached backend configuration of one run do not interfere with another.
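For context, the fragment above could be wrapped in a function and fanned out roughly as follows. This is a minimal sketch, not the actual script behind this report: the names `apply_workspace` and `apply_all` and the `<workspace>.tfplan` plan file naming are hypothetical. It assumes `WORKDIR` is an absolute path to a directory that has already been initialized with `terraform init`.

```shell
# Sketch only: apply_workspace, apply_all and the "<workspace>.tfplan"
# naming are hypothetical. WORKDIR must be an absolute, initialized directory.

apply_workspace() {
  local ws="$1" plan_file="$2"
  (
    # Subshell: the cd and the exported TF_DATA_DIR stay local to this run.
    cd "$WORKDIR" || exit 1
    TF_DATA_DIR=$(mktemp -d) || exit 1
    export TF_DATA_DIR

    cp .terraform/terraform.tfstate "$TF_DATA_DIR/terraform.tfstate" || exit 1
    ln -s "$WORKDIR/.terraform/modules" "$TF_DATA_DIR/modules"
    ln -s "$WORKDIR/.terraform/providers" "$TF_DATA_DIR/providers"

    terraform workspace select "$ws" &&
      terraform apply -auto-approve "$plan_file"
  )
}

apply_all() {
  local ws pid pids="" failed=0
  for ws in "$@"; do
    apply_workspace "$ws" "$ws.tfplan" &
    pids="$pids $!"
  done
  # Wait for every background apply; report failure if any of them failed.
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

A call such as `apply_all dev staging prod` would start three applies at once. Each run gets its own `TF_DATA_DIR`, but all of them still share `$WORKDIR` as their working directory, which is where the problem described next comes from.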

However, when credentials expired mid-apply, Terraform failed to save the state to the backend, and the fallback errored state file was written to the working directory, not the TF_DATA_DIR. Because multiple applies were running in parallel, they all wrote to the same fallback location (e.g., errored.tfstate), overwriting each other. This resulted in all but one errored state being lost, leaving me with no way to recover or inspect the failed workspaces.
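The overwrite can be illustrated without Terraform at all. The sketch below is a simulation under stated assumptions: `fake_failed_apply` is a hypothetical stand-in for an apply whose state could not be saved, and it only writes a marker file named `errored.tfstate` into whichever directory it is given. It compares one shared directory (the current behavior) with one directory per run (the behavior requested here).

```shell
# Simulation only: fake_failed_apply is a hypothetical stand-in for a failed
# "terraform apply"; nothing here invokes Terraform.

fake_failed_apply() {
  local ws="$1" dest_dir="$2"
  printf 'state for %s\n' "$ws" > "$dest_dir/errored.tfstate"
}

# Print how many errored.tfstate files exist below a directory.
count_errored() {
  find "$1" -name errored.tfstate -type f | wc -l | tr -d ' '
}

demo() {
  local shared isolated ws
  shared=$(mktemp -d)     # plays the role of the shared working directory
  isolated=$(mktemp -d)   # parent of the per-run data directories

  for ws in dev staging prod; do
    fake_failed_apply "$ws" "$shared" &          # every run hits the same path
    mkdir -p "$isolated/$ws"
    fake_failed_apply "$ws" "$isolated/$ws" &    # one path per run
  done
  wait

  echo "shared working directory: $(count_errored "$shared") file(s)"
  echo "per-run data directories: $(count_errored "$isolated") file(s)"
}
```

Running `demo` reports a single file for the shared directory and three for the per-run directories: whichever simulated run finishes last wins in the shared case, and the other two states are gone.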

I attempted to mitigate this by managing TF state and data directories more carefully, but the fallback behavior is hardcoded and cannot be redirected.

Proposal

I propose that Terraform should either:

  1. Write the fallback errored.tfstate file to the configured TF_DATA_DIR, instead of the working directory; or
  2. Provide a configuration option (e.g., a CLI argument or environment variable like TF_ERROR_STATE_PATH) to explicitly control where the fallback errored state is written.

This change would ensure consistency with how Terraform otherwise respects TF_DATA_DIR and allow advanced users to safely manage parallel apply operations without risking state conflicts.

For example:

export TF_ERROR_STATE_PATH="$TF_DATA_DIR/errored.tfstate"

or automatically write:

$TF_DATA_DIR/errored.tfstate
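To make the two options concrete, here is one possible lookup order written as a small shell function. It is purely illustrative: `errored_state_path` is a hypothetical helper, `TF_ERROR_STATE_PATH` is only the variable name suggested in this issue, and the precedence shown (explicit override first, then `TF_DATA_DIR`, then the working directory) is an assumption rather than anything Terraform implements.

```shell
# Hypothetical: TF_ERROR_STATE_PATH and this lookup order are only the
# proposal from this issue, not existing Terraform behavior.
errored_state_path() {
  if [ -n "${TF_ERROR_STATE_PATH:-}" ]; then
    # Option 2: an explicit override wins.
    printf '%s\n' "$TF_ERROR_STATE_PATH"
  elif [ -n "${TF_DATA_DIR:-}" ]; then
    # Option 1: follow the configured data directory.
    printf '%s\n' "$TF_DATA_DIR/errored.tfstate"
  else
    # Neither is set: fall back to the working directory, as today.
    printf '%s\n' "./errored.tfstate"
  fi
}
```

With this order, existing setups that set neither variable would see no change, while a script that already exports a unique `TF_DATA_DIR` per run would get a unique errored state path without further configuration.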

This is a niche use case, and I understand it may not be a priority. However, for teams using Terraform in parallel workflows, especially with large workspace counts and CI/CD automation, this small change would dramatically improve reliability and reduce recovery pain during partial apply failures.

Thank you for considering this request, and for your excellent work on Terraform!

References

No response
