8000 DEV-877: Split ignorer logic for Smart Build into 2 different functions according to the flag, and use a temporary file for the SHA calculation by ifbyol · Pull Request #4689 · okteto/okteto · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

DEV-877: Split ignorer logic for Smart Build into 2 different functions according to the flag, and use a temporary file for the SHA calculation #4689

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 25, 2025

Conversation

ifbyol
Copy link
Member
@ifbyol ifbyol commented Mar 24, 2025

Proposed changes

After the changes introduced to ignore defined in .dockerignore files in the Smart Build logics here, it turns out that for big repos with a lot of files, the command executed to calculate the SHA was failing, provoking that the image is built every time.

Even if there was a feature flag for the feature, the bug comes in a common place, so any could be affected independently of enabling the feature or not.

In this PR we are doing 2 things:

  • Only change the logic to calculate things when the feature flag is enabled. If disabled, old logic would be executed (restored functions as they were before the changes in the Smart Builds logic). The idea of this is to prevent users which don't have the feature enabled to have some impact due to the new functionality
  • Fix the way dir SHA is calculated to prevent that big repos can calculate it properly. We are creating a temporary file with the content of the git ls-objects command, to then, pass it to the git hash-object function, instead of doing a pipe as we were doing before

How to validate

There are a couple of ways for validating it.

You need to play with the environment variable OKTETO_SMART_BUILDS_IGNORE_FILES_ENABLED to enable and disable.

When the environment variable OKTETO_SMART_BUILDS_IGNORE_FILES_ENABLED set to false, if you run a build using the log-level debug you will see how this command is executed: /usr/bin/git --no-optional-locks ls-files -s <path-to-some-folder> | /usr/bin/git --no-optional-locks hash-object --stdin. This is to calculate the SHA of the directory.

Prior to this change, you can see how the following commands are executed:

  • /usr/bin/git --no-optional-locks ls-files -s. To list all the files in the directory
  • echo <filtered output of previous command> | /usr/bin/git --no-optional-locks hash-object --stdin To get the SHA

When the list of files was too long, it was generating an error, so image was being built always. With this PR, if the flag is disabled, we restore the original way of calculating the SHA, as no function to ignore files has to be applied, so no need to change the commands to execute.

Then, if you change the value of the environment variable OKTETO_SMART_BUILDS_IGNORE_FILES_ENABLED to true, and you see the logs, you will see now how the command to be used is different, fixing the issue when the repository has too many files.

You can see how now the command /usr/bin/git --no-optional-locks ls-files -s is executed first to get the list of files, and then, /usr/bin/git --no-optional-locks hash-object <base-path>/.okteto/git-untracked-backend-1742893596755404000 to calculate the SHA of the directory. It is using a temporary file to store the filtered output of the git ls-files command to prevent the issue of passing a huge argument to the exec command

ifbyol added 3 commits March 24, 2025 12:39
…g is disabled

Signed-off-by: Nacho Fuertes <nacho@okteto.com>
Signed-off-by: Nacho Fuertes <nacho@okteto.com>
Signed-off-by: Nacho Fuertes <nacho@okteto.com>
@ifbyol ifbyol added release/bug-fix run-e2e When used on a PR run windows & unix e2e backport release-3.9 Backport this PR to CLI versions 3.9 labels Mar 24, 2025
@ifbyol ifbyol requested a review from a team as a code owner March 24, 2025 19:02
Comment on lines -325 to +342
relpath := resolveRelativePath(filepath.Join(worktree.GetRoot(), f), contextDir)
shouldIgnore, err := r.ignorer(contextDir).Ignore(relpath)
if r.shouldIgnoreFiles {
worktree, err := repo.Worktree()
if err != nil {
oktetoLog.Debugf("ignore error in GetDiffHash: %v", err)
}
if !shouldIgnore {
filteredUntrackedFiles = append(filteredUntrackedFiles, f)
oktetoLog.Debugf("Failed to get worktree, skipping calculation of files to ignore: %v", err)
} else {
oktetoLog.Debugf("skipping %v file change in GetDiffHash based on known ignore files", f)
filteredUntrackedFiles := []string{}
for _, f := range untrackedFiles {
relpath := resolveRelativePath(filepath.Join(worktree.GetRoot(), f), contextDir)
shouldIgnore, err := r.ignorer(contextDir).Ignore(relpath)
if err != nil {
oktetoLog.Debugf("ignore error in GetDiffHash: %v", err)
}
if !shouldIgnore {
filteredUntrackedFiles = append(filteredUntrackedFiles, f)
} else {
oktetoLog.Debugf("skipping %v file change in GetDiffHash based on known ignore files", f)
}

}

untrackedFiles = filteredUntrackedFiles
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the new path. It is only considered when the feature flag is enabled

Comment on lines 367 to 386
f, err := lg.fs.Create(filepath.Join(config.GetOktetoHome(), fmt.Sprintf("git-untracked-%s-%d", filepath.Base(dirPath), time.Now().UnixNano())))
if err != nil {
oktetoLog.Debugf("failed to create untracked file: %v", err)
return "", err
}

defer func() {
err := lg.fs.Remove(f.Name())
if err != nil {
oktetoLog.Debugf("failed to close untracked file: %v", err)
}
}()

_, err = f.WriteString(filteredLS)
if err != nil {
oktetoLog.Debugf("failed to write untracked file: %v", err)
return "", err
}

hashObjectCmdArgs := []string{"--no-optional-locks", "hash-object", f.Name()}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the new logic to create a temporary file with the output of the ls-files command, instead of doing a pipe (the problem with big repos is that we were exceeding the maximum length of an argument

return handleError(err)
}
return string(output), nil
}

// getDirContentSHAWithoutIgnore calculates the SHA of the content of the given directory using git ls-files and git hash-object
func (lg *LocalGit) getDirContentSHAWithoutIgnore(ctx context.Context, gitPath, dirPath string, fixAttempt int) (string, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the content of the function GitDirContent before the change. It can be found here

}

// diffWithoutIgnore calculates the diff of the repository at the given path
func (lg *LocalGit) diffWithoutIgnore(ctx context.Context, gitPath, dirPath string, fixAttempt int) (string, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the implementation of the function Diff as it was before the problematic changes. It can be found here

)

type mockLocalGit struct {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we already have a mockLocalExec, I wasn't be able to use it as I had to mock 2 calls to RunCommand in the same test, and it wasn't allowing it. So, instead of modifying all the tests, I created another implementation using testify mocks for simplicity

ifbyol added 2 commits March 24, 2025 20:10
Signed-off-by: Nacho Fuertes <nacho@okteto.com>
Signed-off-by: Nacho Fuertes <nacho@okteto.com>
Copy link
codecov bot commented Mar 24, 2025

Codecov Report

Attention: Patch coverage is 48.91304% with 47 lines in your changes missing coverage. Please review.

Project coverage is 48.79%. Comparing base (d7ab9e5) to head (b458dfb).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4689      +/-   ##
==========================================
+ Coverage   48.74%   48.79%   +0.05%     
==========================================
  Files         354      354              
  Lines       29401    29468      +67     
==========================================
+ Hits        14332    14380      +48     
- Misses      13936    13946      +10     
- Partials     1133     1142       +9     
🚀 New features to boost your workflow:
  • ❄ 8000 ️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

hashObjectCmdArgs := []string{"--no-optional-locks", "hash-object", "--stdin"}
output, err := lg.exec.RunPipeCommands(ctx, gitPath, "echo", []string{filteredLS}, lg.gitPath, hashObjectCmdArgs)
// As the list of files might be too long, we write it to a file and calculate the hash of the file
f, err := lg.fs.Create(filepath.Join(config.GetOktetoHome(), fmt.Sprintf("git-untracked-%s-%d", filepath.Base(dirPath), time.Now().UnixNano())))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know if I like this. I think it should be preferable to write into a temple instead of okteto home.

If we have a problem removing it if we do it in a temp file when the computer restarts it will be deleted, while if we do this in the okteto home we could be increasing the folder

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this proble 8000 m in other parts of the CLI like the custom kubeconfig for the deploys. We should unify and standardize how we do it, but I don't know if we should do it as part of this PR or not

Co-authored-by: Javier López Barba <javier@okteto.com>
@ifbyol ifbyol merged commit 2c924ed into master Mar 25, 2025
23 of 24 checks passed
@ifbyol ifbyol deleted the ifbyol/dev-877 branch March 25, 2025 15:04
github-actions bot pushed a commit that referenced this pull request Mar 25, 2025
…ns according to the flag, and use a temporary file for the SHA calculation (#4689)

* DEV-877: Made changes to not change the default behavior when the flag is disabled

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Fix format of the file

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Speed up SHA calculation creating a temporary file

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Changed mock to see if a windows unit test is fixed

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Fix indentation of test file

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* Update pkg/repository/local_git.go

Co-authored-by: Javier López Barba <javier@okteto.com>

---------

Signed-off-by: Nacho Fuertes <nacho@okteto.com>
Co-authored-by: Javier López Barba <javier@okteto.com>
(cherry picked from commit 2c924ed)
ifbyol added a commit that referenced this pull request Mar 25, 2025
…ns according to the flag, and use a temporary file for the SHA calculation (#4689) (#4690)

* DEV-877: Made changes to not change the default behavior when the flag is disabled

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Fix format of the file

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Speed up SHA calculation creating a temporary file

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Changed mock to see if a windows unit test is fixed

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* DEV-877: Fix indentation of test file

Signed-off-by: Nacho Fuertes <nacho@okteto.com>

* Update pkg/repository/local_git.go

Co-authored-by: Javier López Barba <javier@okteto.com>

---------

Signed-off-by: Nacho Fuertes <nacho@okteto.com>
Co-authored-by: Javier López Barba <javier@okteto.com>
(cherry picked from commit 2c924ed)

Co-authored-by: Ignacio Fuertes <nacho@okteto.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport release-3.9 Backport this PR to CLI versions 3.9 release/bug-fix run-e2e When used on a PR run windows & unix e2e
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0