Fix train state and modification time for unfinished project training #722
If the initial training of a project is interrupted after files have already been created in the project's data directory, the train state and modification time are reported incorrectly: the project appears to be (fully) trained when it is not, and it has a modification time. Likewise, when retraining a project is interrupted, the modification time is incorrectly updated.
This problem may occur more often if the `--prepare-only` option is implemented for the train command.
This PR makes the methods that query the train state and modification time ignore files in the project's datadir that match the patterns `*-train*`, `tmp-*` and `vectorizer`. The `tmp-` prefix is added to all temporary files, because some backends use a tempfile for the model file during training, which can remain after unfinished training, e.g. `stwfsa_predictor1mz8z4im.zip`.
The train and temp files should definitely be ignored, but the vectorizer file case is not so clear:
Instead of using global ignore patterns, this functionality could use the actual model file names/patterns per backend, but the field storing them varies a bit (`MODEL_FILE`, `INDEX_FILE`, `MODEL_FILE_PREFIX`).