8000 ModelCheckpoint should throw an exception/provide warning if non-existing metric is being monitored · Issue #21109 · keras-team/keras · GitHub


Open
ladi-pomsar opened this issue Mar 30, 2025 · 2 comments

Comments

@ladi-pomsar
ladi-pomsar commented Mar 30, 2025

Hi everyone,

I have been playing around with keras==3.9.1 and noticed one thing: if you make a typo in the metric name that ModelCheckpoint monitors, you don't get a hard stop. A message is logged, but it apparently requires a bit more setup to get it shown in a Jupyter notebook. What is even more interesting is that I had the checkpoint's verbosity set to 1 and it still didn't show up.

So, the situation:

My checkpoint was:

checkpoint = ModelCheckpoint(str(checkpoint_filepath), monitor='F1', mode='max', verbose=1, save_best_only=True, save_weights_only=True)

but I had already migrated from my custom F1 method to keras.metrics.F1Score, so the name of the metric had changed:

Epoch 1/200
3/3 ━━━━━━━━━━━━━━━━━━━━ 6s 901ms/step - accuracy: 0.5128 - f1_score: 0.7160 - loss: 0.7154 - precision: 0.5620 - recall: 0.5987 - val_accuracy: 0.5385 - val_f1_score: 0.7000 - val_loss: 0.6556 - val_precision: 0.5385 - val_recall: 1.0000

Epoch 2/200
3/3 ━━━━━━━━━━━━━━━━━━━━ 2s 690ms/step - accuracy: 0.6355 - f1_score: 0.7259 - loss: 0.6447 - precision: 0.6251 - recall: 0.9000 - val_accuracy: 0.7308 - val_f1_score: 0.7000 - val_loss: 0.6110 - val_precision: 0.6667 - val_recall: 1.0000

Epoch 3/200
3/3 ━━━━━━━━━━━━━━━━━━━━ 2s 716ms/step - accuracy: 0.7948 - f1_score: 0.7353 - loss: 0.5620 - precision: 0.7538 - recall: 0.9599 - val_accuracy: 0.6923 - val_f1_score: 0.7000 - val_loss: 0.5420 - val_precision: 0.6875 - val_recall: 0.7857

I only learned about this once I tried to load the weights and h5py threw:

`File h5py/h5f.pyx:102, in h5py.h5f.open()

FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = '/home/.../best-models/lstm.weights.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)`

as, unsurprisingly, no weights had been saved.

I am open to discussion, but I would be in favour of either throwing a hard exception when the metric doesn't exist, or always showing this warning unless it is explicitly turned off (verbose=-1?). I can imagine there are some advanced, expensive metrics that people with custom checkpoint callbacks want to calculate only once in a while, and a hard exception might break their setup.
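For reference, the silent-skip path can be sketched in plain Python. This is a minimal sketch, assuming the callback fetches the monitored value with `logs.get(monitor)` and only warns when it is missing; `should_save` is a hypothetical helper for illustration, not a Keras API:

```python
import warnings

def should_save(monitor, logs, best):
    """Return True if the checkpoint would be written this epoch (mode='max').

    Sketch of the assumed ModelCheckpoint behavior: a missing metric
    name only triggers a warning, never an exception.
    """
    current = logs.get(monitor)  # None when the metric name is misspelled
    if current is None:
        warnings.warn(
            f"Can save best model only with {monitor} available, skipping."
        )
        return False
    return current > best

# The epoch logs only ever contain 'f1_score' / 'val_f1_score', never 'F1',
# so the typo'd monitor silently skips every save.
epoch_logs = {"f1_score": 0.7259, "val_f1_score": 0.70, "loss": 0.6447}
print(should_save("F1", epoch_logs, best=float("-inf")))        # False
print(should_save("f1_score", epoch_logs, best=float("-inf")))  # True
```

With the warning buried (or swallowed by the notebook), the observable behavior is exactly what I saw: training finishes normally and the weights file simply never appears.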

@ladi-pomsar ladi-pomsar changed the title ModelCheckpoint should throw an exception if non-existing metric is being monitored ModelCheckpoint should throw an exception/provide warning if non-existing metric is being monitored Mar 30, 2025
@dhantule
Contributor

Hi @ladi-pomsar, thanks for reporting this.
Could you provide some reproducible code?

github-actions bot commented May 1, 2025

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label May 1, 2025