which can happen when failing to obtain the guest's migration lock.
This led to a lot of mails being sent during migration, because the
timeout for obtaining the lock is only 2 seconds and the attempt is
run in a loop.
One could argue that failing to obtain the lock should increase the
fail count, but without the lock, the job state should not be touched
and even the first three mails upon migration could be considered spam.
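
For illustration, a minimal standalone sketch of the resulting mail
decision. The name should_mail_at_failcount and the constant 48 (retries
per day at one retry every half hour) are assumptions made for this
example; the back-off loop is one possible reading of the "1, 2, 4, 8,
etc. days" comment and is not necessarily the code in the tree.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the helper touched by this patch.
sub should_mail_at_failcount {
    my ($fail_count) = @_;

    # Lock was not obtained: job state untouched, so never send a mail.
    return 0 if $fail_count == 0;

    # The first few real failures are always reported.
    return 1 if $fail_count <= 3;

    # Assumption: 48 retries per day. Send a mail only when the count
    # is exactly 48 * 2^k, i.e. after 1, 2, 4, 8, ... days of failures.
    my $i = 1;
    $i *= 2 while $i * 48 < $fail_count;
    return $i * 48 == $fail_count ? 1 : 0;
}

# Print the decision for a few sample fail counts.
for my $n (0, 1, 3, 4, 48, 96, 100) {
    printf "%3d -> %d\n", $n, should_mail_at_failcount($n);
}
```

With a fail count of 0 the sketch never sends a mail, so repeated lock
timeouts during migration stay silent, while counts 1 to 3 are still
always reported.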
Fixes: e6b8af20 ("replication: sent always mail for first three tries and move helper")
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
my sub _should_mail_at_failcount {
my ($fail_count) = @_;
+ # avoid spam during migration (bug #4111): when failing to obtain the guest's migration lock,
+ # fail_count will be 0
+ return 0 if $fail_count == 0;
+
return 1 if $fail_count <= 3; # always send the first few for better visibility of the issue
# failing job is re-tried every half hour, try to send one mail after 1, 2, 4, 8, etc. days