git.proxmox.com Git - mirror_ubuntu-jammy-kernel.git/commit
net/mlx5: Fix deadlock in sync reset flow
author Moshe Shemesh <moshe@nvidia.com>
Mon, 11 Apr 2022 18:31:06 +0000 (21:31 +0300)
committer Stefan Bader <stefan.bader@canonical.com>
Wed, 22 Jun 2022 12:23:16 +0000 (14:23 +0200)
commit dd6cf49fa66a0fb875c4bf90d7d530a4f5926508
tree 6b4c09b42e42e166334763b3887b114ba8f9f5f4
parent de29908674dce73fa5c334ee596eb3e487d6d71c

BugLink: https://bugs.launchpad.net/bugs/1978240
commit cb7786a76ea39f394f0a059787fe24fa8e340fb6 upstream.

The sync reset flow can deadlock when poll_sync_reset() is called from
the timer softirq and then waits in del_timer_sync() for that same
timer, that is, for its own callback to finish. Fix that by moving the
part of the flow that waits for the timer to reset_reload_work.
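
The pattern described above can be illustrated with a minimal userspace
sketch. It is a single-threaded model, not the driver code: every name
in it (model_timer, model_del_timer_sync, buggy_poll, fixed_poll,
reload_work, run_scenario) is hypothetical, and the model reports the
self-wait instead of actually hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_timer {
    bool armed;             /* timer is still pending */
    bool callback_running;  /* its callback is executing right now */
    void (*fn)(struct model_timer *t);
};

/* Single deferred-work slot standing in for a workqueue item. */
static void (*pending_work)(struct model_timer *t);
static int self_waits_detected;

/*
 * Models a synchronous timer delete: it has to wait until a running
 * callback finishes. Called from inside that callback, the wait could
 * never end, so the model records the problem and returns -1 instead
 * of spinning forever.
 */
static int model_del_timer_sync(struct model_timer *t)
{
    if (t->callback_running) {
        self_waits_detected++;
        return -1;
    }
    t->armed = false;
    return 0;
}

/* Buggy callback: synchronously stops the timer it is running from. */
static void buggy_poll(struct model_timer *t)
{
    model_del_timer_sync(t);
}

/* Work item: runs outside the callback, so waiting is safe here. */
static void reload_work(struct model_timer *t)
{
    model_del_timer_sync(t);
}

/* Fixed callback: only schedules the work and returns. */
static void fixed_poll(struct model_timer *t)
{
    (void)t;
    pending_work = reload_work;
}

/* Fire the timer once, then run any work the callback deferred. */
static void fire_timer(struct model_timer *t)
{
    assert(t->fn != NULL);
    t->callback_running = true;
    t->fn(t);
    t->callback_running = false;

    if (pending_work) {
        void (*work)(struct model_timer *) = pending_work;
        pending_work = NULL;
        work(t);
    }
}

/*
 * Returns -1 if the callback waited on its own timer, 0 if the timer
 * was stopped cleanly, 1 if it was left armed without a self-wait.
 */
int run_scenario(bool use_fix)
{
    struct model_timer t = {
        .armed = true,
        .callback_running = false,
        .fn = use_fix ? fixed_poll : buggy_poll,
    };

    self_waits_detected = 0;
    pending_work = NULL;
    fire_timer(&t);

    if (self_waits_detected)
        return -1;
    return t.armed ? 1 : 0;
}
```

In this model, run_scenario(false) takes the self-waiting path and
run_scenario(true) stops the timer from the deferred work item, which
mirrors the idea of moving the wait out of the timer callback.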

This fixes the following kernel trace:
RIP: 0010:del_timer_sync+0x32/0x40
...
Call Trace:
 <IRQ>
 mlx5_sync_reset_clear_reset_requested+0x26/0x50 [mlx5_core]
 poll_sync_reset.cold+0x36/0x52 [mlx5_core]
 call_timer_fn+0x32/0x130
 __run_timers.part.0+0x180/0x280
 ? tick_sched_handle+0x33/0x60
 ? tick_sched_timer+0x3d/0x80
 ? ktime_get+0x3e/0xa0
 run_timer_softirq+0x2a/0x50
 __do_softirq+0xe1/0x2d6
 ? hrtimer_interrupt+0x136/0x220
 irq_exit+0xae/0xb0
 smp_apic_timer_interrupt+0x7b/0x140
 apic_timer_interrupt+0xf/0x20
 </IRQ>

Fixes: 3c5193a87b0f ("net/mlx5: Use del_timer_sync in fw reset flow of halting poll")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c