The mcp251xfd driver uses a TX FIFO for sending CAN frames and a TX Event FIFO
(TEF) for completed TX-requests.
The TEF event handling in the mcp251xfd_handle_tefif() function has a race
condition. It first increments the tx-ring's tail counter to signal that
there's room in the TX and TEF FIFO, then it increments the TEF FIFO in
hardware.
A running mcp251xfd_start_xmit() on a different CPU might not stop the txqueue
(as the tx-ring still shows free space). The next mcp251xfd_start_xmit() will
push a message into the chip and the TX complete event might overflow the TEF
FIFO.
This patch changes the order to fix the problem.
Fixes: 68c0c1c7f966 ("can: mcp251xfd: tef-path: reduce number of SPI core requests to set UINC bit")
Link: https://lore.kernel.org/r/20210105214138.3150886-2-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
struct mcp251xfd_tx_ring *tx_ring = priv->tx;
struct spi_transfer *last_xfer;
- tx_ring->tail += len;
-
/* Increment the TEF FIFO tail pointer 'len' times in
* a single SPI message.
- */
-
- /* Note:
+ *
+ * Note:
*
* "cs_change == 1" on the last transfer results in an
* active chip select after the complete SPI
if (err)
return err;
+ tx_ring->tail += len;
+
err = mcp251xfd_check_tef_tail(priv);
if (err)
return err;