Andiry's xHCI bus suspend patch introduced the possibility of a host
controller replaying old commands on the command ring, if the host
successfully restores the registers after a resume.
After a resume from suspend, the xHCI driver must restore the registers,
including the command ring pointer. I had suggested that Andiry set the
command ring pointer to the current command ring dequeue pointer, so that
the driver wouldn't have to zero the command ring.
Unfortunately, setting the command ring pointer to the current dequeue
pointer won't work because the register assumes the pointer is 64-byte
aligned, and TRBs on the command ring are 16-byte aligned. The lower
six bits will always be masked off, leading to the written pointer being
up to 3 TRBs behind the intended pointer.
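The arithmetic can be checked with a small user-space sketch. It is not
driver code: the names crcr_effective_ptr(), trbs_behind() and
CRCR_PTR_MASK are made up for illustration, and it assumes only that
64-byte alignment means the register drops the low six address bits.

```c
#include <assert.h>
#include <stdint.h>

#define TRB_SIZE      16u                /* bytes per TRB */
#define CRCR_PTR_MASK (~(uint64_t)0x3f)  /* hypothetical name: keep bits 63:6 */

/* Address the register would hold after the low bits are masked off. */
static uint64_t crcr_effective_ptr(uint64_t deq_dma)
{
	return deq_dma & CRCR_PTR_MASK;
}

/* How many TRBs the masked pointer lags behind the intended one. */
static unsigned int trbs_behind(uint64_t deq_dma)
{
	assert(deq_dma % TRB_SIZE == 0); /* TRBs are 16-byte aligned */
	return (unsigned int)((deq_dma - crcr_effective_ptr(deq_dma)) / TRB_SIZE);
}
```

For the dequeue pointer in the log below, crcr_effective_ptr(0x3781e010)
yields 0x3781e000, one TRB behind. The worst case is an offset of 0x30
into a 64-byte block, which is three TRBs behind.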
Here's a log excerpt. On init, the xHCI driver places a vendor-specific
command on the command ring:
[ 215.750958] xhci_hcd 0000:01:00.0: Vendor specific event TRB type = 48
[ 215.750960] xhci_hcd 0000:01:00.0: NEC firmware version 30.25
[ 215.750962] xhci_hcd 0000:01:00.0: Command ring deq = 0x3781e010 (DMA)
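When we resume, the command ring dequeue pointer to be written should have
been 0x3781e010. Instead, it's 0x3781e000 (the trailing 1 in the logged
value is the ring cycle state flag in bit 0, not part of the address):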
[ 235.557846] xhci_hcd 0000:01:00.0: // Setting command ring address to 0x3781e001
[ 235.557848] xhci_hcd 0000:01:00.0: `MEM_WRITE_DWORD(3'b000, 64'hffffc900100bc038, 64'h3781e001, 4'hf);
[ 235.557850] xhci_hcd 0000:01:00.0: `MEM_WRITE_DWORD(3'b000, 32'hffffc900100bc020, 32'h204, 4'hf);
[ 235.557866] usb usb9: root hub lost power or was reset
(I can't see the results of this bug because the xHCI restore always fails
on this box, and the xHCI driver re-allocates everything.)
The fix is to zero the command ring and put the software and hardware
enqueue and dequeue pointers back to the beginning of the ring. We do this
before the system suspends, to be paranoid and prevent the BIOS from
starting the host without clearing the command ring pointer, which might
cause the host to muck with stale memory. (The pointer isn't required to
be in the suspend power well, but it could be.) The command ring pointer
is set again after the host resumes.
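A rough user-space model of that reset, for illustration only. The toy_*
types and functions and the 64-entry single-segment ring are assumptions
made for this sketch, not the driver's structures. It also assumes the
last TRB in the segment is a link TRB that is kept, with only its cycle
bit (bit 0 of the fourth dword) cleared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TRBS_PER_SEG 64 /* hypothetical ring size for this sketch */

struct toy_trb {
	uint32_t field[4]; /* 16 bytes, same size as a real TRB */
};

struct toy_cmd_ring {
	struct toy_trb trbs[TRBS_PER_SEG];
	struct toy_trb *enqueue; /* software enqueue pointer */
	struct toy_trb *dequeue; /* software dequeue pointer */
	uint32_t cycle_state;
};

/* Zero stale commands and rewind the software pointers to the ring start. */
static void toy_clear_cmd_ring(struct toy_cmd_ring *ring)
{
	/* Wipe every TRB except the trailing link TRB. */
	memset(ring->trbs, 0, sizeof(struct toy_trb) * (TRBS_PER_SEG - 1));
	/* Clear the link TRB's cycle bit so it reads as not yet owned by hardware. */
	ring->trbs[TRBS_PER_SEG - 1].field[3] &= ~(uint32_t)1;

	ring->enqueue = ring->trbs;
	ring->dequeue = ring->trbs;
	ring->cycle_state = 1;
}

/* Value to write to the command ring pointer register after resume. */
static uint64_t toy_crcr_value(uint64_t ring_base_dma, uint32_t cycle_state)
{
	assert(ring_base_dma % 64 == 0); /* ring start is 64-byte aligned */
	return ring_base_dma | (cycle_state & 1u);
}
```

Because the pointer is always rewound to the 64-byte-aligned start of the
ring, the masking problem described above cannot occur. With the ring base
from the log, toy_crcr_value(0x3781e000, 1) gives 0x3781e001.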
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Tested-by: Andiry Xu <andiry.xu@amd.com>