Re: [PATCH] pci: Avoid FLR for AMD FCH AHCI adapters

On Mon, Jan 30, 2023 at 09:21:11AM -0600, Bjorn Helgaas wrote:
> [+cc Mario, Shyam, Brijesh]
> 
> On Sat, Jan 28, 2023 at 10:39:51AM +0900, Damien Le Moal wrote:
> > PCI passthrough to VMs does not work with AMD FCH AHCI adapters: the
> > guest OS fails to correctly probe devices attached to the controller due
> > to FIS communication failures. 
> 
> What does a FIS communication failure look like?  Can we include a
> line or two of dmesg output here to help users find this fix?

Hello Bjorn,

It looks like this:

[   22.402368] ata4: softreset failed (1st FIS failed)
[   32.417855] ata4: softreset failed (1st FIS failed)
[   67.441641] ata4: softreset failed (1st FIS failed)
[   67.453227] ata4: limiting SATA link speed to 3.0 Gbps
[   72.661738] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   78.121263] ata4.00: qc timeout after 5000 msecs (cmd 0xec)
[   78.134413] ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
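
To check whether a guest is hitting this, one could search its kernel log
for the softreset line above. A minimal sketch follows; the function name
has_fis_failure is only illustrative, and the same message can show up for
unrelated link problems, so a match is a hint rather than proof:

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): exit 0 if the kernel log
# read from stdin contains the softreset failure line quoted above.
has_fis_failure() {
    # -F: match the string literally, -q: no output, exit status only
    grep -qF 'softreset failed (1st FIS failed)'
}

# Usage inside the guest:
#   dmesg | has_fis_failure && echo "1st FIS softreset failure seen"
```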

Basically, we can read and write the MMIO registers of the AHCI HBA,
but communication between the AHCI HBA and the ATA device does not
work properly, because the AHCI HBA did not get reset when
binding/unbinding the device.

The exact same kernel, using the same generic AHCI driver within the VM,
can communicate perfectly fine when using e.g. an Intel AHCI HBA.

(With both the AMD and Intel AHCI HBAs being bound to the vfio-pci driver
in the host.)
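
For anyone who wants to try this on the host before a fixed kernel is
available, here is a rough sketch of checking the reset_method sysfs
attribute and switching it to "bus" when "flr" is listed. The function
name and the optional second argument (an alternative sysfs root, only
there so it can be tried on a scratch directory) are made up for
illustration; the PCI address has to be filled in by the user:

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the reset methods
# currently enabled for a PCI device and select "bus" if "flr" is listed.
# $1: PCI address of the device (the <ID> part of the sysfs path)
# $2: optional sysfs root, defaults to /sys/bus/pci/devices
force_bus_reset() {
    attr="${2:-/sys/bus/pci/devices}/$1/reset_method"
    [ -f "$attr" ] || { echo "no reset_method for $1" >&2; return 1; }
    methods=$(cat "$attr")
    echo "current: $methods"
    case " $methods " in
        # Only rewrite the attribute when flr is among the enabled methods
        *" flr "*) echo bus > "$attr" ;;
    esac
}
```

This needs to run as root, before unbinding the adapter and binding it to
vfio-pci.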

We can send a v2 with the above dmesg output.


Kind regards,
Niklas

> 
> AMD folks: Can you confirm/deny that this is a hardware erratum in
> this device?  Do you know of any other devices that need a similar
> workaround?
> 
> > Forcing the "bus" reset method before
> > unbinding & binding the adapter to the vfio-pci driver solves this
> > issue. I.e.:
> > 
> > echo "bus" > /sys/bus/pci/devices/<ID>/reset_method
> > 
> > gives a working guest OS, thus indicating that the default flr reset
> > method is defective, resulting in the adapter not being reset correctly.
> > 
> > This patch applies the no_flr quirk to AMD FCH AHCI devices to
> > permanently solve this issue.
> > 
> > Reported-by: Niklas Cassel <niklas.cassel@xxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/pci/quirks.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 285acc4aaccc..20ac67d59034 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -5340,6 +5340,7 @@ static void quirk_no_flr(struct pci_dev *dev)
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1487, quirk_no_flr);
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x148c, quirk_no_flr);
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr);
> > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr);
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
> >  
> > -- 
> > 2.39.1
> > 


