On 8/9/2023 8:24 PM, Bjorn Helgaas wrote:
> On Wed, Aug 09, 2023 at 04:14:06PM -0700, Patel, Nirmal wrote:
>> On 8/9/2023 3:00 PM, Bjorn Helgaas wrote:
>>> On Wed, Aug 09, 2023 at 05:14:54PM -0400, Nirmal Patel wrote:
>>>> During domain reset process vmd_domain_reset() clears PCI
>>>> configuration space of VMD root ports. But certain platform
>>>> has observed following errors and failed to boot.
>>>> ...
>>>> DMAR: VT-d detected Invalidation Queue Error: Reason f
>>>> DMAR: VT-d detected Invalidation Time-out Error: SID ffff
>>>> DMAR: VT-d detected Invalidation Completion Error: SID ffff
>>>> DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0
>>>> DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0
>>>> DMAR: Invalidation Time-out Error (ITE) cleared
>>>>
>>>> The root cause is that memset_io() clears prefetchable memory base/limit
>>>> registers and prefetchable base/limit 32 bits registers sequentially.
>>>> This seems to be enabling prefetchable memory if the device disabled
>>>> prefetchable memory originally.
>>>>
>>>> Here is an example (before memset_io()):
>>>>
>>>> PCI configuration space for 10000:00:00.0:
>>>> 86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00
>>>> 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20
>>>> 00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00
>>>> ...
>>>>
>>>> So, prefetchable memory is ffffffff00000000-575000fffff, which is
>>>> disabled. When memset_io() clears prefetchable base 32 bits register,
>>>> the prefetchable memory becomes 0000000000000000-575000fffff, which is
>>>> enabled and incorrect.
>>> It's not clear to me how this window config causes the VT-d errors.
>>> But empirically it seems to be related, and maybe that's enough.
>>>
>>>> Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec:
>>>>
>>>> The Prefetchable Memory Limit register must be programmed to a smaller
>>>> value than the Prefetchable Memory Base register if there is no
>>>> prefetchable memory on the secondary side of the bridge.
>>>>
>>>> This is believed to be the reason for the failure and in addition the
>>>> sequence of operation in vmd_domain_reset() is not following the PCIe
>>>> specs.
>>>>
>>>> Disable the bridge window by executing a sequence of operations
>>>> borrowed from pci_disable_bridge_window() and pci_setup_bridge_io(),
>>>> that comply with the PCI specifications.
>>>>
>>>> Signed-off-by: Nirmal Patel <nirmal.patel@xxxxxxxxxxxxxxx>
>>>> ---
>>>> v3->v4: Following same operation as pci_setup_bridge_io.
>>>> v2->v3: Add more information to commit description.
>>>> v1->v2: Follow same chain of operation as pci_disable_bridge_window
>>>>         and update commit log.
>>>> ---
>>>>  drivers/pci/controller/vmd.c | 17 +++++++++++++++--
>>>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
>>>> index 769eedeb8802..ae5b4c1704e4 100644
>>>> --- a/drivers/pci/controller/vmd.c
>>>> +++ b/drivers/pci/controller/vmd.c
>>>> @@ -526,8 +526,21 @@ static void vmd_domain_reset(struct vmd_dev *vmd)
>>>>  				     PCI_CLASS_BRIDGE_PCI))
>>>>  					continue;
>>>>
>>>> -				memset_io(base + PCI_IO_BASE, 0,
>>>> -					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
>>>> +				/* Temporarily disable the I/O range before updating PCI_IO_BASE */
>>>> +				writel(0x0000ffff, base + PCI_IO_BASE_UPPER16);
>>>> +				/* Update lower 16 bits of I/O base/limit */
>>>> +				writew(0x00f0, base + PCI_IO_BASE);
>>>> +				/* Update upper 16 bits of I/O base/limit */
>>>> +				writel(0, base + PCI_IO_BASE_UPPER16);
>>>> +
>>>> +				/* MMIO Base/Limit */
>>>> +				writel(0x0000fff0, base + PCI_MEMORY_BASE);
>>>> +
>>>> +				/* Prefetchable MMIO Base/Limit */
>>>> +				writel(0, base + PCI_PREF_LIMIT_UPPER32);
>>>> +				writel(0x0000fff0, base + PCI_PREF_MEMORY_BASE);
>>>> +				writel(0xffffffff, base + PCI_PREF_BASE_UPPER32);
>>>> +				writeb(0, base + PCI_CAPABILITY_LIST);
>>> What's the purpose of this PCI_CAPABILITY_LIST write?  I guess you
>>> don't want to find PM, MSI, MSI-X, PCIe, etc. capabilities?
>>>
>>> It's been there since the v1 patch, but the commit log only mentions
>>> disabling bridge windows.
>> I added it since it was part of original memset_io range. However
>> from your previous comment, I checked the lspci output for
>> PCI_CAPABILITY_LIST with and without the change and it doesn't seem
>> to make any difference.
> Ah, I see.  My guess is that was a mistake in the original memset_io()
> because I don't see a reason to clear PCI_CAPABILITY_LIST.
>
> PCI_CAPABILITY_LIST is HwInit, so should be read-only in terms of
> config accesses, and if lspci sees the same capability list before and
> after writing a zero to it, it sounds like it *is* read-only.
>
> Bjorn

Yes, I will remove it. Thanks.