On Tue, Jan 17, 2023 at 05:50:05PM +0800, Huang Adrian wrote:
> On Wed, Jan 11, 2023 at 11:58 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > When memset_io() clears prefetchable base 32 bits register, the
> > > prefetchable memory becomes 0000000000000000-575000fffff, which
> > > is enabled. This behavior (accidental enablement of window)
> > > causes that config accesses get routed to the wrong place, and
> > > the access content of PCI configuration space of VMD root ports
> > > is 0xff after invoking memset_io() in vmd_domain_reset():
> >
> > I was thinking the problem was only between clearing
> > PCI_PREF_MEMORY_BASE and PCI_PREF_BASE_UPPER32, but that would be
> > a pretty small window, and you're seeing a lot of config accesses
> > going wrong.  Why is that?  Is there enumeration that races with
> > this domain reset?
>
> Well, I didn't see the races. The problem is that: memset_io() uses
> enhanced REP STOSB, fast-string operation or legacy method (see
> arch/x86/lib/memset_64.S) to *sequentially* clear the memory
> location from lower memory location to higher one.

Obviously we can't *ever* clear both PCI_PREF_MEMORY_BASE and
PCI_PREF_BASE_UPPER32 atomically, whether it's via memset_io(),
writel(), or whatever.  I understand that.

> When clearing at PCI_PREF_BASE_UPPER32, the prefetchable memory
> window is accidentally enabled. The subsequent accesses (each read
> returns 0xff, and each write does not take any effect) cannot be
> made correctly. In this case, clearing at PCI_PREF_LIMIT_UPPER32
> cannot take any effect. So, we're unable to configure VMD devices
> anymore for subsequent writes.

I understand the mechanism that temporarily enables the window.  But
I don't understand the part about "clearing PCI_PREF_LIMIT_UPPER32
*cannot* take any effect."  Is it impossible to clear
PCI_PREF_LIMIT_UPPER32 while the window is enabled?  Given the normal
PCI rules, I don't understand why that would be.
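
For reference, here is a minimal sketch of how I'd expect a bridge to
decode the prefetchable window from the four registers, following the
usual Type 1 header rules.  This is illustration only, not code from the
VMD driver: struct pref_regs and the pref_*() helpers are made-up names,
and the only concrete value taken from this thread is the
0000000000000000-575000fffff range quoted above.

```c
#include <stdint.h>

/* Hypothetical snapshot of the four prefetchable window registers. */
struct pref_regs {
	uint16_t base_lo;	/* PCI_PREF_MEMORY_BASE,   offset 0x24 */
	uint16_t limit_lo;	/* PCI_PREF_MEMORY_LIMIT,  offset 0x26 */
	uint32_t base_hi;	/* PCI_PREF_BASE_UPPER32,  offset 0x28 */
	uint32_t limit_hi;	/* PCI_PREF_LIMIT_UPPER32, offset 0x2c */
};

/* Bits 15:4 of the 16-bit registers hold address bits 31:20. */
#define PREF_RANGE_MASK	0xfff0u

/* Window base: low 20 address bits are implicitly zero. */
static uint64_t pref_base(const struct pref_regs *r)
{
	return ((uint64_t)r->base_hi << 32) |
	       ((uint64_t)(r->base_lo & PREF_RANGE_MASK) << 16);
}

/* Window limit: low 20 address bits are implicitly all ones. */
static uint64_t pref_limit(const struct pref_regs *r)
{
	return ((uint64_t)r->limit_hi << 32) |
	       ((uint64_t)(r->limit_lo & PREF_RANGE_MASK) << 16) |
	       0xfffffULL;
}

/* The bridge forwards through the window only when base <= limit. */
static int pref_enabled(const struct pref_regs *r)
{
	return pref_base(r) <= pref_limit(r);
}
```

With base_lo, limit_lo, and base_hi already cleared and limit_hi still
holding 0x575, this decodes to base 0x0 and limit 0x575000fffff, i.e.,
the enabled window reported above.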

This sequence is problematic because the window is accidentally
enabled:

  1) clear PCI_PREF_MEMORY_BASE
  2) <window is enabled here>
  3) clear PCI_PREF_BASE_UPPER32

and the following sequence works as desired:

  clear PCI_PREF_BASE_UPPER32
  clear PCI_PREF_MEMORY_BASE

The interval between 1) and 3) above should be short: there are only
a few config writes between them.  But you're seeing DMAR VT-d config
reads that fail.  Why are those happening at the same time as VMD
enumeration?  And apparently you can also run lspci and see *those*
config reads fail.

There has to be more going on here than a window that is accidentally
enabled for a few milliseconds.  *That* is my question.

Bjorn