On Thu, Dec 21, 2023 at 01:19:18PM +0000, Catalin Marinas wrote:

[...]

> > Apologies, I didn't mean to question what's going on here from the
> > hardware POV. My concern was more from the kernel + user interfaces POV,
> > this all seems to work (specifically for PCI) by maintaining an
> > intentional mismatch between the VFIO stage-1 and KVM stage-2 mappings.
>
> If you stare at it long enough, the mismatch starts to look fine ;).
> Even if you have the VFIO stage 1 Normal NC, KVM stage 2 Normal NC, you
> can still have the guest setting stage 1 to Device and introduce an
> architectural mismatch. These aliases have some bad reputation but the
> behaviour is constrained architecturally.
>
> IMHO we should move on from this attribute mismatch since we can't fully
> solve it anyway and focus instead on what the device, system can
> tolerate, who's responsible for deciding which MMIO ranges can be mapped
> as Normal NC.

Fair enough :)

The other slightly unsavory part is that we're baking the mapping policy
into KVM. I'd prefer it if this policy were kept in userspace somehow,
but there's no actual usecase for userspace selecting memory attributes
at this point.

> If we really want to avoid any aliases (though I think we are spending
> too many cycles on something that's not a real issue), the only way is
> to have fd-based mappings in KVM so that there's no VMM alias. After
> that we need to choose between (2) and (3) since the VMM may no longer
> be able to probe the device and figure out which ranges need what
> attributes.

These are the sorts of things I was more worried about. I completely
agree that the patches are fine for relaxing the 'simple' PCIe use
cases, I just don't want to establish the precedent that the kernel/KVM
will be on the hook to work out more complex use cases that may require
the composition of various mappings.

But I'm happy to table that discussion until the usecase arises :)

-- 
Thanks,
Oliver