On Tue, Dec 05, 2023 at 04:22:33PM +0000, Catalin Marinas wrote: > Yeah, I made this argument in the past. But it's a fair question to ask > since the Arm world is different from x86. Just reusing an existing > driver in a different context may break its expectations. Does Normal NC > access complete by the time a TLBI (for Stage 2) and DSB (DVMsync) is > completed? It does reach some point of serialisation with subsequent > accesses to the same address but not sure how it is ordered with an > access to a different location like the config space used for reset. > Maybe it's not a problem at all or it is safe only for PCIe but it would > be good to get to the bottom of this. IMHO, the answer is you can't know architecturally. The specific vfio-platform driver must do an analysis of it's specific SOC and determine what exactly is required to order the reset. The primary purpose of the vfio-platform drivers is to provide this reset! In most cases I would expect some reads from the device to be required before the reset. > > Remember, the feedback we got from the CPU architects was that even > > DEVICE_* will experience an uncontained failure if the device tiggers > > an error response in shipping ARM IP. > > > > The reason PCIe is safe is because the PCI bridge does not generate > > errors in the first place! > > That's an argument to restrict this feature to PCIe. It's really about > fewer arguments on the behaviour of other devices. Marc did raise > another issue with the GIC VCPU interface (does this even have a vma in > the host VMM?). That's a class of devices where the mapping is > context-switched, so the TLBI+DSB rules don't help. I don't know anything about the GIC VCPU interface, to give any comment unfortunately. Since it seems there is something to fix here I would appreciate some background.. When you say it is context switched do you mean kvm does a register write on every vm entry to set the proper HW context for the vCPU? We are worrying that register write will possibly not order after NORMAL_NC? > > Thus, the way a platform device can actually be safe is if it too > > never generates errors in the first place! Obviously this approach > > works just as well with NORMAL_NC. > > > > If a platform device does generate errors then we shouldn't expect > > containment at all, and the memory type has no bearing on the > > safety. The correct answer is to block these platform devices from > > VFIO/KVM/etc because they can trigger uncontained failures. > > Assuming the error containment is sorted, there are two other issues > with other types of devices: > > 1. Ordering guarantees on reclaim or context switch Solved in VFIO > 2. Unaligned accesses > > On (2), I think PCIe is fairly clear on how the TLPs are generated, so I > wouldn't expect additional errors here. But I have no idea what AMBA/AXI > does here in general. Perhaps it's fine, I don't think we looked into it > as the focus was mostly on PCIe. I would expect AXI devices to throw errors in all sorts of odd cases. eg I would not be surprised at all to see carelessly built AXI devices error if ST64B is pointed at them. At least when I was building AXI logic years ago I was so lazy :P This is mostly my point - if the devices under vfio-platform were not designed to have contained errors then, IMHO, it is hard to believe that DEVICE_X/NORMAL_NC is the only issue in that HW. > So, I think it would be easier to get this patch upstream if we limit > the change to PCIe devices for now. We may relax this further in the > future. Do you actually have a need for non-PCIe devices to support WC > in the guest or it's more about the complexity of the logic to detect > whether it's actually a PCIe BAR we are mapping into the guest? (I can > see some Arm GPU folk asking for this but those devices are not easily > virtualisable). The complexity is my concern, and the disruption to the ecosystem with some of the ideas given. If there was a trivial way to convey in the VMA that it is safe then sure, no objection from me. My worry is this has turned from fixing a real problem we have today into a debate about theoretical issues that nobody may care about that are very disruptive to solve. I would turn it around and ask we find a way to restrict platform devices when someone comes with a platform device that wants to use secure kvm and has a single well defined HW problem that is solved by this work. What if we change vfio-pci to use pgprot_device() like it already really should and say the pgprot_noncached() is enforced as DEVICE_nGnRnE and pgprot_device() may be DEVICE_nGnRE or NORMAL_NC? Would that be acceptable? Jason