"Mancini, Riccardo" <mancio@xxxxxxxxxx> writes: > Hi, > > when a 4.14 guest runs on a 5.10 host (and later), it cannot use APF (despite > CPUID advertising KVM_FEATURE_ASYNC_PF) due to the new interrupt-based > mechanism 2635b5c4a0 (KVM: x86: interrupt based APF 'page ready' event delivery). > Kernels after 5.9 won't satisfy the guest request to enable APF through > KVM_ASYNC_PF_ENABLED, requiring also KVM_ASYNC_PF_DELIVERY_AS_INT to be set. > Furthermore, the patch set seems to be dropping parts of the legacy #PF handling > as well. > I consider this as a bug as it breaks APF compatibility for older guests running > on newer kernels, by breaking the underlying ABI. > What do you think? Was this a deliberate decision? It was. #PF based "page ready" injection was found to be fragile as in some cases it can collide with an actual #PF and nothing good is expected if this ever happens. I don't think we've actually broken the ABI as "asynchronous page fault" was always a "best effort" service: the guest indicates its readiness to process 'page missing' events but the host is under no obligation to actually send such notifications. > Was this already reported in the past (I couldn't find anything in the mailing list > but I might have missed it!)? I think it was Andy Lutomirski who started the discussion, see e.g. https://lore.kernel.org/lkml/ed71d0967113a35f670a9625a058b8e6e0b2f104.1583547991.git.luto@xxxxxxxxxx/ the patch is about KVM_ASYNC_PF_SEND_ALWAYS but if you go down the discussion you'll find more concerns expressed. > Would it be much effort to support the legacy #PF based mechanism for older > guests that choose to only set KVM_ASYNC_PF_ENABLED? Personally, I wouldn't go down this road: #PF injection at random time (for page-ready events) is still considered being fragile. 

>
> The reason this is an issue for us now is that not having APF for older guests
> introduces a significant performance regression on 4.14 guests when paired to
> uffd handling of "remote" page-faults (similar to a live migration scenario)
> when we update from a 4.14 host kernel to a 5.10 host kernel.

What about backporting interrupt-based APF mechanism to older guests?

--
Vitaly