On 10/5/23 17:08, Mancini, Riccardo wrote:
Hi, when a 4.14 guest runs on a 5.10 (or later) host, it cannot use APF, despite CPUID advertising KVM_FEATURE_ASYNC_PF, because of the new interrupt-based mechanism introduced by commit 2635b5c4a0 ("KVM: x86: interrupt based APF 'page ready' event delivery"). Kernels after 5.9 won't satisfy a guest request to enable APF through KVM_ASYNC_PF_ENABLED alone; they also require KVM_ASYNC_PF_DELIVERY_AS_INT to be set. Furthermore, the patch set seems to drop parts of the legacy #PF handling as well. I consider this a bug, as it breaks APF compatibility for older guests running on newer kernels by breaking the underlying ABI. What do you think? Was this a deliberate decision?
Yes, this is intentional. It is not a breakage because the APF interface only tells how asynchronous page faults are delivered; it doesn't promise that they are actually delivered. However, I admit that the change was unfortunate.
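For readers following along, the enable condition described in the question can be sketched as below. This is a minimal userspace model, not kernel code: the two helper functions are hypothetical names introduced for illustration, and only the flag names and bit positions are taken from the kernel's uapi kvm_para.h. It contrasts a host that accepts KVM_ASYNC_PF_ENABLED alone with one that also requires KVM_ASYNC_PF_DELIVERY_AS_INT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the APF enable MSR value; bit positions as in the
 * kernel's uapi kvm_para.h. */
#define KVM_ASYNC_PF_ENABLED         (1ULL << 0)
#define KVM_ASYNC_PF_DELIVERY_AS_INT (1ULL << 3)

/* Hypothetical helper: a legacy host honours the enable bit alone. */
static bool apf_enabled_legacy_host(uint64_t msr_en_val)
{
    return (msr_en_val & KVM_ASYNC_PF_ENABLED) != 0;
}

/* Hypothetical helper: a newer host treats APF as enabled only when
 * both the enable bit and the interrupt-delivery bit are set. */
static bool apf_enabled_new_host(uint64_t msr_en_val)
{
    const uint64_t mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT;

    return (msr_en_val & mask) == mask;
}
```

A 4.14 guest that only knows about KVM_ASYNC_PF_ENABLED passes the first check but not the second, which matches the behaviour reported above.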
Apart from the concerns about reentrancy, there were two more issues with the old API:
- the page-ready notification lacked an acknowledgment mechanism for the case where many pages became ready at the same time (see commit 557a961abbe0, "KVM: x86: acknowledgment mechanism for async pf page ready notifications"). This delayed the notifications for every page after the first. The new API uses MSR_KVM_ASYNC_PF_ACK to fix the problem.
- the old API confused synchronous events (exceptions) with asynchronous events (interrupts); this created a unique case where a page fault was generated on a page that is not accessed by the instruction. (The new API only fixes half of this, because it also has a bogus CR2, but it's a bit better). It also meant that page-ready events were suppressed while interrupts were disabled, yet they were not necessarily injected when IF became 1, because KVM did not enable the interrupt window. This is solved automatically by just injecting an interrupt. On the theoretical side, it's also just ugly that page-ready events could only be enabled/disabled with CLI/STI and not through the APIC (TPR).
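The acknowledgment flow from the first point can be illustrated with a toy model. This is only a sketch under stated assumptions: the struct and function names are hypothetical, the shared slot stands in for the token field the guest reads, and calling host_try_deliver() from the guest helper models the effect of the guest writing to MSR_KVM_ASYNC_PF_ACK (the host re-checking for further ready pages). It is not how KVM is implemented.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* Toy model only; every name in this block is hypothetical. */
struct apf_model {
    uint32_t pending[MAX_PENDING]; /* host side: tokens of ready pages */
    size_t head, count;            /* ring buffer state for pending[] */
    uint32_t slot_token;           /* shared slot; 0 means "free" */
    unsigned interrupts_injected;  /* page-ready interrupts sent so far */
};

/* Host: record that the page identified by token became ready. */
static int host_queue_ready(struct apf_model *m, uint32_t token)
{
    if (token == 0 || m->count == MAX_PENDING)
        return -1;
    m->pending[(m->head + m->count) % MAX_PENDING] = token;
    m->count++;
    return 0;
}

/* Host: deliver one notification, but only if the slot is free.
 * Returns 1 if a notification was delivered, 0 otherwise. */
static int host_try_deliver(struct apf_model *m)
{
    if (m->slot_token != 0 || m->count == 0)
        return 0;
    m->slot_token = m->pending[m->head];
    m->head = (m->head + 1) % MAX_PENDING;
    m->count--;
    m->interrupts_injected++;
    return 1;
}

/* Guest: consume the token, free the slot, then acknowledge.
 * The acknowledgment lets the host deliver the next pending token
 * right away instead of waiting for an unrelated event. */
static uint32_t guest_handle_and_ack(struct apf_model *m)
{
    uint32_t token = m->slot_token;

    m->slot_token = 0;
    host_try_deliver(m); /* models the host reacting to the ack write */
    return token;
}
```

Without the final host_try_deliver() call in the guest helper, the second and later tokens would sit in the queue until something else prompted the host to look again, which is the delay described above.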
Was this already reported in the past (I couldn't find anything in the mailing list but I might have missed it!)? Would it be much effort to support the legacy #PF based mechanism for older guests that choose to only set KVM_ASYNC_PF_ENABLED?
It is not hard. However, I don't think we should accept such a patch upstream.
I do have a question for you. Can you describe the context in which you are using APF, and would you be interested in ARM support? We (Red Hat, not me the maintainer :)) have been trying to understand for a long time if cloud providers use or need APF.
Paolo
The reason this is an issue for us now is that not having APF for older guests introduces a significant performance regression on 4.14 guests when paired with uffd handling of "remote" page faults (similar to a live migration scenario) when we update from a 4.14 host kernel to a 5.10 host kernel.