Thanks, Vitaly, Paolo for your replies!
I'll reply just to this message to avoid branching the conversation too much.

> -----Original Message-----
> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Sent: 05 October 2023 17:15
> To: Mancini, Riccardo <mancio@xxxxxxxxxx>; vkuznets@xxxxxxxxxx
> Cc: kvm@xxxxxxxxxxxxxxx; Graf (AWS), Alexander <graf@xxxxxxxxx>; Teragni,
> Matias <mteragni@xxxxxxxxxx>; Batalov, Eugene <bataloe@xxxxxxxxxx>
> Subject: RE: [EXTERNAL] Bug? Incompatible APF for 4.14 guest on 5.10 and
> later host
>
> On 10/5/23 17:08, Mancini, Riccardo wrote:
> > Hi,
> >
> > when a 4.14 guest runs on a 5.10 host (and later), it cannot use APF
> > (despite CPUID advertising KVM_FEATURE_ASYNC_PF) due to the new
> > interrupt-based mechanism 2635b5c4a0 (KVM: x86: interrupt based APF
> > 'page ready' event delivery).
> > Kernels after 5.9 won't satisfy the guest request to enable APF
> > through KVM_ASYNC_PF_ENABLED, requiring also
> > KVM_ASYNC_PF_DELIVERY_AS_INT to be set.
> > Furthermore, the patch set seems to be dropping parts of the legacy
> > #PF handling as well.
> > I consider this as a bug as it breaks APF compatibility for older
> > guests running on newer kernels, by breaking the underlying ABI.
> > What do you think? Was this a deliberate decision?
>
> Yes, this is intentional. It is not a breakage because the APF interface
> only tells how asynchronous page faults are delivered; it doesn't promise
> that they are actually delivered. However, I admit that the change was
> unfortunate. :(

Makes sense, thanks for the explanation.

>
> Apart from the concerns about reentrancy, there were two more issues with
> the old API:
>
> - the page-ready notification lacked an acknowledge mechanism if many
> pages became ready at the same time (see commit 557a961abbe0, "KVM: x86:
> acknowledgment mechanism for async pf page ready notifications"). This
> delayed the notifications of pages after the first. The new API uses
> MSR_KVM_ASYNC_PF_ACK to fix the problem.
>
> - the old API confused synchronous events (exceptions) with asynchronous
> events (interrupts); this created a unique case where a page fault was
> generated on a page that is not accessed by the instruction. (The new API
> only fixes half of this, because it also has a bogus CR2, but it's a bit
> better). It also meant that page-ready events were suppressed by disabled
> interrupts---but they were not necessarily injected when IF became 1,
> because KVM did not enable the interrupt window. This is solved
> automatically by just injecting an interrupt. On the theoretical side,
> it's also just ugly that page-ready events could only be enabled/disabled
> with CLI/STI and not APIC (TPR).
>
> > Was this already reported in the past (I couldn't find anything in the
> > mailing list but I might have missed it!)?
> > Would it be much effort to support the legacy #PF based mechanism for
> > older guests that choose to only set KVM_ASYNC_PF_ENABLED?
>
> It is not hard. However, I don't think we should accept such a patch
> upstream.

Regarding also Vitaly's comment about backporting the changes to 4.14,
I think supporting both modes in 5.10 (at least) might be the least-effort
path (fewer changes), at least to my naive untrained eye.
I tried playing around by partially reverting some of the changes to
handle both cases, but only got kernel panics in the guest so far, so I
might be missing something. However, I have absolutely no experience with
KVM code, so I wasn't expecting to get far in any case.

> I do have a question for you. Can you describe the context in which you
> are using APF, and would you be interested in ARM support? We (Red Hat,
> not me the maintainer :)) have been trying to understand for a long time
> if cloud providers use or need APF.

Keeping it short, we resume "remote" VM snapshots so page faults might be
very expensive on local cache misses.
We have a few optimizations to work around some of the issues, but even on
local hits there are still a lot of expensive page faults compared to a
normal VM use-case, I believe.
To be fair, I didn't even realise the benefits we were getting from APF
until it actually broke :)
It indeed plays a big role in keeping the resumption quick and efficient
in our use-case.
I didn't know that it wasn't available for ARM, as we don't use it at the
moment, but that would be interesting for the future.

Thanks,
Riccardo

>
> Paolo
>
> > The reason this is an issue for us now is that not having APF for
> > older guests introduces a significant performance regression on 4.14
> > guests when paired to uffd handling of "remote" page-faults (similar
> > to a live migration scenario) when we update from a 4.14 host kernel to
> > a 5.10 host kernel.