Re: Bug? Incompatible APF for 4.14 guest on 5.10 and later host

On 10/6/23 03:24, Mancini, Riccardo wrote:
>> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Sent: 05 October 2023 17:15
>>
>> [...]
>>
>> I do have a question for you.  Can you describe the context in which you
>> are using APF, and would you be interested in ARM support?  We (Red Hat,
>> not me the maintainer :)) have been trying to understand for a long time
>> if cloud providers use or need APF.
>
> Keeping it short, we resume "remote" VM snapshots so page faults might
> be very expensive on local cache misses. We have a few optimizations to work
> around some of the issues, but even on local hits there are still a lot
> of expensive page faults compared to a normal VM use-case, I believe.
> To be fair, I didn't even realise the benefits we were getting from APF
> until it actually broke :)
> It indeed plays a big role in keeping the resumption quick and efficient
> in our use-case.
> I didn't know that it wasn't available for ARM, as we don't use it at
> the moment, but that would be interesting for the future.


Adding Marc, Oliver and kvmarm@xxxxxxxxxxxxxxx

I tried to bring the feature to ARM64 a long time ago, but the effort was
discontinued because the main concern was that no users were asking for it [1].
It's definitely exciting news that it's an important feature for AWS. I guess
this is another chance to re-evaluate the feature for ARM64?

[1] https://lore.kernel.org/kvmarm/87iloq2oke.wl-maz@xxxxxxxxxx/

Async PF needs two signals sent from host to guest, and SDEI (Software
Delegated Exception Interface) is leveraged for that. So there were two series:
one to support SDEI virtualization [2] and one for Async PF on ARM64 [3].

[2] https://lore.kernel.org/kvmarm/20220527080253.1562538-1-gshan@xxxxxxxxxx/
[3] https://lore.kernel.org/kvmarm/20210815005947.83699-1-gshan@xxxxxxxxxx/

I have several questions for Mancini; the answers would help me understand the
situation better.

- The VM snapshot is stored somewhere remote, which means a page fault on
  instruction fetch becomes expensive. Do we have benchmarks showing how
  much benefit Async PF brings on x86 in the AWS environment?

- I'm wondering whether the data can be fetched from somewhere remote in the
  AWS environment?

- The data can be stored in local DRAM or in swap space, and a page fault
  to fetch data becomes expensive if the data is in swap space. I'm not
  sure whether the data can reside in swap space in the AWS environment?
  Note that the swap space, which is backed by disk, could itself be
  located somewhere remote.

Thanks,
Gavin




