Christoph Lameter (Ampere) <cl@xxxxxxxxx> writes:

> On Wed, 22 Nov 2023, Mihai Carabas wrote:
>
>> On 22.11.2023 22:51, Christoph Lameter wrote:
>>> On Mon, 20 Nov 2023, Mihai Carabas wrote:
>>>
>>>> cpu_relax on ARM64 does a simple "yield". Thus we replace it with
>>>> smp_cond_load_relaxed which basically does a "wfe".
>>> Well it clears events first (which requires the first WFE) and then does a
>>> WFE waiting for any events if no events were pending.
>>> WFE does not cause a VMEXIT? Or does the inner loop of
>>> smp_cond_load_relaxed now do 2x VMEXITS?
>>> KVM ARM64 code seems to indicate that WFE causes a VMEXIT. See
>>> kvm_handle_wfx().
>>
>> In KVM ARM64 the WFE trapping is dynamic: it is enabled only if there are more
>> tasks waiting on the same core (e.g. on an oversubscribed system).
>>
>> In arch/arm64/kvm/arm.c:
>>
>>	if (single_task_running())
>>		vcpu_clear_wfx_traps(vcpu);
>>	else
>>		vcpu_set_wfx_traps(vcpu);
>
> Ahh. Cool, did not know about that. But still: Lots of VMEXITs once the load has
> to be shared.

Yeah, anytime there's more than one runnable process.

Another, more critical place where we will vmexit is the qspinlock
slowpath which uses smp_cond_load.

>> This of course can be improved by having a knob where you can completely
>> disable wfx trapping according to your needs, but I left this as another
>> subject to tackle.

Probably needs to be adaptive since we use WFE in error paths as well
(for instance to park the CPU.)

Ankur
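
For reference, the dynamic trapping described above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the KVM code: `struct vcpu_model`, `update_wfx_traps()` and `max_wfe_exits()` are hypothetical names, and `nr_running` stands in for what `single_task_running()` checks in the kernel. The exit count is an upper bound, since a WFE that completes immediately on a pending event need not trap.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vCPU trap state; not the kernel's struct kvm_vcpu. */
struct vcpu_model {
	bool wfx_traps_enabled;	/* true: a guest WFE/WFI may exit to the host */
};

/*
 * Models the check quoted from arch/arm64/kvm/arm.c.
 * nr_running == 1 plays the role of single_task_running() (assumption).
 */
static void update_wfx_traps(struct vcpu_model *vcpu, unsigned int nr_running)
{
	if (nr_running == 1)
		vcpu->wfx_traps_enabled = false;	/* vcpu_clear_wfx_traps() */
	else
		vcpu->wfx_traps_enabled = true;		/* vcpu_set_wfx_traps() */
}

/*
 * Upper bound on the exits a guest polling loop that issues `nr_wfe`
 * WFE instructions could take in this model: none while traps are
 * clear, at most one per WFE once they are set.
 */
static unsigned long max_wfe_exits(const struct vcpu_model *vcpu,
				   unsigned long nr_wfe)
{
	return vcpu->wfx_traps_enabled ? nr_wfe : 0;
}
```

With a single runnable task the model reports zero exits for the polling loop; as soon as a second task is runnable, every WFE in the loop becomes a potential exit, which is the cost being discussed for the oversubscribed case.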