Excerpts from Aneesh Kumar K.V's message of January 19, 2021 7:45 pm:
> Nicholas Piggin <npiggin@xxxxxxxxx> writes:
> 
>> As explained in the comment, there is no need to flush TLBs on all
>> threads in a core when a vcpu moves between threads in the same core.
>>
>> Thread migrations can be a significant proportion of vcpu migrations,
>> so this can help reduce the TLB flushing and IPI traffic.
>>
>> Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
>> ---
>> I believe we can do this and have the TLB coherency correct as per
>> the architecture, but would appreciate someone else verifying my
>> thinking.
>>
>> Thanks,
>> Nick
>>
>>  arch/powerpc/kvm/book3s_hv.c | 28 ++++++++++++++++++++++++++--
>>  1 file changed, 26 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 752daf43f780..53d0cbfe5933 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -2584,7 +2584,7 @@ static void kvmppc_release_hwthread(int cpu)
>>  	tpaca->kvm_hstate.kvm_split_mode = NULL;
>>  }
>>  
>> -static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
>> +static void radix_flush_cpu(struct kvm *kvm, int cpu, bool core, struct kvm_vcpu *vcpu)
>>  {
> 
> Can we rename 'core' to something like 'core_sched' or 'within_core'
> 
>>  	struct kvm_nested_guest *nested = vcpu->arch.nested;
>>  	cpumask_t *cpu_in_guest;
>> @@ -2599,6 +2599,14 @@ static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
>>  		cpu_in_guest = &kvm->arch.cpu_in_guest;
>>  	}
>>  
> 
> and do
> if (core_sched) {

This function is called to flush guest TLBs on this cpu / all threads on
this cpu core. I don't think it helps to bring any "why" logic into it
because the caller has already dealt with that.

Thanks,
Nick

> 
>> +	if (!core) {
>> +		cpumask_set_cpu(cpu, need_tlb_flush);
>> +		smp_mb();
>> +		if (cpumask_test_cpu(cpu, cpu_in_guest))
>> +			smp_call_function_single(cpu, do_nothing, NULL, 1);
>> +		return;
>> +	}
>> +
>>  	cpu = cpu_first_thread_sibling(cpu);
>>  	for (i = 0; i < threads_per_core; ++i)
>>  		cpumask_set_cpu(cpu + i, need_tlb_flush);
>> @@ -2655,7 +2663,23 @@ static void kvmppc_prepare_radix_vcpu(struct kvm_vcpu *vcpu, int pcpu)
>>  		if (prev_cpu < 0)
>>  			return; /* first run */
>>  
>> -		radix_flush_cpu(kvm, prev_cpu, vcpu);
>> +		/*
>> +		 * If changing cores, all threads on the old core should
>> +		 * flush, because TLBs can be shared between threads. More
>> +		 * precisely, the thread we previously ran on should be
>> +		 * flushed, and the thread to first run a vcpu on the old
>> +		 * core should flush, but we don't keep enough information
>> +		 * around to track that, so we flush all.
>> +		 *
>> +		 * If changing threads in the same core, only the old thread
>> +		 * need be flushed.
>> +		 */
>> +		if (cpu_first_thread_sibling(prev_cpu) !=
>> +		    cpu_first_thread_sibling(pcpu))
>> +			radix_flush_cpu(kvm, prev_cpu, true, vcpu);
>> +		else
>> +			radix_flush_cpu(kvm, prev_cpu, false, vcpu);
>> +
>>  	}
>>  }
>> 
>> -- 
>> 2.23.0
> 
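
For readers following along, here is a minimal stand-alone sketch of the
decision the patch adds in kvmppc_prepare_radix_vcpu(): a whole-core flush is
only needed when the vcpu crosses cores, otherwise only the old thread is
flushed. This is ordinary user-space C, not kernel code; THREADS_PER_CORE,
first_thread_sibling() and need_core_flush() are invented here purely to
model cpu_first_thread_sibling() and the new 'core' argument, and the SMT
width of 8 is an assumption for illustration (the kernel derives
threads_per_core from the platform).

#include <stdbool.h>
#include <stdio.h>

/* Assumed SMT width; power of two so masking models cpu_first_thread_sibling(). */
#define THREADS_PER_CORE 8

/* First hardware thread of the core containing 'cpu'. */
static int first_thread_sibling(int cpu)
{
	return cpu & ~(THREADS_PER_CORE - 1);
}

/*
 * True when the vcpu moved to a different core, so every thread of the
 * old core should be flushed; false for a move between threads of the
 * same core, where only the old thread need be flushed.
 */
static bool need_core_flush(int prev_cpu, int new_cpu)
{
	return first_thread_sibling(prev_cpu) != first_thread_sibling(new_cpu);
}

int main(void)
{
	/* cpu 2 -> 5 stays within core 0: prints 0 (thread-only flush). */
	printf("cpu 2 -> 5: core flush? %d\n", need_core_flush(2, 5));
	/* cpu 2 -> 9 crosses into core 1: prints 1 (whole-core flush). */
	printf("cpu 2 -> 9: core flush? %d\n", need_core_flush(2, 9));
	return 0;
}

Compile and run with e.g. "cc sketch.c && ./a.out". The point of the model is
only the caller-side policy Nick describes above: the scope decision lives at
the call site, and radix_flush_cpu() just flushes the thread or the core it
is told to.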