On Wed, Feb 02, 2022, Peter Zijlstra wrote: > On Mon, Jan 17, 2022 at 01:37:22PM +0800, Li RongQing wrote: > > After support paravirtualized TLB shootdowns, steal_time.preempted > > includes not only KVM_VCPU_PREEMPTED, but also KVM_VCPU_FLUSH_TLB > > > > and kvm_vcpu_is_preempted should test only with KVM_VCPU_PREEMPTED > > This still fails to actually explain what the problem is, why did you > write this patch? Ya, definitely is lacking details. I think this captures everything... Tweak the assembly code for detecting KVM_VCPU_PREEMPTED to future proof it against new features, code refactorings, and theoretical compiler behavior. Explicitly test only the KVM_VCPU_PREEMPTED flag; steal_time.preempted has already been overloaded once for KVM_VCPU_FLUSH_TLB, and checking the entire byte for a non-zero value could yield a false positive. This currently isn't problematic as PREEMPTED and FLUSH_TLB are mutually exclusive, but that may not hold true for future flags. Use AND instead of TEST for querying PREEMPTED to clear RAX[63:8] before returning to avoid a potential false postive in the caller due to leaving the address (non-zero value) in the upper bits. Compilers are technically allowed to use more than a byte for storing _Bool, and it would be all too easy for someone to refactor the return type to something larger. Keep the SETcc (but change it to setnz for sanity's sake) as the fact that KVM_VCPU_PREEMPTED happens to be bit 0, i.e. the AND will yield 0/1 as needed for _Bool, is pure coincidence. > > Co-developed-by: Sean Christopherson <seanjc@xxxxxxxxxx> > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx> > > --- > > diff with v2: using andl and setnz > > diff with v1: clear 64bit rax > > > > arch/x86/kernel/kvm.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > > index b061d17..fe0aead 100644 > > --- a/arch/x86/kernel/kvm.c > > +++ b/arch/x86/kernel/kvm.c > > @@ -1025,8 +1025,9 @@ asm( > > ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" > > "__raw_callee_save___kvm_vcpu_is_preempted:" > > "movq __per_cpu_offset(,%rdi,8), %rax;" > > -"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);" > > -"setne %al;" > > +"movb " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax), %al;" > > +"andl $" __stringify(KVM_VCPU_PREEMPTED) ", %eax;" > > +"setnz %al;" > > Isn't the below the simpler way of writing that same? The AND is needed to clear RAX[63:8], e.g. to avoid breakage if the return value is changed from a bool to something larger, or the compiler uses more than a byte for _Bool. > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index a438217cbfac..bc79adcf59ff 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ -1025,7 +1025,7 @@ asm( > ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" > "__raw_callee_save___kvm_vcpu_is_preempted:" > "movq __per_cpu_offset(,%rdi,8), %rax;" > -"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);" > +"testb $" __stringify(KVM_VCPU_PREEMPTED) ", " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);" > "setne %al;" > "ret;" > ".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;"