On Thu, Mar 04, 2021, Xu, Like wrote:
> Hi Sean,
>
> Thanks for your detailed review on the patch set.
>
> On 2021/3/4 0:58, Sean Christopherson wrote:
> > On Wed, Mar 03, 2021, Like Xu wrote:
> > > @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
> > >  	return true;
> > >  }
> > > +/*
> > > + * Check if the requested depth values is supported
> > > + * based on the bits [0:7] of the guest cpuid.1c.eax.
> > > + */
> > > +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
> > > +{
> > > +	struct kvm_cpuid_entry2 *best;
> > > +
> > > +	best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
> > > +	if (best && depth && !(depth % 8))
> >
> > This is still wrong, it fails to weed out depth > 64.
>
> How come ? The testcases depth = {65, 127, 128} get #GP as expected.

@depth is a u64, throw in a number that is a multiple of 8 and >= 520, and the
"(1ULL << (depth / 8 - 1))" will trigger undefined behavior due to shifting
beyond the capacity of a ULL / u64.

Adding the "< 64" check would also allow dropping the " & 0xff" since the check
would ensure the shift doesn't go beyond bit 7.  I'm not sure the cleverness is
worth shaving a cycle, though.

> > Not that this is a hot path, but it's probably worth double checking that the
> > compiler generates simple code for "depth % 8", e.g. it can be "depth & 7)".
>
> Emm, the "%" operation is quite normal over kernel code.

So is "&" :-)  I was just pointing out that the compiler should optimize this,
and it did.

>         if (best && depth && !(depth % 8))
>    10659:       48 85 c0                test   rax,rax
>    1065c:       74 c7                   je     10625 <intel_pmu_set_msr+0x65>
>    1065e:       4d 85 e4                test   r12,r12
>    10661:       74 c2                   je     10625 <intel_pmu_set_msr+0x65>
>    10663:       41 f6 c4 07             test   r12b,0x7
>    10667:       75 bc                   jne    10625 <intel_pmu_set_msr+0x65>
>
> It looks like the compiler does the right thing.
> Do you see the room for optimization ?
>
> >
> > > +		return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

Actually, looking at this again, I would explicitly use BIT() instead of 1ULL
(or BIT_ULL), since the shift must be 7 or less.

> > > +
> > > +	return false;
> > > +}
> > > +
>
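
To make the range concern concrete, here is a rough userspace sketch of the
check with an explicit upper bound, so the shift count can never exceed 7.
It is only an illustration, not the patch itself: the function name, the
cpuid_1c_eax parameter (standing in for best->eax), the local BIT() define
and the "depth > 64" bound are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro (assumption for this sketch). */
#define BIT(n) (1UL << (n))

/*
 * Hypothetical model of the depth check: cpuid_1c_eax plays the role of
 * best->eax. Bit n of EAX[7:0] set means a depth of 8 * (n + 1) is supported.
 */
bool arch_lbr_depth_is_valid_sketch(uint32_t cpuid_1c_eax, uint64_t depth)
{
	/* Reject 0, anything above 64, and non-multiples of 8 before shifting. */
	if (!depth || depth > 64 || (depth % 8))
		return false;

	/* depth is now in {8, 16, ..., 64}, so the shift count is 0..7. */
	return (cpuid_1c_eax & BIT(depth / 8 - 1)) != 0;
}

/* A few spot checks; 0x07 advertises depths 8, 16 and 24 only. */
void arch_lbr_depth_self_check(void)
{
	assert(arch_lbr_depth_is_valid_sketch(0x07, 8));
	assert(arch_lbr_depth_is_valid_sketch(0x07, 24));
	assert(!arch_lbr_depth_is_valid_sketch(0x07, 32));
	assert(!arch_lbr_depth_is_valid_sketch(0xff, 65));
	assert(!arch_lbr_depth_is_valid_sketch(0xff, 520));
}
```

With the bound in place, a depth of 520 is rejected before any shift happens,
and the " & 0xff" mask is no longer needed because only bits 0..7 can ever be
tested.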