Re: [PATCH v3 1/5] kvm: x86/pmu: Introduce masked events to the pmu event filter

On Thu, Aug 4, 2022 at 4:26 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Sat, Jul 09, 2022, Aaron Lewis wrote:
> > diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
> > index fdfd8e06fee6..016713b583bf 100644
> > --- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
> > +++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
> > @@ -24,6 +24,7 @@ KVM_X86_PMU_OP(set_msr)
> >  KVM_X86_PMU_OP(refresh)
> >  KVM_X86_PMU_OP(init)
> >  KVM_X86_PMU_OP(reset)
> > +KVM_X86_PMU_OP(get_event_mask)
> >  KVM_X86_PMU_OP_OPTIONAL(deliver_pmi)
> >  KVM_X86_PMU_OP_OPTIONAL(cleanup)
> >
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index 21614807a2cb..2964f3f15fb5 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -522,6 +522,14 @@ struct kvm_pmu_event_filter {
> >  #define KVM_PMU_EVENT_ALLOW 0
> >  #define KVM_PMU_EVENT_DENY 1
> >
> > +#define KVM_PMU_EVENT_FLAG_MASKED_EVENTS (1u << 0)
>
> Is this a "flag" or a "type"?  The usage in code isn't really clear, e.g. there's
> both
>
>         if (filter->flags & KVM_PMU_EVENT_FLAG_MASKED_EVENTS)
>
> and
>
>         if (flag == KVM_PMU_EVENT_FLAG_MASKED_EVENTS)

That should be "flag & KVM_PMU_EVENT_FLAG_MASKED_EVENTS".  I'll fix.

>
> > +#define KVM_PMU_EVENT_ENCODE_MASKED_EVENT(select, mask, match, invert) \
> > +             (((select) & 0xfful) | (((select) & 0xf00ul) << 24) | \
> > +             (((mask) & 0xfful) << 24) | \
> > +             (((match) & 0xfful) << 8) | \
> > +             (((invert) & 0x1ul) << 23))
>
> Please add a comment showing the actual layout, this is extremely dense to parse.
> Alternatively, and probably my preference, would be to define this as a union.
> Or maybe both?  E.g. so that userspace can do:
>
>         filter[i].raw = KVM_PMU_EVENT_ENCODE_MASKED_EVENT(...);
>
>         struct kvm_pmu_event_filter_entry {
>                 union {
>                         __u64 raw;
>                         struct {
>                                 __u64 select_lo:8;
>                                 __u64 match:8;
>                                 __u64 rsvd1:7;
>                                 __u64 invert:1;
>                                 __u64 mask:8;
>                                 __u64 select_hi:4;
>                                 __u64 rsvd2:28;
>                         };
>                 }
>         }

Agreed.  I'll add it.  Also, I like the idea of having both the union
and the helper.  They both seem useful.
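
For the archives, here's roughly what I have in mind (field positions
mirror the v3 encode macro; names subject to change):

        /*
         * Masked event layout:
         *
         * Bits    Field
         * ----    -----
         * 7:0     event select (low bits)
         * 15:8    unit mask match
         * 22:16   reserved
         * 23      invert
         * 31:24   unit mask mask
         * 35:32   event select (high bits)
         * 63:36   reserved
         */
        struct kvm_pmu_event_filter_entry {
                union {
                        __u64 raw;
                        struct {
                                __u64 select_lo:8;
                                __u64 match:8;
                                __u64 rsvd1:7;
                                __u64 invert:1;
                                __u64 mask:8;
                                __u64 select_hi:4;
                                __u64 rsvd2:28;
                        };
                };
        };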

>
> And that begs the question of whether or not KVM should check those "rsvd" fields.
> IIUC, this layout directly correlates to hardware, and hardware defines many of the
> "rsvd1" bits.  Are those just a don't care?

KVM really should care about the "rsvd" bits for filter events.  I
added a check for them in kvm_vm_ioctl_set_pmu_event_filter() and
documented it in the API docs: if a "rsvd" bit is set, attempting to
set the filter returns -EINVAL.  This ensures that on Intel the
'select_hi' bits are clear, which is important because it lets us use
AMD's event select mask (AMD64_EVENTSEL_EVENT) in the compare function
when sorting or searching masked events on either platform, instead of
having to special-case the compare function per vendor.  I also like
validating the filter events in general; it makes userspace's data
more predictable when we use it in the kernel.
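
To sketch the check (MASKED_EVENT_RSVD_BITS is a hypothetical name for
the union of the rsvd1/rsvd2 fields above, i.e. bits 22:16 and 63:36):

        if (filter->flags & KVM_PMU_EVENT_FLAG_MASKED_EVENTS) {
                for (i = 0; i < filter->nevents; i++) {
                        /* Reject masked events with reserved bits set. */
                        if (filter->events[i] & MASKED_EVENT_RSVD_BITS)
                                return -EINVAL;
                }
        }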

I just noticed an issue: before doing the bsearch() the platform
specific mask, i.e. static_call(kvm_x86_pmu_get_event_mask)(), needs
to be applied to the key to ensure IN_TX (bit 32), IN_TXCP (bit 33),
and any other 'select_hi' bits Intel uses in the future are clear.
I'll fix that in the follow-up.  I think we should do that for both
legacy and masked events, but I appreciate that any change to legacy
mode behavior changes the ABI, even if subtly, so I can send that
part as an RFC.
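
Roughly (exact plumbing TBD; 'flags' stands in for however the vendor
mask ends up being requested):

        /*
         * Clear vendor-specific high bits, e.g. IN_TX/IN_TXCP on
         * Intel, from the key.  Filter events are already guaranteed
         * to have 'select_hi' clear on Intel by the rsvd check at
         * filter-set time.
         */
        u64 mask = static_call(kvm_x86_pmu_get_event_mask)(flags);
        u64 key = get_event(eventsel & mask);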

> >
> > +static inline int cmp_safe64(u64 a, u64 b)
> > +{
> >       return (a > b) - (a < b);
> >  }
> >
> > +static int cmp_eventsel_event(const void *pa, const void *pb)
> > +{
> > +     return cmp_safe64(*(u64 *)pa & AMD64_EVENTSEL_EVENT,
> > +                       *(u64 *)pb & AMD64_EVENTSEL_EVENT);
>
> Why is common x86 code referencing AMD64 masks?  If "AMD64" here just means the
> x86-64 / AMD64 architecture, i.e. is common to AMD and Intel, a comment to call
> that out would be helpful.

Explained above. I can add a comment.
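
Something along these lines:

        /*
         * AMD64_EVENTSEL_EVENT spans the architectural event select
         * (bits 7:0) plus AMD's high event select bits (35:32).
         * Intel only defines bits 7:0, and the rsvd check at
         * filter-set time guarantees 'select_hi' is clear on Intel,
         * so the AMD mask is safe to use in common code on both
         * platforms.
         */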

>
> > +}
> > +
> > +static int cmp_u64(const void *pa, const void *pb)
> > +{
> > +     return cmp_safe64(*(u64 *)pa,
> > +                       *(u64 *)pb);
> > +}
> > +
> > +
> > +static bool is_filtered(struct kvm_pmu_event_filter *filter, u64 eventsel,
> > +                     bool invert)
> > +{
> > +     u64 key = get_event(eventsel);
> > +     u64 *event, *evt;
> > +
> > +     event = bsearch(&key, filter->events, filter->nevents, sizeof(u64),
> > +                     cmp_eventsel_event);
> > +
> > +     if (event) {
> > +             /* Walk the masked events backward looking for a match. */
>
> This isn't a very helpful comment.  It's obvious enough from the code that this
> walks backwards looking for a match, while the next one walks forward.  But it's
> not immediately obvious _why_ that's the behavior.  I eventually realized the code
> is looking for _any_ match, but a comment above this function documenting that
> would be very helpful.
>
> And doesn't this approach yield somewhat arbitrary behavior?  If there are multiple
> filters for

I'm not sure I follow.  This should be deterministic: an event is
either filtered or it's not.  There are multiple cases that cause an
event to not be filtered, but only one that causes it to be filtered:
a match exists in the filter list and no corresponding inverted match
exists.  Every other case is not filtered.  Those are: 1) both a match
and an inverted match exist in the filter list; 2) no non-inverted
match exists.  The second case is a little more interesting because it
covers both "nothing in the filter list matches" and "only an inverted
match exists", but either way we don't filter the event, so it doesn't
matter which one it is.
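
To spell it out as a table, given the code below:

        match in list | inverted match in list | filtered?
        --------------+------------------------+----------
        yes           | no                     | yes
        yes           | yes                    | no
        no            | yes                    | no
        no            | no                     | no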

>
> > +             for (evt = event; evt >= filter->events &&
> > +                  get_event(*evt) == get_event(eventsel); evt--)
>
> > +static bool allowed_by_masked_events(struct kvm_pmu_event_filter *filter,
> > +                                  u64 eventsel)
> > +{
> > +     if (is_filtered(filter, eventsel, /*invert=*/false))
>
> Outer if-statement needs curly braces.  But even better would be to do:
>
>         if (is_filter(...) &&
>             !is_filter(...))
>                 return filter->action == KVM_PMU_EVENT_ALLOW;
>
> That said, I have zero idea what the intended logic is, i.e. this needs a verbose
> comment.  Maybe if I were actually familiar with PMU event magic it would be obvious,
> but I won't be the last person that reads this code with only a passing understanding
> of PMU events.
>
> Specifically, having an inverted flag _and_ an inverted bit in the event mask is
> confusing.

Maybe calling it 'exclude' would be better; that does seem like a more
appropriate name.  I do think having it will be useful at times, and
when it applies it allows for a more concise filter list overall.  I'd
really prefer to keep it.

Another option could be to remove KVM_PMU_EVENT_DENY.  I'm not sure
how useful it is.  Maybe there's a use case I'm not thinking of, but I
would think that the first step in building a deny list would be to
include all the undefined event selects + unit masks, which in legacy
mode would more than fill up the filter list.  That said, removing it
would be ABI breaking, so I thought I'd throw it out here.  Of course
there are other tricks to avoid breaking the ABI, but that would leave
the code a bit messier.  Thoughts?

>
> > +             if (!is_filtered(filter, eventsel, /*invert=*/true))
> > +                     return filter->action == KVM_PMU_EVENT_ALLOW;
> > +
> > +     return filter->action == KVM_PMU_EVENT_DENY;
> > +}
> > +