Re: [PATCH v3 1/5] kvm: x86/pmu: Introduce masked events to the pmu event filter

On Tue, Aug 2, 2022 at 5:19 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>
> On Fri, Jul 8, 2022 at 6:17 PM Aaron Lewis <aaronlewis@xxxxxxxxxx> wrote:
> >
> > This feature is enabled by setting the flags field to
> > KVM_PMU_EVENT_FLAG_MASKED_EVENTS.
> >
> > Events can be encoded by using KVM_PMU_EVENT_ENCODE_MASKED_EVENT().
> >
> > It is an error to have a bit set outside valid encoded bits, and calls
> > to KVM_SET_PMU_EVENT_FILTER will return -EINVAL in such cases,
> > including bits that are set in the high nybble[1] for AMD if called on
> > Intel.
> >
> > [1] bits 35:32 in the event and bits 11:8 in the eventsel.
>
> I think there is some confusion in the documentation and the code
> regarding what an 'eventsel' is. Yes, Intel started it, with
> "performance event select registers" that contain an "event select"
> field, but the 64-bit object you refer to as an 'event' here in the
> commit message is typically referred to as an 'eventsel' in the code

Yeah, it does look like 'eventsel' is more commonly used in the kernel
to refer to the "performance event select register".  I do see a few
locations where 'eventsel' refers to an eventsel's event field.  Maybe
I can rename those in a separate series to avoid future confusion.
Another reason I named it that way is that the SDM and APM both refer
to the eventsel's event field as an "event select", which naturally
lends itself to the name eventsel.  I figured we'd want to be
consistent with the docs.

If we prefer to call the 64-bit object 'eventsel', then what do we
call the eventsel's event field?  Maybe 'eventsel's event' is good
enough, but it's unfortunate that "event select" is overloaded the way
it is; there's no equivalent problem with the unit mask, since no one
needs to say "eventsel's unit mask".

I'm open to other suggestions as well.  There's got to be something
better than that.

> below. Maybe it's too late to avoid confusion, but I'd suggest
> referring to the 64-bit object as a "PMU event filter entry," or
> something like that.
>

What about "filtered event"?

> > +}
> >
> > +static bool allowed_by_masked_events(struct kvm_pmu_event_filter *filter,
> > +                                    u64 eventsel)
> > +{
> > +       if (is_filtered(filter, eventsel, /*invert=*/false))
> > +               if (!is_filtered(filter, eventsel, /*invert=*/true))
>
> Perhaps you could eliminate the ugly parameter comments if you
> maintained the "normal" and inverted entries in separate lists. It
> might also speed things up for the common case, assuming that inverted
> entries are uncommon.

Is it really that ugly?  I thought it made things clearer, since you
don't have to jump to the function to see what the bool does.

I can see an argument for walking a shorter list for inverted entries
in the common case.

To do this I'll likely make an internal copy of the struct, like
'struct kvm_x86_msr_filter', to avoid the flexible array at the end of
the pmu event filter and to not mess with the current ABI.  I'll just
have two lists at the end: one for regular entries and one for
inverted entries, roughly as sketched below.  If there's a better
approach I'm all ears.
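
Something like this is what I have in mind (the struct and field names
here are made up for illustration, not from the patch):

struct kvm_x86_pmu_event_filter {
	u32 action;
	u32 nr_includes;	/* regular (non-inverted) entries */
	u32 nr_excludes;	/* inverted entries */
	u64 *includes;		/* points into events[] */
	u64 *excludes;		/* points into events[], after the includes */
	u64 events[];		/* all entries: includes first, then excludes */
};

That keeps a single allocation, like the userspace struct, but lets
the common lookup walk only the regular entries.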

>
> > +                       return filter->action == KVM_PMU_EVENT_ALLOW;
> > +
> > +       return filter->action == KVM_PMU_EVENT_DENY;
> > +}
> > +
> > +static bool allowed_by_default_events(struct kvm_pmu_event_filter *filter,
> > +                                   u64 eventsel)
> > +{
> > +       u64 key = eventsel & AMD64_RAW_EVENT_MASK_NB;
> > +
> > +       if (bsearch(&key, filter->events, filter->nevents,
> > +                   sizeof(u64), cmp_u64))
> > +               return filter->action == KVM_PMU_EVENT_ALLOW;
> > +
> > +       return filter->action == KVM_PMU_EVENT_DENY;
> > +}
> > +
> >  void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
> >  {
> >         u64 config;
> > @@ -226,14 +318,11 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
> >
> >         filter = srcu_dereference(kvm->arch.pmu_event_filter, &kvm->srcu);
> >         if (filter) {
> > -               __u64 key = eventsel & AMD64_RAW_EVENT_MASK_NB;
> > -
> > -               if (bsearch(&key, filter->events, filter->nevents,
> > -                           sizeof(__u64), cmp_u64))
> > -                       allow_event = filter->action == KVM_PMU_EVENT_ALLOW;
> > -               else
> > -                       allow_event = filter->action == KVM_PMU_EVENT_DENY;
> > +               allow_event = (filter->flags & KVM_PMU_EVENT_FLAG_MASKED_EVENTS) ?
> > +                       allowed_by_masked_events(filter, eventsel) :
> > +                       allowed_by_default_events(filter, eventsel);
>
> If you converted all of the legacy filters into masked filters by
> simply setting the mask field to '0xff' when copying from userspace,
> you wouldn't need the complexity of two different matching algorithms.

Agreed that it will simplify the code, but it will make the legacy
case slower, because instead of being able to directly search for a
filtered event we will have to walk the list of entries that share the
eventsel's event, looking for a match.
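
To illustrate the difference (the helper names below are hypothetical,
this isn't code from the patch): the legacy path is a single bsearch
on the full event select + unit mask key, whereas a filter converted
to masked events has to find the first entry for the event select and
then walk the contiguous run of entries that share it:

static bool masked_events_match(struct kvm_pmu_event_filter *filter,
				u64 eventsel)
{
	u64 event = get_event_select(eventsel);	/* event select field */
	u8 umask = get_unit_mask(eventsel);
	int i = find_first_event(filter, event); /* bsearch on event select */

	if (i < 0)
		return false;

	/* Walk every entry that shares this event select. */
	for (; i < filter->nevents &&
	       get_event_select(filter->events[i]) == event; i++) {
		if ((umask & get_mask(filter->events[i])) ==
		    get_match(filter->events[i]))
			return true;
	}

	return false;
}

With legacy entries converted (mask = 0xff, match = unit mask) the
inner check degenerates to an exact unit mask compare, but we still
pay for the walk.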


> > @@ -603,10 +706,18 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
> >         /* Ensure nevents can't be changed between the user copies. */
> >         *filter = tmp;
> >
> > +       r = -EINVAL;
> > +       /* To maintain backwards compatibility don't validate flags == 0. */
> > +       if (filter->flags != 0 && has_invalid_event(filter))
> > +               goto cleanup;
> > +
> > +       if (filter->flags & KVM_PMU_EVENT_FLAG_MASKED_EVENTS)
> > +               cmp = cmp_eventsel_event;
> > +
> >         /*
> >          * Sort the in-kernel list so that we can search it with bsearch.
> >          */
> > -       sort(&filter->events, filter->nevents, sizeof(__u64), cmp_u64, NULL);
> > +       sort(&filter->events, filter->nevents, sizeof(u64), cmp, NULL);
>
> I don't believe two different comparison functions are necessary. In
> the legacy case, when setting up the filter, you should be able to
> simply discard any filter entries that have extraneous bits set,
> because they will never match.

For masked events we need the list ordered by eventsel's event, to
ensure entries with the same event are contiguous.  For the legacy
case we just need the list sorted in a way that allows us to quickly
find a matching eventsel's event + unit mask.  That lends itself well
to a generic u64 sort, but with that ordering the eventsel's events
will not be contiguous, which doesn't lend itself to searching for a
masked event.  To make one sort work for both cases I think we'd have
to rearrange the compare to put the bits for the eventsel's event
above the unit mask, which would slow down both the sort and the
search with all the bit twiddling going on.
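
Roughly what I mean by rearranging the compare (just a sketch of one
possible key layout, not something from the patch):

static u64 filter_sort_key(u64 eventsel)
{
	/* The event select is split across bits 7:0 and 35:32 (AMD). */
	u64 event = (eventsel & 0xffULL) |
		    (((eventsel >> 32) & 0xfULL) << 8);
	u64 umask = (eventsel >> 8) & 0xffULL;

	/* Order by event select first, unit mask second. */
	return (event << 8) | umask;
}

static int cmp_filter_keys(const void *pa, const void *pb)
{
	u64 a = filter_sort_key(*(const u64 *)pa);
	u64 b = filter_sort_key(*(const u64 *)pb);

	return (a > b) - (a < b);
}

That would keep entries with the same eventsel's event contiguous
while still letting the legacy path bsearch with the same compare, but
every comparison pays for the extra bit twiddling.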

Or, if the legacy case adopted how masked events work and ate the
extra cost of a walk rather than a direct search, we could unify the
two.

>
> >         mutex_lock(&kvm->lock);
> >         filter = rcu_replace_pointer(kvm->arch.pmu_event_filter, filter,


