On Tue, Sep 20, 2022, Aaron Lewis wrote:
> +To access individual components in a masked entry use:
> +::
> +  struct kvm_pmu_event_masked_entry {
> +	union {
> +		__u64 raw;
> +		struct {
> +			/* event_select bits 11:8 are only valid on AMD. */
> +			__u64 event_select:12;
> +			__u64 mask:8;
> +			__u64 match:8;
> +			__u64 exclude:1;

As suggested in patch 3, keep the architectural bits where they are and fill
in the gaps.  IIUC, all of bits 63:36 are available, i.e. there's lots of room
for expansion.  Go top down (start at 63) and cross our fingers that neither
Intel nor AMD puts stuff there.  If that does happen, then we can start
mucking with the layout, but until then, let's not make it too hard for
ourselves.

> +			__u64 rsvd:35;

> #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
> #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 7ce8bfafea91..b188ddb23f75 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -252,34 +252,61 @@ static inline u8 get_unit_mask(u64 eventsel)
>  	return (eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
>  }

...

> +/*
> + * Sort will order the list by exclude, then event select.  This function will
> + * then index the sublists of event selects such that when a search is done on
> + * the list, the head of the event select sublist is returned.  This simplifies
> + * the logic in filter_contains_match() when walking the list.

Unless I'm missing something, this is a complex way to solve a relatively
simple problem.  You want a list of "includes" and a list of "excludes".
Just have two separate lists.

Actually, if we're effectively changing the ABI, why not make everyone's lives
simpler and expose that to userspace?  E.g. use one of the "pad" words to
specify the number of "include" events and effectively do this:

	struct kvm_pmu_event_filter {
		__u32 action;
		__u32 nevents;
		__u32 fixed_counter_bitmap;
		__u32 flags;
		__u32 nr_include_events;
		__u64 include[nr_include_events];
		__u64 exclude[nevents - nr_include_events];
	};

Then we don't need to steal a bit for "exclude" in the uABI.  The kernel code
gets a wee bit simpler.

> @@ -693,6 +796,10 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
>  	/* Ensure nevents can't be changed between the user copies. */
>  	*filter = tmp;
> 
> +	r = -EINVAL;
> +	if (!is_filter_valid(filter))

Do this in prepare_filter_events() to avoid walking the filter multiple times.
I've gotten fairly twisted around and my local copy of the code is a mess, but
something like this:

	static int prepare_filter_events(struct kvm_pmu_event_filter *filter)
	{
		int i, j;

		for (i = 0, j = 0; i < filter->nevents; i++) {
			if (filter->events[i] & ~kvm_pmu_ops.EVENTSEL_MASK) {
				if (!filter->mask)
					continue;

				if (<reserved bits set>)
					return -EINVAL;
			}

			filter->events[j++] = filter->events[i];
		}

		<more magic here>
	}

> +		goto cleanup;
> +
>  	prepare_filter_events(filter);
> 
>  	mutex_lock(&kvm->lock);
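
As a concrete illustration of the "fill from bit 63 down" suggestion, one
possible layout (field names and exact positions below are illustrative
assumptions, not a settled proposal) keeps event_select and the unit mask at
their architectural positions and stacks the new fields from the top:

	#include <linux/types.h>

	/*
	 * Illustrative only: event_select stays split across its architectural
	 * positions (bits 7:0 and, on AMD, 35:32), the unit mask match stays at
	 * bits 15:8, and the new fields are packed downward from bit 63.
	 */
	union masked_entry_example {
		__u64 raw;
		struct {
			__u64 select_lo:8;	/* bits 7:0,   architectural event select */
			__u64 match:8;		/* bits 15:8,  architectural unit mask */
			__u64 rsvd1:16;		/* bits 31:16, untouched */
			__u64 select_hi:4;	/* bits 35:32, AMD-only event select */
			__u64 rsvd2:19;		/* bits 54:36, room for expansion */
			__u64 exclude:1;	/* bit  55 */
			__u64 mask:8;		/* bits 63:56 */
		};
	};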
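
Similarly, a rough userspace-side sketch of how the two-list layout floated
above might be populated.  Everything here is an assumption for illustration:
the single trailing events[] array (includes first, then excludes) stands in
for the pseudo-C include[]/exclude[] arrays, and none of this is existing
uAPI:

	#include <stdint.h>
	#include <stdlib.h>
	#include <string.h>

	/* Hypothetical layout mirroring the proposal above; not existing uAPI. */
	struct pmu_event_filter_proposal {
		uint32_t action;
		uint32_t nevents;		/* total entries: includes + excludes */
		uint32_t fixed_counter_bitmap;
		uint32_t flags;
		uint32_t nr_include_events;	/* first N entries of events[] are includes */
		uint32_t pad;
		uint64_t events[];		/* includes first, excludes after */
	};

	static struct pmu_event_filter_proposal *
	build_filter(const uint64_t *include, uint32_t n_inc,
		     const uint64_t *exclude, uint32_t n_exc)
	{
		struct pmu_event_filter_proposal *f;
		uint32_t total = n_inc + n_exc;

		f = calloc(1, sizeof(*f) + (size_t)total * sizeof(f->events[0]));
		if (!f)
			return NULL;

		f->nevents = total;
		f->nr_include_events = n_inc;
		memcpy(f->events, include, n_inc * sizeof(*include));
		memcpy(f->events + n_inc, exclude, n_exc * sizeof(*exclude));
		return f;
	}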
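
And on the consumer side, the "wee bit simpler" claim roughly comes down to
the two sublists falling straight out of nr_include_events.  The sketch below
reuses the struct from the previous snippet; the helper names and the exact
match semantics (include hit plus exclude miss) are assumptions for
illustration only:

	#include <stdbool.h>
	#include <stdint.h>

	/* Assumed helper: linear scan; real code could just as well bsearch(). */
	static bool match_any(const uint64_t *events, uint32_t nr, uint64_t eventsel)
	{
		uint32_t i;

		for (i = 0; i < nr; i++) {
			if (events[i] == eventsel)
				return true;
		}
		return false;
	}

	/*
	 * Sketch only: with the two-list layout, an event is "in the filter" if
	 * it hits the include list and misses the exclude list.  How that
	 * interacts with the ALLOW vs DENY action is left out here.
	 */
	static bool filter_matches_event(const struct pmu_event_filter_proposal *f,
					 uint64_t eventsel)
	{
		const uint64_t *exclude = f->events + f->nr_include_events;
		uint32_t n_exc = f->nevents - f->nr_include_events;

		return match_any(f->events, f->nr_include_events, eventsel) &&
		       !match_any(exclude, n_exc, eventsel);
	}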