On Fri, Nov 3, 2023 at 3:03 AM Mi, Dapeng <dapeng1.mi@xxxxxxxxxxxxxxx> wrote: > > > On 11/3/2023 1:45 AM, Jim Mattson wrote: > > On Wed, Nov 1, 2023 at 7:07 PM Mi, Dapeng <dapeng1.mi@xxxxxxxxxxxxxxx> wrote: > >> > >> On 11/1/2023 9:33 PM, Liang, Kan wrote: > >>> On 2023-10-31 11:31 p.m., Mi, Dapeng wrote: > >>>> On 11/1/2023 11:04 AM, Jim Mattson wrote: > >>>>> On Tue, Oct 31, 2023 at 6:59 PM Mi, Dapeng > >>>>> <dapeng1.mi@xxxxxxxxxxxxxxx> wrote: > >>>>>> On 11/1/2023 2:22 AM, Jim Mattson wrote: > >>>>>>> On Tue, Oct 31, 2023 at 1:58 AM Dapeng Mi > >>>>>>> <dapeng1.mi@xxxxxxxxxxxxxxx> wrote: > >>>>>>>> This patch adds support for the architectural topdown slots event > >>>>>>>> which > >>>>>>>> is hinted by CPUID.0AH.EBX. > >>>>>>> Can't a guest already program an event selector to count event select > >>>>>>> 0xa4, unit mask 1, unless the event is prohibited by > >>>>>>> KVM_SET_PMU_EVENT_FILTER? > >>>>>> Actually defining this new slots arch event is to do the sanity check > >>>>>> for supported arch-events which is enumerated by CPUID.0AH.EBX. > >>>>>> Currently vPMU would check if the arch event from guest is supported by > >>>>>> KVM. If not, it would be rejected just like intel_hw_event_available() > >>>>>> shows. > >>>>>> > >>>>>> If we don't add the slots event in the intel_arch_events[] array, guest > >>>>>> may program the slots event and pass the sanity check of KVM on a > >>>>>> platform which actually doesn't support slots event and program the > >>>>>> event on a real GP counter and got an invalid count. This is not > >>>>>> correct. > >>>>> On physical hardware, it is possible to program a GP counter with the > >>>>> event selector and unit mask of the slots event whether or not the > >>>>> platform supports it. Isn't KVM wrong to disallow something that a > >>>>> physical CPU allows? > >>>> Yeah, I agree. But I'm not sure if this is a flaw on PMU driver. If an > >>>> event is not supported by the hardware, we can't predict the PMU's > >>>> behavior and a meaningless count may be returned and this could mislead > >>>> the user. > >>> The user can program any events on the GP counter. The perf doesn't > >>> limit it. For the unsupported event, 0 should be returned. Please keep > >>> in mind, the event list keeps updating. If the kernel checks for each > >>> event, it could be a disaster. I don't think it's a flaw. > >> > >> Thanks Kan, it would be ok as long as 0 is always returned for > >> unsupported events. IMO, it's a nice to have feature that KVM does this > >> sanity check for supported arch events, it won't break anything. > > The hardware PMU most assuredly does not return 0 for unsupported events. > > > > For example, if I use host perf to sample event selector 0xa4 unit > > mask 1 on a Broadwell host (406f1), I get... > > > > # perf stat -e r01a4 sleep 10 > > > > Performance counter stats for 'sleep 10': > > > > 386,964 r01a4 > > > > 10.000907211 seconds time elapsed > > > > Broadwell does not advertise support for architectural event 7 in > > CPUID.0AH:EBX, so KVM will refuse to measure this event inside a > > guest. That seems broken to me. > > > Yeah, I also saw similar results on Coffee Lake which doesn't support > slots events and the return count seems to be a random and meaningless > value. If so, this meaningless value may mislead the guest perf user. > From this point view it looks the sanity check in KVM is useful, but it > indeed leads to different behavior between guest and host. Calling this a "sanity check" is somewhat specious, since the guards are based on the guest CPUID rather than the host CPUID. There is nothing to prevent the hypervisor from setting guest CPUID.0AH:EAX[31:24] to 32 and guest CPUID.0AH:EBX to 0, making 32 architectural events available to the guest. If it were really a sanity check, it would prevent the guest from using architectural events that are not supported by the host. Moreover, I'm pretty sure that a "sanity check" would only apply to top-down slots. Have there been any physical CPUs to date that support architectural events, but do not support all of the first seven architectural events? > I'm neutral on either keeping or removing this check. How's other > reviewers' opinion on this? > > > > > >>> Thanks, > >>> Kan > >>>> Add Kan to confirm this. > >>>> > >>>> Hi Kan, > >>>> > >>>> Have you any comments on this? Thanks. > >>>> > >>>> > >>>>>>> AFAICT, this change just enables event filtering based on > >>>>>>> CPUID.0AH:EBX[bit 7] (though it's not clear to me why two independent > >>>>>>> mechanisms are necessary for event filtering). > >>>>>> IMO, these are two different things. this change is just to enable the > >>>>>> supported arch events check for slot events, the event filtering is > >>>>>> another thing. > >>>>> How is clearing CPUID.0AH:EBX[bit 7] any different from putting {event > >>>>> select 0xa4, unit mask 1} in a deny list with the PMU event filter? > >>>> I think there is no difference in the conclusion but with two different > >>>> methods. > >>>> > >>>>