Re: [kvm-unit-tests patch v6 04/18] x86: pmu: Fix the issue that pmu_counter_t.config crosses cache line

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Sep 14, 2024, Dapeng Mi wrote:
> When running pmu test on SPR, the following #GP fault is reported.
> 
> Unhandled exception 13 #GP at ip 000000000040771f
> error_code=0000      rflags=00010046      cs=00000008
> rax=00000000004031ad rcx=0000000000000186 rdx=0000000000000000 rbx=00000000005142f0
> rbp=0000000000514260 rsi=0000000000000020 rdi=0000000000000340
>  r8=0000000000513a65  r9=00000000000003f8 r10=000000000000000d r11=00000000ffffffff
> r12=000000000043003c r13=0000000000514450 r14=000000000000000b r15=0000000000000001
> cr0=0000000080010011 cr2=0000000000000000 cr3=0000000001007000 cr4=0000000000000020
> cr8=0000000000000000
>         STACK: @40771f 40040e 400976 400aef 40148d 401da9 4001ad
> FAIL pmu
> 
> It looks EVENTSEL0 MSR (0x186) is written a invalid value (0x4031ad) and
> cause a #GP.

Nope.

> Further investigation shows the #GP is caused by below code in
> __start_event().
> 
> rmsr(MSR_GP_EVENT_SELECTx(event_to_global_idx(evt)),
> 		  evt->config | EVNTSEL_EN);

Nope.

> The evt->config is correctly initialized but seems corrupted before
> writing to MSR.

Still nope.

> 
> The original pmu_counter_t layout looks as below.
> 
> typedef struct {
> 	uint32_t ctr;
> 	uint64_t config;
> 	uint64_t count;
> 	int idx;
> } pmu_counter_t;
> 
> Obviously the config filed crosses two cache lines. 

Yeah, no.  Cache lines are 64 bytes on x86, and even with the bad layout, the size
only adds up to 32 bytes on x86-64.  Packing it slightly better drops it to 24
bytes, but that has nothing to do with cache lines.
 
> When the two cache lines are not updated simultaneously, the config value is
> corrupted.

This is simply nonsensical.  Compilers don't generate accesses that split cache
lines unless software is being deliberately stupid, and x86 doesn't corrupt data
on unaligned accesses.

The actual problem is that your next patch increases the size of the array in
check_counters_many() from 10 to 48 entries.  With 32 bytes per entry, it's just
enough to overflow the stack when making function calls (the array itself stays
on the stack page).  And because KUT's percpu and stack management is complete
and utter garbage, overflowing the stack clobbers the percpu area.

Of course, it's way too hard to even see that, because all of the code is beyond
stupid and (a) doesn't align the stacks to 4KiB, and (b) puts the percpu area at
the bottom of the stack "page".

I'll send patches to put band-aids on the per-CPU insanity, along with a refreshed
version of this series.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux