Re: [kvm-unit-tests PATCH] arm: pmu-overflow-interrupt: Increase count values

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 11/7/23 15:05, Alexandru Elisei wrote:
> On Tue, Nov 07, 2023 at 02:34:05PM +0100, Eric Auger wrote:
>> Hi Alexandru,
>>
>> On 11/7/23 10:52, Alexandru Elisei wrote:
>>> Hi Eric,
>>>
>>> On Fri, Nov 03, 2023 at 11:01:39AM +0100, Eric Auger wrote:
>>>> On some hardware, some pmu-overflow-interrupt failures can be observed.
>>>> Although the even counter overflows, the interrupt is not seen as
>>>> expected. This happens in the subtest after "promote to 64-b" comment.
>>>> After analysis, the PMU overflow interrupt actually hits, ie.
>>>> kvm_pmu_perf_overflow() gets called and KVM_REQ_IRQ_PENDING is set,
>>>> as expected. However the PMCR.E is reset by the handle_exit path, at
>>>> kvm_pmu_handle_pmcr() before the next guest entry and
>>>> kvm_pmu_flush_hwstate/kvm_pmu_update_state subsequent call.
>>>> There, since the enable bit has been reset, kvm_pmu_update_state() does
>>>> not inject the interrupt into the guest.
>>>>
>>>> This does not seem to be a KVM bug but rather an unfortunate
>>>> scenario where the test disables the PMCR.E too closely to the
>>>> advent of the overflow interrupt.
>>> If I understand correctly, the KVM PMU, after receiving the hardware PMUIRQ and
>>> before injecting the interrupt, checks that the PMU is enabled according to the
>>> pseudocode for the function CheckForPMUOverflow(). CheckForPMUOverflow() returns
>>> false because PMCR_EL1.E is 0, so the KVM PMU decides not to inject the
>>> interrupt.
>>>
>>> Is that correct?
>> Yes that's correct
>>> Changing the number of SW_INCR events might not be optimal - for example,
>>> COUNT_INT > 100 might hide an error that otherwise would have been triggered if
>>> the number of events were 100. Not very likely, but still a possibility.
>> I also changed the COUNT for SW_INCR events to unify the code. However
>> this is not strictly necessary to fix the issue I encounter. I can
>> revert that change if you prefer.
> I don't understand how that would solve the problem. As I see it, the problem is
> that PMCR_EL1.E is cleared too fast after the PMU asserts the interrupt on
> overflow, not the time it takes to get to the overflow condition (i.e, the
> number of iterations mem_access_loop() does).

sorry I did not make my point clear. Indeed wrt SW_INCR overflow testing
I do not intend to fix any issue by this change. I just intended to use
the same number of iterations as for mem_access. So I will revert that
change.
>
>>> Another approach would be to wait for a set amount of time for the CPU to take
>>> the interrupt. There's something similar in timer.c::{test_timer_tval(),
>>> timer_do_wfi()}.
>> you're right. However this would urge me to have a separate asm code
>> that loops with wfi after doing the mem_access loop. I am not sure this
>> is worth the candle here?
> I think plain C would work, I was thinking something like this:
>
> diff --git a/arm/pmu.c b/arm/pmu.c
> index a91a7b1fd4be..fb2eb5fa2e50 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -979,6 +979,23 @@ static void test_overflow_interrupt(bool overflow_at_64bits)
>         /* interrupts are disabled (PMINTENSET_EL1 == 0) */
>
>         mem_access_loop(addr, 200, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
Currently PMCR_E is reset by mem_access_loop() (at msr pmcr_el0,
xzr\n"). so if we want to introduce a delay between the overflow
interrupt and the PMCR.E disable, we need to either add extra MEM_ACCESS
or do wfi within mem_access_loop()
Or we do something like what you suggest below and reset the PMCR_E
afterwards with the downside to add extra code execution accounted by
the PMU. I would prefer to avoid that since the purpose of having the
asm code was to "master" what we measure.
> +
> +       if (!expect_interrupts(0)) {
> +                for (i = 0; i < 10; i++) {
> +                       local_irq_disable();
> +                       if (expect_interrupts(0)) {
> +                               local_irq_enable();
> +                               break;
> +                       }
> +                       report_info("waiting for interrupt...");
> +                       wfi();
> +                       local_irq_enable();
> +                       if (expect_interrupts(0))
> +                               break;
> +                        mdelay(100);
> +                }
> +       }
> +
>         report(expect_interrupts(0), "no overflow interrupt after preset");
>
>         set_pmcr(pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>
> Can be cleaned up by moving it to separate function, etc. Has the downside that
> it may performs extra memory accesses in expect_interrupts(). Your choice.
>
> By the way, pmu_stats is not declared volatile, which means that the
> compiler is free to optimize accesses to the struct by caching previously
> read values in registers. Have you tried declaring it as volatile, in case
> that fixes the issues you were seeing?
In my case it won't fix the issue because the stats match what happens
but your suggestion makes total sense in general.

I will add that.

Eric
>
> If you do decide to go with the above suggestion, I strongly suggest
> pmu_stats is declared as volatile, otherwise the compiler will likely end
> up not reading from memory on every iteration.
>
> Thanks,
> Alex
>> Thanks!
>>
>> Eric
>>> Thanks,
>>> Alex
>>>
>>>> Since it looks like a benign and inlikely case, let's resize the number
>>>> of iterations to prevent the PMCR enable bit from being resetted
>>>> at the same time as the actual overflow event.
>>>>
>>>> COUNT_INT is introduced, arbitrarily set to 1000 iterations and is
>>>> used in this test.
>>>>
>>>> Reported-by: Jan Richter <jarichte@xxxxxxxxxx>
>>>> Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx>
>>>> ---
>>>>  arm/pmu.c | 15 ++++++++-------
>>>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/arm/pmu.c b/arm/pmu.c
>>>> index a91a7b1f..acd88571 100644
>>>> --- a/arm/pmu.c
>>>> +++ b/arm/pmu.c
>>>> @@ -66,6 +66,7 @@
>>>>  #define PRE_OVERFLOW_64		0xFFFFFFFFFFFFFFF0ULL
>>>>  #define COUNT 250
>>>>  #define MARGIN 100
>>>> +#define COUNT_INT 1000
>>>>  /*
>>>>   * PRE_OVERFLOW2 is set so that 1st @COUNT iterations do not
>>>>   * produce 32b overflow and 2nd @COUNT iterations do. To accommodate
>>>> @@ -978,13 +979,13 @@ static void test_overflow_interrupt(bool overflow_at_64bits)
>>>>  
>>>>  	/* interrupts are disabled (PMINTENSET_EL1 == 0) */
>>>>  
>>>> -	mem_access_loop(addr, 200, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>> +	mem_access_loop(addr, COUNT_INT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>>  	report(expect_interrupts(0), "no overflow interrupt after preset");
>>>>  
>>>>  	set_pmcr(pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>>  	isb();
>>>>  
>>>> -	for (i = 0; i < 100; i++)
>>>> +	for (i = 0; i < COUNT_INT; i++)
>>>>  		write_sysreg(0x2, pmswinc_el0);
>>>>  
>>>>  	isb();
>>>> @@ -1002,15 +1003,15 @@ static void test_overflow_interrupt(bool overflow_at_64bits)
>>>>  	write_sysreg(ALL_SET_32, pmintenset_el1);
>>>>  	isb();
>>>>  
>>>> -	mem_access_loop(addr, 200, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>> +	mem_access_loop(addr, COUNT_INT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>>  
>>>>  	set_pmcr(pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>>  	isb();
>>>>  
>>>> -	for (i = 0; i < 100; i++)
>>>> +	for (i = 0; i < COUNT_INT; i++)
>>>>  		write_sysreg(0x3, pmswinc_el0);
>>>>  
>>>> -	mem_access_loop(addr, 200, pmu.pmcr_ro);
>>>> +	mem_access_loop(addr, COUNT_INT, pmu.pmcr_ro);
>>>>  	report_info("overflow=0x%lx", read_sysreg(pmovsclr_el0));
>>>>  	report(expect_interrupts(0x3),
>>>>  		"overflow interrupts expected on #0 and #1");
>>>> @@ -1029,7 +1030,7 @@ static void test_overflow_interrupt(bool overflow_at_64bits)
>>>>  	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
>>>>  	write_regn_el0(pmevcntr, 0, pre_overflow);
>>>>  	isb();
>>>> -	mem_access_loop(addr, 200, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>> +	mem_access_loop(addr, COUNT_INT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>>  	report(expect_interrupts(0x1), "expect overflow interrupt");
>>>>  
>>>>  	/* overflow on odd counter */
>>>> @@ -1037,7 +1038,7 @@ static void test_overflow_interrupt(bool overflow_at_64bits)
>>>>  	write_regn_el0(pmevcntr, 0, pre_overflow);
>>>>  	write_regn_el0(pmevcntr, 1, all_set);
>>>>  	isb();
>>>> -	mem_access_loop(addr, 400, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>> +	mem_access_loop(addr, COUNT_INT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>>>>  	if (overflow_at_64bits) {
>>>>  		report(expect_interrupts(0x1),
>>>>  		       "expect overflow interrupt on even counter");
>>>> -- 
>>>> 2.41.0
>>>>





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux