Hi Alexandru,

On 10/5/22 11:21, Alexandru Elisei wrote:
> Hi Eric,
>
> On Tue, Oct 04, 2022 at 07:31:25PM +0200, Eric Auger wrote:
>> Hi Alexandru,
>>
>> On 10/4/22 18:58, Alexandru Elisei wrote:
>>> Hi Eric,
>>>
>>> On Tue, Oct 04, 2022 at 06:20:23PM +0200, Eric Auger wrote:
>>>> Hi Ricardo, Marc,
>>>>
>>>> On 8/5/22 02:41, Ricardo Koller wrote:
>>>>> There are some tests that fail when running on bare metal (including a
>>>>> passthrough prototype). There are three issues with the tests. The
>>>>> first one is that there are some missing isb()'s between enabling event
>>>>> counting and the actual counting. This wasn't an issue on KVM, as
>>>>> trapping on registers served as context synchronization events. The
>>>>> second issue is that some tests assume that registers reset to 0. And
>>>>> finally, the third issue is that overflowing the low counter of a
>>>>> chained event sets the overflow flag in PMOVS, and some tests fail by
>>>>> checking for it not being set.
>>>>>
>>>>> Addressed all comments from the previous version:
>>>>> https://lore.kernel.org/kvmarm/20220803182328.2438598-1-ricarkol@xxxxxxxxxx/T/#t
>>>>> - added the missing isb()'s and fixed the commit message (Alexandru).
>>>>> - fixed the wording of a report() check (Andrew).
>>>>>
>>>>> Thanks!
>>>>> Ricardo
>>>>>
>>>>> Ricardo Koller (3):
>>>>>   arm: pmu: Add missing isb()'s after sys register writing
>>>>>   arm: pmu: Reset the pmu registers before starting some tests
>>>>>   arm: pmu: Check for overflow in the low counter in chained counters
>>>>>     tests
>>>>>
>>>>>  arm/pmu.c | 56 ++++++++++++++++++++++++++++++++++++++-----------------
>>>>>  1 file changed, 39 insertions(+), 17 deletions(-)
>>>>>
>>>> While testing this series and the related '[PATCH 0/9] KVM: arm64: PMU:
>>>> Fixing chained events, and PMUv3p5 support' I noticed some kvm-unit-tests
>>>> failures on some machines. This does not seem to be related to those
>>>> series, though, since I was able to reproduce the failures without them.
>>>> The failures happen on an Amberwing machine, for instance, with
>>>> pmu-chain-promotion.
>>>>
>>>> While investigating further I noticed there is a lot of variability in
>>>> the kvm-unit-tests mem_access_loop() count. I can get counter = 0x1F on
>>>> the first iteration and 0x96 on the subsequent ones, for instance. While
>>>> running mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E) I was
>>>> expecting the counter to be close to 20. It is on some HW.
>>>>
>>>> [..]
>>>>
>>>> So I come to the actual question. Can we make any assumptions about the
>>>> (virtual) PMU quality/precision? If not, the tests I originally wrote
>>>> are doomed to fail on some HW (on some other HW they always pass) and I
>>>> need to make a decision wrt rewriting part of them, especially those
>>>> which expect an overflow after a given number of ops. Otherwise, there
>>>> is either something wrong in the test (asm?) or in the KVM PMU
>>>> emulation.
>>>>
>>>> I tried to bisect because I did observe the same behavior on some older
>>>> kernels, but the bisect was not successful as the issue does not always
>>>> happen.
>>>>
>>>> Thoughts?
>>> Looking at mem_access_loop(), the first thing that jumps out is the fact
>>> that it is missing a DSB barrier. ISB affects only instructions, not
>>> memory accesses, and without a DSB the PE can reorder memory accesses
>>> however it sees fit.
>> Following your suggestion I added a dsb ish at the end of the loop and
>> before disabling pmcr_el0 (I hope this is the place you were thinking
>> of) but unfortunately it does not seem to fix my issue.
> Yes, DSB ISH after "b.gt 1b\n" and before the write to PMCR_EL0 that
> disables the PMU.
>
> I think you also need a DSB ISH before the write to PMCR_EL0 that enables
> the PMU in the first instruction of the asm block. In your example, the
> MEM_ACCESS event count is higher than expected, and one explanation for
> the large disparity that I can think of is that previous memory accesses
> are reordered past the instruction that enables the PMU, which makes the
> PMU add these events to the total event count.

Makes sense. I added those at the two locations but unfortunately it does
not change the result for me.

>
>>> I also believe precise_instrs_loop() to be in the same situation, as the
>>> architecture doesn't guarantee that the cycle counter increments after
>>> every CPU cycle (ARM DDI 0487I.a, page D11-5246):
>>>
>>> "Although the architecture requires that direct reads of PMCCNTR_EL0 or
>>> PMCCNTR occur in program order, there is no requirement that the count
>>> increments between two such reads. Even when the counter is incrementing
>>> on every clock cycle, software might need to check that the difference
>>> between two reads of the counter is nonzero."
>> OK
>>> There's also an entire section in ARM DDI 0487I.a dedicated to this,
>>> titled "A reasonable degree of inaccuracy" (page D11-5248). I'll post
>>> some snippets that I found interesting, but there are more examples and
>>> explanations to be found in that chapter.
>> Yeah, I saw that, hence my question about the reasonable disparity we can
>> expect from the HW/SW stack.
>>> "In exceptional circumstances, such as a change in Security state or
>>> other boundary condition, it is acceptable for the count to be
>>> inaccurate."
>>>
>>> PMCR writes are trapped by KVM. Is a change in exception level an
>>> "exceptional circumstance"? Could be, but I couldn't find anything
>>> definitive. For example, the architecture allows an implementation to
>>> drop an event in the case of an interrupt:
>>>
>>> "However, dropping a single branch count as the result of a rare
>>> interaction with an interrupt is acceptable."
>>>
>>> So events could definitely be dropped because of an interrupt for the
>>> host.
>>>
>>> And there's also this:
>>>
>>> "The imprecision means that the counter might have counted an event
>>> around the time the counter was disabled, but does not allow the event
>>> to be observed as counted after the counter was disabled."
>> In our case there seems to be a huge discrepancy.
> I agree. There is this about the MEM_ACCESS event in the Arm ARM:
>
> "The counter counts each Memory-read operation or Memory-write operation
> that the PE makes."
>
> As for what a Memory-read operation is (emphasis added by me):
>
> "A memory-read operation might be due to:
> The result of an architecturally executed memory-reading instruction.
> The result of a Speculatively executed memory-reading instruction <- this
> is why the DSB ISH is needed before enabling the PMU.
> **A translation table walk**."
>
> Those extra memory accesses might be caused by the table walker deciding
> to walk the tables, speculatively or not. Software has no control over
> the table walker (as long as it is enabled).

That's indeed an interesting track. But can it really be that for 20
expected load instructions we end up with ~150 actual memory accesses? I
can't help thinking this is quite a surprising amount. Also the pattern is
surprising: the first iteration gives a low count (~30) while subsequent
ones give higher and constant counts (~150). I would have expected the
opposite, no?
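In case I got the placement wrong, this is roughly what I tested (a sketch
from memory, not the exact diff; the asm follows arm/pmu.c's
mem_access_loop(), register choices and labels may differ in detail):

static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
{
	uint64_t pmcr64 = pmcr;

	asm volatile(
	/* complete all prior memory accesses before enabling the PMU */
	"       dsb     ish\n"
	"       msr     pmcr_el0, %[pmcr]\n"
	"       isb\n"
	"       mov     x10, %[loop]\n"
	"1:     sub     x10, x10, #1\n"
	"       ldr     x9, [%[addr]]\n"
	"       cmp     x10, #0x0\n"
	"       b.gt    1b\n"
	/* complete the loop's accesses before disabling the PMU */
	"       dsb     ish\n"
	"       msr     pmcr_el0, xzr\n"
	"       isb\n"
	:
	: [addr] "r" (addr), [pmcr] "r" (pmcr64)
	: "x9", "x10", "cc");
}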
I will try to run the same experiment on the various HW I have access to.

Anyway, there is a problem in interpreting the results of the tests.
Either this can happen on some HW (it is valid behavior according to the
ARM spec) and the test is simply not runnable there, or it is a bug
somewhere in the SW stack. It would be interesting to run the same tests
at bare-metal level on Amberwing and see what the results are.
Ricardo/Drew, could you give some links about the setup?

Thanks

Eric

>
> Thanks,
> Alex
>
>>> If you want my opinion, if it is necessary to count the exact number of
>>> events for a test, I would define a margin of error on the number of
>>> events counted. Or the test could be changed to check that at least one
>>> such event was observed.
>> I agree with you that a reasonable margin must be observed and that the
>> tests may need to be rewritten to account for the observed disparity, if
>> it is considered "normal". Another way to proceed is to compute the
>> disparity before launching the main tests and, if it is too big, skip
>> the main tests. Again, on some HW the counts are really 'as expected'
>> and constant.
>>
>> Thanks!
>>
>> Eric
>>> Thanks,
>>> Alex
>>>
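P.S. The "compute the disparity first" idea above would look something
like this (a rough, untested sketch; the margin is arbitrary and the
helpers are used as in arm/pmu.c's test_mem_access(), from memory):

static bool pmu_counts_look_sane(void *addr)
{
	uint64_t count, min = ~0ULL, max = 0;
	int i;

	/* Count MEM_ACCESS on event counter #0 over 20 loads, 5 times. */
	for (i = 0; i < 5; i++) {
		pmu_reset();
		write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
		write_sysreg_s(0x1, PMCNTENSET_EL0);
		isb();
		mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
		count = read_regn_el0(pmevcntr, 0);
		if (count < min)
			min = count;
		if (count > max)
			max = count;
	}

	/* Arbitrary margin: reject if the runs vary by more than 2x. */
	return min && max <= 2 * min;
}

and then the tests that expect a precise count would start with:

	if (!pmu_counts_look_sane(addr)) {
		report_skip("event counts too imprecise on this HW");
		return;
	}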