On Mon, 06 Mar 2023 22:08:04 +0000,
Colton Lewis <coltonlewis@xxxxxxxxxx> wrote:
>
> Hi Marc,
>
> First of all, thanks for your previous responses to my comments. Many of
> them clarified things I did not fully understand on my own.
>
> As I stated in another email, I've been testing this series on ECV
> capable hardware. Things look good but I have been able to reproduce a
> consistent assertion failure in this selftest when setting a
> sufficiently large physical offset. I have so far not been able to
> determine the cause of the failure and wonder if you have any insight as
> to what might be causing this and how to debug.
>
> The following example reproduces the error every time I have tried:
>
> mvbbq9:/data/coltonlewis/ecv/arm64-obj/kselftest/kvm#
> ./aarch64/arch_timer -O 0xffff
> ==== Test Assertion Failure ====
>    aarch64/arch_timer.c:239: false
>    pid=48094 tid=48095 errno=4 - Interrupted system call
>       1  0x4010fb: test_vcpu_run at arch_timer.c:239
>       2  0x42a5bf: start_thread at pthread_create.o:0
>       3  0x46845b: thread_start at clone.o:0
>    Failed guest assert: xcnt >= cval at aarch64/arch_timer.c:151
> values: 2500645901305, 2500645961845; 9939, vcpu 0; stage; 3; iter: 2

The fun part is that you can see similar things without the series:

==== Test Assertion Failure ====
  aarch64/arch_timer.c:239: false
  pid=647 tid=651 errno=4 - Interrupted system call
     1  0x00000000004026db: test_vcpu_run at arch_timer.c:239
     2  0x00007fffb13cedd7: ?? ??:0
     3  0x00007fffb1437e9b: ?? ??:0
  Failed guest assert: config_iter + 1 == irq_iter at aarch64/arch_timer.c:188
  values: 2, 3; 0, vcpu 3; stage; 4; iter: 3

That's on a vanilla kernel (6.2-rc4) on an M1 with the test run
without any argument in a loop. After a few iterations, it blows.

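For reference, running the test "in a loop" until it trips can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the exact script used: run_until_failure and the
iteration cap are hypothetical names, and the binary path in the usage
comment is the one from the report above.

```shell
# Hypothetical helper: rerun a command until it fails or a cap is reached.
# Usage (path taken from the report above, cap chosen arbitrarily):
#   run_until_failure 100 ./aarch64/arch_timer
run_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Stop at the first non-zero exit status and say which run it was.
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max iterations"
    return 0
}
```

The iteration number of the first failure gives a rough feel for how
often the assertion fires on a given machine.
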
>
> Observations:
>
> - Failure always occurs at stage 3 or 4 (physical timer stages)
> - xcnt_diff_us is always slightly less than 10000, or 10 ms
> - Reducing offset size reduces the probability of failure linearly (for
>   example, -O 0x8000 will fail close to half the time)
> - Failure occurs with a wide range of different period values and
>   whether or not migrations happen

The problem is that I don't understand enough of the test to make a
judgement call. I hardly get *what* it is testing. Do you?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.