On 2023-07-06 12:54, Linux regression tracking (Thorsten Leemhuis) wrote:
On 06.07.23 10:08, Forza wrote:
On Wed, May 24, 2023 at 11:13:57AM +0200, David Sterba wrote:
[...]
A small update.
Thx for this.
I have been able to test 6.2.16, all 6.3.x and 6.4.1, and they all show
the same issue.
I have now been running 6.1.37 for two days and have not been able to
reproduce this issue on any of my virtual qemu/kvm machines. Perhaps
this information is helpful in finding the root cause?
That means it's most likely a regression between v6.1..v6.2 (or
v6.1..v6.2.16 if we are unlucky) somewhere (from earlier in the thread
it sounds like it might not be Btrfs).
Agreed, I do not think this specific bug (cpuidle / default_enter_idle
leaked IRQ state) is Btrfs related. Some of the virtual machines I test
on do not use Btrfs.
Which makes me wonder: how long do you usually need to reproduce the
issue? If it's not too long it might mean that a bisection is the best
way forward, unless some developer sits down and looks closely at the
logs. With a bit of luck some dev will do that; but if we are unlucky we
likely will need a bisection.
It has varied. Sometimes it shows up immediately upon boot, but it can
take several hours or a day.
Also, I forgot to say I was basing my kernels on gentoo-kernels, which
has some patches against vanilla. Therefore I will compile a set of
vanilla kernels from 6.1.37 through 6.4.2 and run them on my test
machines to see where the problem is happening.
This is not a fast system, so it will likely take several days. But I
will keep you posted.
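
For reference, narrowing down where the problem first appears across an
ordered set of kernel builds amounts to a binary search for the first bad
version. The following is only a minimal sketch in Python, not the actual
test setup: the function name first_bad and the is_bad callback are
hypothetical, and it assumes the first version in the list is known good,
the last is known bad, and every version after the first bad one is bad too.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_bad() reports True.

    Assumptions (hypothetical helper, not part of any kernel tooling):
    - versions is ordered from oldest to newest,
    - versions[0] is known good and versions[-1] is known bad,
    - once a version is bad, all later versions are bad as well.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # first bad version is at mid or earlier
        else:
            lo = mid  # first bad version is after mid
    return versions[hi]
```

Because the issue can take several hours or a day to show up, a "good"
verdict from is_bad is only as trustworthy as the time each kernel was
left running before it was judged.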
Meanwhile, if you think of any specific kernel debug options, tracing,
etc., that I should enable, let me know.
Should we change the Subject line of this email thread?
Thanks
~Forza
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.