On Wed, Oct 20, 2021 at 08:00:00AM +0200, Marco Elver wrote:
> On Mon, 11 Oct 2021 at 16:42, Andrea Righi <andrea.righi@xxxxxxxxxxxxx> wrote:
> > On Mon, Oct 11, 2021 at 12:03:52PM +0200, Marco Elver wrote:
> > > On Mon, 11 Oct 2021 at 11:53, Andrea Righi <andrea.righi@xxxxxxxxxxxxx> wrote:
> > > > On Mon, Oct 11, 2021 at 11:23:32AM +0200, Andrea Righi wrote:
> > > > ...
> > > > > > You seem to use the default 20s stall timeout. FWIW syzbot uses 160
> > > > > > secs timeout for TCG emulation to avoid false positive warnings:
> > > > > > https://github.com/google/syzkaller/blob/838e7e2cd9228583ca33c49a39aea4d863d3e36d/dashboard/config/linux/upstream-arm64-kasan.config#L509
> > > > > > There are a number of other timeouts raised as well, some as high as
> > > > > > 420 seconds.
> > > > >
> > > > > I see, I'll try with these settings and see if I can still hit the soft
> > > > > lockup messages.
> > > >
> > > > Still getting soft lockup messages even with the new timeout settings:
> > > >
> > > > [ 462.663766] watchdog: BUG: soft lockup - CPU#2 stuck for 430s! [systemd-udevd:168]
> > > > [ 462.755758] watchdog: BUG: soft lockup - CPU#3 stuck for 430s! [systemd-udevd:171]
> > > > [ 924.663765] watchdog: BUG: soft lockup - CPU#2 stuck for 861s! [systemd-udevd:168]
> > > > [ 924.755767] watchdog: BUG: soft lockup - CPU#3 stuck for 861s! [systemd-udevd:171]
> > > >
> > > The lockups are expected if you're hitting the TCG bug I linked. Try
> > > to pass '-enable-kvm' to the inner qemu instance (my bad if you
> > > already have), assuming that's somehow easy to do.
> >
> > If I add '-enable-kvm' I can trigger other random panics (almost
> > immediately), like this one for example:
>
> Just FYI: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@xxxxxxxxxx
>
> But you can already flip that switch in your config
> (CONFIG_KFENCE_STATIC_KEYS=n), which we recommend as a default now.
>
> As a side-effect it'd also make your QEMU TCG tests pass.

Cool!
Thanks for the update!

And about the other panic that I was getting, it seems to be fixed by this
one:
https://lore.kernel.org/lkml/YW6N2qXpBU3oc50q@arighi-desktop/T/#u

-Andrea
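
For anyone who wants to double-check that a build really has the switch
mentioned above turned off, here is a minimal sketch. It assumes the usual
.config conventions ("CONFIG_FOO=value" for set symbols and
"# CONFIG_FOO is not set" for disabled ones). The helper name kconfig_value
and the sample config text are made up for illustration and are not taken
from this thread.

```python
import re


def kconfig_value(config_text, symbol):
    """Look up a Kconfig symbol in the text of a kernel .config file.

    Returns the value string for "SYMBOL=value" lines, "n" for an explicit
    "# SYMBOL is not set" line, and None if the symbol does not appear.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(symbol))
    unset_line = "# %s is not set" % symbol
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "n"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None


# Hypothetical .config excerpt, for illustration only.
sample = """\
CONFIG_KFENCE=y
# CONFIG_KFENCE_STATIC_KEYS is not set
CONFIG_KFENCE_SAMPLE_INTERVAL=100
"""

value = kconfig_value(sample, "CONFIG_KFENCE_STATIC_KEYS")
print("CONFIG_KFENCE_STATIC_KEYS:", value)
```

To check a real build, read the .config from the build directory and pass
its contents in place of the sample text.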