* David Hildenbrand <david@xxxxxxxxxx> [220504 11:31]:
> On 04.05.22 09:37, Janosch Frank wrote:
> > On 5/3/22 23:55, Liam Howlett wrote:
> >> * Heiko Carstens <hca@xxxxxxxxxxxxx> [220503 15:49]:
> >>> On Mon, May 02, 2022 at 08:50:04PM +0200, Heiko Carstens wrote:
> >>>> On Mon, May 02, 2022 at 01:31:00PM +0000, Liam Howlett wrote:
> >>>>> * Heiko Carstens <hca@xxxxxxxxxxxxx> [220502 06:18]:
> >>>>>> On Sun, May 01, 2022 at 05:24:12PM -0700, Andrew Morton wrote:
> >>>>>>> (cc S390 maintainers)
> >>>>>>> (cc stable & Greg)
> >>> ...
> >>>>>>>> booting. The last thing I see is:
> >>>>>>>>
> >>>>>>>> "[ 4.668916] Spectre V2 mitigation: execute trampolines"
> >>>>>>>>
> >>>>>>>> I've bisected back to commit e553f62f10d9 (mm, page_alloc: fix
> >>>>>>>> build_zonerefs_node())
> >>>>>>>>
> >>>>>>>> With the this commit, I am unable to boot one out of three times. When
> >>>>>>>> using the previous commit I was not able to get it to hang after trying
> >>>>>>>> 10+ times. This is a qemu s390 install with KASAN on and I see no error
> >>>>>>>> messages. I think it's likely it is this patch, but no guaranteed.
> >>> ...
> >>>>>> Liam, could you share your kernel config?
> >>>>>
> >>>>> Sure thing. See attached.
> >>>>
> >>>> So, I can reproduce the hanging system now. However this looks like a
> >>>> qemu problem on s390, since I can reproduce this only with Qemu+TCG.
> >>>> Qemu with kvm works without any problems (same if I use z/VM as
> >>>> hypervisor).
> >>>>
> >>>> Janosch, Claudio, can you have a look at this please?
> >>>
> >>> So, at least for me this problem also exists with plain v5.17.
> >>> Switching off KASAN, or alternatively switching to KASAN_INLINE
> >>> "fixes" it for me with Qemu+TCG.
> >>>
> >>> Liam, could you please also try to disable KASAN in your config? With
> >>> that I think we can be almost sure this could be some bug in Qemu.
> >>
> >> With KASAN, my tree fails 100% of the time (mm-stable + my maple tree
> >> patches)
> >>
> >> Without KASAN, it boots 100% of the time.
> >>
> >> I think this verifies with you say above?
> >>
> >> Thanks,
> >> Liam
> >
> > I had a short look yesterday and the boot usually hangs in the raid6
> > code. Disabling vector instructions didn't make a difference but a few
> > interruptions via GDB solve the problem for some reason.
> >
> > CCing David and Thomas for TCG
> >
>
> I somehow recall that KASAN was always disabled under TCG, I might be
> wrong (I thought we'd get a message early during boot that the HW
> doesn't support KASAN).
>
> I recall that raid code is a heavy user of vector instructions.
>
> How can I reproduce? Compile upstream (or -next?) with kasan support and
> run it under TCG?

Initially, I found that e553f62f10d9 in mm-stable had this issue. This
looks to be in v5.18-rc5, at least. So upstream + kasan should work
afaik - but I was using qemu and not TCG.

Thanks,
Liam