On Fri, 11 Nov 2022 at 09:45, Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Fri, Nov 11, 2022, at 07:28, Naresh Kamboju wrote:
> > On Thu, 10 Nov 2022 at 03:33, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> >>
> >> One more idea I had is the unwinder: since this kernel is built
> >> with the frame-pointer unwinder, I think the stack usage per
> >> function is going to be slightly larger than with the arm unwinder.
> >>
> >> Naresh, how hard is it to reproduce this bug intentionally?
> >> Can you try if it still happens if you change the .config to
> >> use these:?
> >>
> >> # CONFIG_FUNCTION_GRAPH_TRACER is not set
> >> # CONFIG_UNWINDER_FRAME_POINTER is not set
> >> CONFIG_UNWINDER_ARM=y
> >
> > I have done this experiment and reported crash not reproduced
> > after eight rounds of testing [1].
> >
> > https://lkft.validation.linaro.org/scheduler/job/5835922#L1993
>
> Ok, good to hear. In this case, I see three possible ways forward
> to prevent this from coming back on your system:
>
> a) use asynchronous probing for one or more of the drivers as
>    Dmitry suggested. This means fixing it upstream first and then
>    backporting the fix to all stable kernels. We should probably
>    do this anyway, but this will need more testing on your side.
>
> b) Change your kernel config permanently with the options above,
>    if LKFT does not actually rely on CONFIG_FUNCTION_GRAPH_TRACER.
>    I don't know if it does.
>
> c) backport commit 41918ec82eb6 ("ARM: ftrace: enable the graph
>    tracer with the EABI unwinder") from 5.17. This was part of
>    a longer series from Ard, and while the patch itself looks
>    simple enough to be backported, I suspect we'd have to
>    backport the entire series, which is probably not going to
>    be realistic. Ard, any comments on this?
>

It at least needs the preceding patch, which tracks the location of LR
on the stack when using CONFIG_UNWINDER_ARM. But I'd take the whole
series for good measure.