On Mon, Feb 15, 2021 at 12:28:26PM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-13 08:45:54 [-0800], Paul E. McKenney wrote:
> > Glad you like it!  But let's see which (if any) of these patches solves
> > the problem for Sebastian.
>
> Looking at that, is there any reason for doing this that can not be
> solved by moving the self-test a little later? Maybe once we reached at
> least SYSTEM_SCHEDULING?

One problem is that ksoftirqd and the kprobes use are early_initcall(),
so we cannot count on ksoftirqd being spawned when kprobes first uses
synchronize_rcu_tasks().  Moving the selftest later won't fix this
problem, but rather just paper it over.

> This happens now even before lockdep is up or the console is registered.
> So if something bad happens, you end up with a blank terminal.

I was getting a splat, but I could easily believe that there are
configurations where the hang is totally silent.  In other words, I do
agree that this needs a proper fix.  All we need do is work out an
agreeable value of "proper".  ;-)

> There is nothing else that early in the boot process that requires
> working softirq. The only exception to this is wait_task_inactive()
> which is used while starting a new thread (including the ksoftirqd)
> which is why it was moved to schedule_hrtimeout().

Moving kprobes initialization to early_initcall() [1] means that there
can be a call to synchronize_rcu_tasks() before the current spawning
of ksoftirqd.  Because synchronize_rcu_tasks() needs timers to work,
it needs softirq to work.  I know two straightforward ways to make
that happen:

1.	Spawn ksoftirqd earlier.

2.	Suppress attempts to awaken ksoftirqd before it exists,
	forcing all ksoftirq execution on the back of interrupts.

Uladzislau and I each produced patches for #1, and I produced a patch
for #2.

The only other option I know of is to push the call to init_kprobes()
later in the boot sequence, perhaps to its original subsys_initcall(),
or maybe only as late as core_initcall().
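
For concreteness, the idea behind #2 can be modeled in user space
roughly as follows.  This is only an illustrative sketch, not any of
the posted patches: the struct and every model_*() name are
hypothetical, and the "defer to the thread whenever it exists" policy
is a deliberate simplification of what the kernel really does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model: none of these names are kernel APIs. */
struct softirq_model {
	bool ksoftirqd_spawned;   /* has the softirq thread been created? */
	unsigned int pending;     /* bitmask of raised softirqs */
	unsigned int wakeups;     /* wakeups actually delivered to the thread */
	unsigned int run_inline;  /* softirqs handled on interrupt exit */
};

/* Mark softirq number nr as pending. */
static void model_raise(struct softirq_model *m, unsigned int nr)
{
	assert(nr < 32);
	m->pending |= 1u << nr;
}

/*
 * Option #2: a wakeup is suppressed until the thread exists.
 * Returns true only if the thread was actually woken.
 */
static bool model_wakeup_softirqd(struct softirq_model *m)
{
	if (!m->ksoftirqd_spawned)
		return false;
	m->wakeups++;
	return true;
}

/*
 * Interrupt exit: hand pending work to the thread if it can be woken,
 * otherwise run it right here.  Returns the number of softirqs run inline.
 */
static unsigned int model_irq_exit(struct softirq_model *m)
{
	unsigned int handled = 0;

	if (!m->pending || model_wakeup_softirqd(m))
		return 0;
	while (m->pending) {
		m->pending &= m->pending - 1;  /* clear lowest pending bit */
		handled++;
	}
	m->run_inline += handled;
	return handled;
}
```

In this model, work raised before the thread is spawned is never left
waiting on a wakeup that cannot happen.  It is instead run on the next
interrupt exit, which is the property that an early
synchronize_rcu_tasks() would be relying on.
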
I added Masami and Steve on CC for their thoughts on this.

Is there some other proper fix that I am missing?

							Thanx, Paul

[1] 36dadef23fcc ("kprobes: Init kprobes in early_initcall")