On Thu, Sep 22, 2022 at 12:55:37PM +0200, Steffen Klassert wrote:
> On Wed, Sep 21, 2022 at 02:51:38PM -0400, Daniel Jordan wrote:
> > On Wed, Sep 21, 2022 at 09:36:16AM +0200, Steffen Klassert wrote:
> > > On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> > > > Yeah, padata_do_serial can be called with BHs off, like in the tipc
> > > > stack, but there are also cases where BHs can be on, like lockdep said
> > > > here:
> > >
> > > padata_do_serial was designed to run with BHs off, it is a bug if it
> > > runs with BHs on. But I don't see a case where this can happen. The
> > > only user of padata_do_serial is pcrypt in its serialization callbacks
> > > (pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
> > > pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
> > > the padata_serial_worker with the padata->serial call. BHs are
> > > off here. The crypto callback also runs with BHs off.
> > >
> > > What do I miss here?
> >
> > Ugh.. this newer, buggy part of padata_do_parallel:
> >
> >         /* Maximum works limit exceeded, run in the current task. */
> >         padata->parallel(padata);
>
> Oh well...
>
> > This skips the usual path in padata_parallel_worker, which disables BHs.
> > They should be left off in the above case too.
> >
> > What about this?
> >
> > ---8<---
> >
> > Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()
> >
> > A deadlock can happen when an overloaded system runs ->parallel() in the
> > context of the current task:
> >
> >     padata_do_parallel
> >       ->parallel()
> >         pcrypt_aead_enc/dec
> >           padata_do_serial
> >             spin_lock(&reorder->lock) // BHs still enabled
> >               <interrupt>
> >                 ...
> >                   __do_softirq
> >                     ...
> >                       padata_do_serial
> >                         spin_lock(&reorder->lock)
> >
> > It's a bug for BHs to be on in _do_serial as Steffen points out, so
> > ensure they're off in the "current task" case like they are in
> > padata_parallel_worker to avoid this situation.
> >
> > Reported-by: syzbot+bc05445bc14148d51915@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
> > Signed-off-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
>
> Yes, that makes sense.
>
> Acked-by: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>

Thanks.

> But we also should look at the call to padata_find_next where BHs are
> on. padata_find_next takes the same lock as padata_do_serial, so this
> might be a candidate for a deadlock too.

Yeah, that seems broken, it's now on my list of things to fix. Probably
worth staring at the rest of the locking for a bit too.
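
For readers following the thread, here is a rough userspace model of the
deadlock in the trace above. It is a sketch only, not kernel code and not
the patch itself: every model_* function, the counters, and
run_current_task_path() are hypothetical names made up for illustration.
It assumes the simplest possible picture: a softirq runs on interrupt exit
only when the BH-disable depth is zero, and otherwise runs when BHs are
re-enabled.

```c
/* Userspace model only: these stubs mimic, but are not, kernel primitives. */
#include <assert.h>
#include <stdbool.h>

static int bh_disable_depth;  /* models the softirq-disable count */
static bool reorder_locked;   /* models reorder->lock being held */
static bool softirq_pending;  /* a softirq wants to call padata_do_serial */
static int deadlocks;         /* times the softirq found the lock held */

/* Models padata_do_serial running from softirq context. */
static void model_do_serial_from_softirq(void)
{
	if (reorder_locked)
		deadlocks++;  /* a real CPU would spin on its own lock here */
}

/* An interrupt arrives; its softirq runs on exit only if BHs are enabled. */
static void model_interrupt(void)
{
	softirq_pending = true;
	if (bh_disable_depth == 0) {
		softirq_pending = false;
		model_do_serial_from_softirq();
	}
}

static void model_local_bh_disable(void)
{
	bh_disable_depth++;
}

/* Re-enabling BHs runs any softirq that was deferred in the meantime. */
static void model_local_bh_enable(void)
{
	if (--bh_disable_depth == 0 && softirq_pending) {
		softirq_pending = false;
		model_do_serial_from_softirq();
	}
}

/* Stands in for ->parallel() ending up in padata_do_serial. */
static void model_parallel(void)
{
	reorder_locked = true;   /* spin_lock(&reorder->lock) */
	model_interrupt();       /* interrupt lands while the lock is held */
	reorder_locked = false;  /* spin_unlock(&reorder->lock) */
}

/* Returns how many self-deadlocks one "current task" run would hit. */
int run_current_task_path(bool disable_bh)
{
	deadlocks = 0;
	softirq_pending = false;
	bh_disable_depth = 0;
	reorder_locked = false;

	if (disable_bh)
		model_local_bh_disable();
	model_parallel();
	if (disable_bh)
		model_local_bh_enable();

	return deadlocks;
}
```

In this model, run_current_task_path(false) hits the nested lock attempt
once, while run_current_task_path(true) defers the softirq until after the
lock is dropped and reports none.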