On Fri, Jul 12, 2019 at 12:10:12PM +0200, Steffen Klassert wrote:
> On Fri, Jul 12, 2019 at 06:06:36PM +0800, Herbert Xu wrote:
> > Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> wrote:
> > >
> > >   CPU0                                 CPU1
> > >
> > >   padata_reorder                       padata_do_serial
> > >     LOAD reorder_objects  // 0
> > >                                          INC reorder_objects  // 1
> > >                                          padata_reorder
> > >                                            TRYLOCK pd->lock   // failed
> > >     UNLOCK pd->lock
> >
> > I think this can't happen because CPU1 won't call padata_reorder
> > at all as it's the wrong CPU so it gets pushed back onto a work
> > queue which will go back to CPU0.

Thanks for looking at this.

When you say the wrong CPU, I think you're referring to the
reorder_via_wq logic in padata_do_serial.  If so, I think my explanation
was unclear, because there were two padata jobs in flight and my
tracepoints showed neither of them punted padata_reorder to a workqueue.

Let me expand on this, hopefully it helps.  pcrypt used CPU 5 for its
callback CPU.  The first job in question, with sequence number 16581,
hashed to CPU 21 on my system.

This is a more complete version of what led to the hanging modprobe
command.

  modprobe (CPU2)             kworker/21:1-293 (CPU21)                      kworker/5:2-276 (CPU5)
  --------------------------  ------------------------                      ----------------------
  <submit job, seq_nr=16581>
  ...
  padata_do_parallel
    queue_work_on(21, ...)
  <sleeps>
                              padata_parallel_worker
                                pcrypt_aead_dec
                                  padata_do_serial
                                    padata_reorder
        |                             padata_get_next  // returns job, seq_nr=16581
        |                             // serialize seq_nr=16581
        |                             queue_work_on(5, ...)
        |                             padata_get_next  // returns -EINPROGRESS
        |                             // padata_reorder doesn't return yet!
        |                                     |
        |                                     |                             padata_serial_worker
        |                                     |                               pcrypt_aead_serial
        |                                     |                                 <wake up modprobe>
        |                                     |                             <worker finishes>
  <submit job, seq_nr=16582>                  |
  ...                                         |
  padata_do_parallel                          |
    queue_work_on(22, ...)                    |
                                              |     (LOAD reorder_objects as 0 somewhere
  <sleeps>                                    |      in here, before the INC)
        |                                     |
        |   kworker/22:1-291 (CPU22)          |
        |   ------------------------          |
        |   padata_parallel_worker            |
        |     pcrypt_aead_dec                 |
        |       padata_do_serial              |
        |         // INC reorder_objects to 1 |
        |         padata_reorder              |
        |           // trylock fail, CPU21 _should_
        |           // serialize 16582 but doesn't
        |   <worker finishes>                 |
        |                                     // deletes pd->timer
        |                                     // padata_reorder returns
        |                                   <worker finishes>
  <keeps on sleeping lazily>

My tracepoints showed CPU22 increased reorder_objects to 1 but CPU21
read it as 0.  I think adding the full barrier guarantees the following
ordering, and the memory model people can correct me if I'm wrong:

  CPU21                            CPU22
  ------------------------         --------------------------
  UNLOCK pd->lock
  smp_mb()
  LOAD reorder_objects             INC reorder_objects
                                   spin_unlock(&pqueue->reorder.lock) // release barrier
                                   TRYLOCK pd->lock

So if CPU22 has incremented reorder_objects but CPU21 reads 0 for it,
CPU21 should also have unlocked pd->lock so CPU22 can get it and
serialize any remaining jobs.
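
In case it helps to check that claim mechanically, below is a minimal
litmus-test sketch of the same two-CPU scenario for the LKMM tools
(herd7's C-litmus format).  It's only an illustration, not anything
from the patch: the names (robj, pd_lock, q_lock) are made-up stand-ins
for pd->reorder_objects, pd->lock and pqueue->reorder.lock, the single
increment is modelled as a plain store of 1, and P0's critical section
is left empty.  P0 plays padata_reorder on CPU21, P1 plays
padata_do_serial on CPU22, and the exists clause is the orphaned-job
outcome above: CPU21 reads reorder_objects as 0 while CPU22's trylock
of pd->lock also fails.

C padata-orphaned-job

(*
 * Sketch only, names illustrative: P0 models padata_reorder on CPU21,
 * P1 models padata_do_serial on CPU22.  pd_lock stands in for
 * pd->lock, q_lock for pqueue->reorder.lock, robj for
 * pd->reorder_objects.  The exists clause is the orphaned-job outcome:
 * CPU21 reads reorder_objects as 0 while CPU22's trylock of pd->lock
 * also fails.
 *)

{}

P0(int *robj, spinlock_t *pd_lock)
{
        int r0;

        spin_lock(pd_lock);
        spin_unlock(pd_lock);
        smp_mb();
        r0 = READ_ONCE(*robj);
}

P1(int *robj, spinlock_t *pd_lock, spinlock_t *q_lock)
{
        int r1;

        spin_lock(q_lock);
        WRITE_ONCE(*robj, 1);
        spin_unlock(q_lock);
        r1 = spin_trylock(pd_lock);
}

exists (0:r0=0 /\ 1:r1=0)

Running it as described in tools/memory-model/README (herd7 -conf
linux-kernel.cfg <file>.litmus) should say whether that outcome is
forbidden with only the smp_mb() on the reorder side and the release
from spin_unlock on the do_serial side, which is the part I'd want the
memory model folks to confirm.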