Hello, On Tue, 23 May 2023 13:53:23 -0400 Mike Snitzer wrote: > > In order to improve the IO performance of the dm-crypt > > implementation, the commit 39d42fa96ba1 ("dm crypt: > > add flags to optionally bypass kcryptd workqueues") > > adds tasklet to do the crypto operations. > > > > The tasklet callback function kcryptd_crypt_tasklet() > > calls kcryptd_crypt() which is an original workqueue > > callback function that may sleep. As a result, the > > sleep-in-atomic-context bug may happen. The process > > is shown below. > > > > (atomic context) > > kcryptd_crypt_tasklet() > > kcryptd_crypt() > > kcryptd_crypt_write_convert() > > wait_for_completion() //may sleep > > > > The wait_for_completion() is a function that may sleep. > > In order to mitigate the bug, this patch changes > > wait_for_completion() to try_wait_for_completion(). > > > > Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") > > Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx> > > --- > > drivers/md/dm-crypt.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > > index 8b47b913ee8..5e2b2463d87 100644 > > --- a/drivers/md/dm-crypt.c > > +++ b/drivers/md/dm-crypt.c > > @@ -2085,7 +2085,8 @@ static void kcryptd_crypt_write_convert(struct dm_crypt_io *io) > > crypt_finished = atomic_dec_and_test(&ctx->cc_pending); > > if (!crypt_finished && kcryptd_crypt_write_inline(cc, ctx)) { > > /* Wait for completion signaled by kcryptd_async_done() */ > > - wait_for_completion(&ctx->restart); > > + while (!try_wait_for_completion(&ctx->restart)) > > + ; > > crypt_finished = 1; > > } > > > > -- > > 2.17.1 > > > > Cc'ing Ignat for closer review. > > But wasn't this already addressed with this commit?: > 8abec36d1274 dm crypt: do not wait for backlogged crypto request completion in softirq > > Mike Thank you for your review, I think the commit 8abec36d1274 ("dm crypt: do not wait for backlogged crypto request completion in softirq") could not solve this problem. When crypt_convert() returns BLK_STS_PROTECTION or BLK_STS_IOERR, meanwhile, there is request that is queued and wait kcryptd_async_done() to process. The workqueue callback function kcryptd_crypt_read_continue() will not be called. The variable crypt_finished equals to zero, it will call wait_for_completion() that may sleep. The relevant codes are shown below: static blk_status_t crypt_convert(...) { ... /* * The request is queued and processed asynchronously, * completion function kcryptd_async_done() will be called. */ case -EINPROGRESS: ctx->r.req = NULL; ctx->cc_sector += sector_step; tag_offset++; continue; ... /* * There was a data integrity error. */ case -EBADMSG: atomic_dec(&ctx->cc_pending); return BLK_STS_PROTECTION; /* * There was an error while processing the request. */ default: atomic_dec(&ctx->cc_pending); return BLK_STS_IOERR; } ... } static void kcryptd_crypt_write_convert(...) { ... r = crypt_convert(); //return BLK_STS_PROTECTION or BLK_STS_IOERR ... if (r == BLK_STS_DEV_RESOURCE) { //this condition is not satisfied, the workqueue will not be called. INIT_WORK(&io->work, kcryptd_crypt_write_continue); queue_work(cc->crypt_queue, &io->work); return; } ... // crypt_finished is zero, because there is request that is queued and wait kcryptd_async_done() to process. crypt_finished = atomic_dec_and_test(&ctx->cc_pending); if (!crypt_finished && kcryptd_crypt_write_inline(cc, ctx)) { /* Wait for completion signaled by kcryptd_async_done() */ wait_for_completion(&ctx->restart); //may sleep! ... } ... } Best regards, Duoming Zhou -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel