On 10/7/20 10:44 AM, Greg KH wrote:
> On Wed, Oct 07, 2020 at 09:36:23AM -0600, Jens Axboe wrote:
>> Hi,
>>
>> Can you queue up this series for 5.4-stable? It fixes some issues
>> with an earlier patch that was queued up for 5.4-stable.
>
> These aren't upstream, right?
>
> Are they a special 5.4-only stuff?

Right, we had to make some 5.4-specific changes to fix various cases,
hence these are only applicable to 5.4 as they are fixes for that fix...

> And I need a signed-off-by: from you at the very least :)

Oops, for sure, here's an updated one :-)

-- 
Jens Axboe
From MAILER-DAEMON Wed Oct 7 14:39:43 2020
From: Muchun Song <songmuchun@xxxxxxxxxxxxx>
To: axboe@xxxxxxxxx, viro@xxxxxxxxxxxxxxxxxx
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx,
 linux-kernel@xxxxxxxxxxxxxxx, zhuyinyin@xxxxxxxxxxxxx
Subject: [PATCH v3 1/4] io_uring: Fix resource leaking when kill the process
Date: Wed, 07 Oct 2020 11:16:32 +0800
Message-Id: <20201007031635.65295-2-songmuchun@xxxxxxxxxxxxx>
In-Reply-To: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
References: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
List-ID: <linux-block.vger.kernel.org>
X-Mailing-List: linux-block@xxxxxxxxxxxxxxx
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

From: Yinyin Zhu <zhuyinyin@xxxxxxxxxxxxx>

Commit 1c4404efcf2c0 ("io_uring: make sure async workqueue is canceled
on exit") does not completely solve the resource leak. Suppose a kworker
is executing an io task for io_uring when the process that submitted the
task receives SIGKILL from the user. io_cancel_async_work() should then
send SIGINT to the kworker, but its check is wrong, so no SIGINT is sent
and the resources are leaked.

Why is the check wrong? The submitting process is multi-threaded; call
the thread that submitted the io task Thread1, so req->task is Thread1's
task_struct. When all the threads of the process exit, the last thread
calls io_cancel_async_work(), but that last thread may not be Thread1,
so the task comparison fails and the SIGINT is never sent.

To fix this, replace the req's task member with a struct files_struct
pointer and check the files instead.
Fixes: 1c4404efcf2c0 ("io_uring: make sure async workqueue is canceled on exit")
Signed-off-by: Yinyin Zhu <zhuyinyin@xxxxxxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
 fs/io_uring.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 454cef93a39e8..2f46def7f5832 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -339,7 +339,7 @@ struct io_kiocb {
 	u64			user_data;
 	u32			result;
 	u32			sequence;
-	struct task_struct	*task;
+	struct files_struct	*files;
 
 	struct fs_struct	*fs;
 
@@ -513,7 +513,7 @@ static inline void io_queue_async_work(struct io_ring_ctx *ctx,
 		}
 	}
 
-	req->task = current;
+	req->files = current->files;
 
 	spin_lock_irqsave(&ctx->task_lock, flags);
 	list_add(&req->task_list, &ctx->task_list);
@@ -2387,6 +2387,8 @@ static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
 	if (ret) {
 		struct io_ring_ctx *ctx = req->ctx;
 
+		req->files = current->files;
+
 		spin_lock_irq(&ctx->task_lock);
 		list_add(&req->task_list, &ctx->task_list);
 		req->work_task = NULL;
@@ -3717,7 +3719,7 @@ static int io_uring_fasync(int fd, struct file *file, int on)
 }
 
 static void io_cancel_async_work(struct io_ring_ctx *ctx,
-				 struct task_struct *task)
+				 struct files_struct *files)
 {
 	if (list_empty(&ctx->task_list))
 		return;
@@ -3729,7 +3731,7 @@ static void io_cancel_async_work(struct io_ring_ctx *ctx,
 		req = list_first_entry(&ctx->task_list, struct io_kiocb, task_list);
 		list_del_init(&req->task_list);
 		req->flags |= REQ_F_CANCEL;
-		if (req->work_task && (!task || req->task == task))
+		if (req->work_task && (!files || req->files == files))
 			send_sig(SIGINT, req->work_task, 1);
 	}
 	spin_unlock_irq(&ctx->task_lock);
@@ -3754,7 +3756,7 @@ static int io_uring_flush(struct file *file, void *data)
 	struct io_ring_ctx *ctx = file->private_data;
 
 	if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
-		io_cancel_async_work(ctx, current);
+		io_cancel_async_work(ctx, data);
 
 	return 0;
 }
-- 
2.11.0
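[Editor's note, not part of the series: the changelog above hinges on the fact that every thread of a process has its own task_struct while all of them share one files_struct. The userspace sketch below, with hypothetical names (worker, shared_fd), only illustrates that property: the two threads report different kernel thread ids, yet a file descriptor opened by one is usable by the other because the descriptor table is shared. Matching on req->files therefore means "any thread of the submitting process", no matter which thread runs the exit path.]

/*
 * Illustration only (not kernel code): threads have distinct task ids,
 * but share one file descriptor table, just as they share files_struct.
 * Build with: cc -pthread sketch.c
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

static int shared_fd;			/* fd opened by the main thread */

static void *worker(void *arg)
{
	(void)arg;
	/* A different task id than the main thread... */
	printf("worker tid=%ld\n", (long)syscall(SYS_gettid));
	/* ...yet the fd table is shared, so the fd is valid here too. */
	printf("worker write ok: %zd\n", write(shared_fd, "x", 1));
	return NULL;
}

int main(void)
{
	pthread_t t;

	shared_fd = open("/dev/null", O_WRONLY);
	printf("main   tid=%ld\n", (long)syscall(SYS_gettid));

	pthread_create(&t, NULL, worker, NULL);
	pthread_join(t, NULL);

	close(shared_fd);
	return 0;
}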
From MAILER-DAEMON Wed Oct 7 14:39:43 2020
From: Muchun Song <songmuchun@xxxxxxxxxxxxx>
To: axboe@xxxxxxxxx, viro@xxxxxxxxxxxxxxxxxx
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx,
 linux-kernel@xxxxxxxxxxxxxxx, zhuyinyin@xxxxxxxxxxxxx,
 Muchun Song <songmuchun@xxxxxxxxxxxxx>
Subject: [PATCH v3 2/4] io_uring: Fix missing smp_mb() in io_cancel_async_work()
Date: Wed, 07 Oct 2020 11:16:33 +0800
Message-Id: <20201007031635.65295-3-songmuchun@xxxxxxxxxxxxx>
In-Reply-To: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
References: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
List-ID: <linux-block.vger.kernel.org>
X-Mailing-List: linux-block@xxxxxxxxxxxxxxx
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

The store to req->flags and the load of req->work_task must not be
reordered in io_cancel_async_work(). We have to make sure that either
io_sq_wq_submit_work() sees the REQ_F_CANCEL flag in req->flags, or
io_cancel_async_work() sees req->work_task set by io_sq_wq_submit_work().

Fixes: 1c4404efcf2c ("io_uring: make sure async workqueue is canceled on exit")
Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
 fs/io_uring.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2f46def7f5832..5d9583e3d0d25 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2252,6 +2252,12 @@ static void io_sq_wq_submit_work(struct work_struct *work)
 
 		if (!ret) {
 			req->work_task = current;
+
+			/*
+			 * Pairs with the smp_store_mb() (B) in
+			 * io_cancel_async_work().
+			 */
+			smp_mb(); /* A */
 			if (req->flags & REQ_F_CANCEL) {
 				ret = -ECANCELED;
 				goto end_req;
@@ -3730,7 +3736,15 @@ static void io_cancel_async_work(struct io_ring_ctx *ctx,
 		req = list_first_entry(&ctx->task_list, struct io_kiocb, task_list);
 		list_del_init(&req->task_list);
-		req->flags |= REQ_F_CANCEL;
+
+		/*
+		 * The below executes an smp_mb(), which matches the smp_mb()
+		 * (A) in io_sq_wq_submit_work() such that either we store the
+		 * REQ_F_CANCEL flag to req->flags or we see req->work_task
+		 * set in io_sq_wq_submit_work().
+		 */
+		smp_store_mb(req->flags, req->flags | REQ_F_CANCEL); /* B */
+
 		if (req->work_task && (!files || req->files == files))
 			send_sig(SIGINT, req->work_task, 1);
 	}
-- 
2.11.0
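[Editor's note, not part of the series: the smp_mb()/smp_store_mb() pairing above is the classic store-buffering pattern — each side stores, executes a full barrier, then loads what the other side stored, so the two sides cannot both miss each other. Below is a hedged C11 userspace sketch of the same idea; the names work_task, flags and CANCEL are stand-ins, not io_uring's.]

/*
 * Illustration only: the store-buffering pattern behind the smp_mb() /
 * smp_store_mb() pairing, modeled with C11 atomics and seq_cst fences.
 */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define CANCEL 0x1

static atomic_intptr_t work_task;	/* set by the worker            */
static atomic_uint     flags;		/* REQ_F_CANCEL-like flag word  */

static unsigned worker_saw_cancel;
static intptr_t canceller_saw_task;

static void *worker(void *arg)
{
	atomic_store_explicit(&work_task, (intptr_t)1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* like smp_mb(), A */
	worker_saw_cancel =
		atomic_load_explicit(&flags, memory_order_relaxed) & CANCEL;
	return arg;
}

static void *canceller(void *arg)
{
	atomic_fetch_or_explicit(&flags, CANCEL, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* like smp_store_mb(), B */
	canceller_saw_task =
		atomic_load_explicit(&work_task, memory_order_relaxed);
	return arg;
}

int main(void)
{
	pthread_t w, c;

	pthread_create(&w, NULL, worker, NULL);
	pthread_create(&c, NULL, canceller, NULL);
	pthread_join(w, NULL);
	pthread_join(c, NULL);

	/*
	 * With both fences in place, "worker misses CANCEL and canceller
	 * misses work_task" is forbidden by the memory model.
	 */
	assert(worker_saw_cancel || canceller_saw_task);
	return 0;
}

Drop either fence and the both-miss outcome becomes legal, which is exactly the lost-SIGINT window the patch closes.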
From MAILER-DAEMON Wed Oct 7 14:39:43 2020
From: Muchun Song <songmuchun@xxxxxxxxxxxxx>
To: axboe@xxxxxxxxx, viro@xxxxxxxxxxxxxxxxxx
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx,
 linux-kernel@xxxxxxxxxxxxxxx, zhuyinyin@xxxxxxxxxxxxx,
 Muchun Song <songmuchun@xxxxxxxxxxxxx>
Subject: [PATCH v3 3/4] io_uring: Fix remove irrelevant req from the task_list
Date: Wed, 07 Oct 2020 11:16:34 +0800
Message-Id: <20201007031635.65295-4-songmuchun@xxxxxxxxxxxxx>
In-Reply-To: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
References: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
List-ID: <linux-block.vger.kernel.org>
X-Mailing-List: linux-block@xxxxxxxxxxxxxxx
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Suppose process 0 sets up an io_uring instance and then forks process 1.
When process 1 exits, io_cancel_async_work() deletes all reqs from the
task_list, including reqs that belong to process 0. If we then kill
process 0, no SIGINT is sent to the kworker, because its req is no
longer on the task_list. So the cancel path must not remove reqs from
the task_list; io_sq_wq_submit_work() can do that for us.

Fixes: 1c4404efcf2c ("io_uring: make sure async workqueue is canceled on exit")
Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
 fs/io_uring.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5d9583e3d0d25..c65f78f395655 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2277,13 +2277,11 @@ static void io_sq_wq_submit_work(struct work_struct *work)
 				break;
 			cond_resched();
 		} while (1);
-end_req:
-		if (!list_empty(&req->task_list)) {
-			spin_lock_irq(&ctx->task_lock);
-			list_del_init(&req->task_list);
-			spin_unlock_irq(&ctx->task_lock);
-		}
 	}
+end_req:
+	spin_lock_irq(&ctx->task_lock);
+	list_del_init(&req->task_list);
+	spin_unlock_irq(&ctx->task_lock);
 
 	/* drop submission reference */
 	io_put_req(req);
@@ -3727,15 +3725,16 @@ static int io_uring_fasync(int fd, struct file *file, int on)
 static void io_cancel_async_work(struct io_ring_ctx *ctx,
 				 struct files_struct *files)
 {
+	struct io_kiocb *req;
+
 	if (list_empty(&ctx->task_list))
 		return;
 
 	spin_lock_irq(&ctx->task_lock);
-	while (!list_empty(&ctx->task_list)) {
-		struct io_kiocb *req;
-
-		req = list_first_entry(&ctx->task_list, struct io_kiocb, task_list);
-		list_del_init(&req->task_list);
+	list_for_each_entry(req, &ctx->task_list, task_list) {
+		if (files && req->files != files)
+			continue;
 
 		/*
 		 * The below executes an smp_mb(), which matches the smp_mb()
@@ -3745,7 +3744,7 @@ static void io_cancel_async_work(struct io_ring_ctx *ctx,
 		 */
 		smp_store_mb(req->flags, req->flags | REQ_F_CANCEL); /* B */
 
-		if (req->work_task && (!files || req->files == files))
+		if (req->work_task)
 			send_sig(SIGINT, req->work_task, 1);
 	}
 	spin_unlock_irq(&ctx->task_lock);
-- 
2.11.0
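[Editor's note, not part of the series: a minimal userspace model of the scenario in the changelog above, using made-up names. cancel_old() mimics the previous behaviour of popping every entry off task_list, so the first exiting process unlinks requests it does not own and a later cancel finds nothing; cancel_new() mimics the patched behaviour of only flagging matching entries and leaving the unlink to the worker.]

/*
 * Illustration only: why the cancel path should not unlink other
 * owners' requests from the shared list.
 */
#include <stdbool.h>
#include <stdio.h>

struct req {
	int  owner;	/* which process submitted it (stand-in for req->files) */
	bool on_list;	/* still linked on ctx->task_list?                      */
	bool cancelled;	/* REQ_F_CANCEL-like flag                               */
};

static struct req reqs[2];

/* Old behaviour: pop every entry, flag only the matching ones. */
static void cancel_old(int owner)
{
	for (int i = 0; i < 2; i++) {
		if (!reqs[i].on_list)
			continue;
		reqs[i].on_list = false;	/* unlinked unconditionally */
		if (reqs[i].owner == owner)
			reqs[i].cancelled = true;
	}
}

/* New behaviour: only flag matching entries; the worker unlinks its own req. */
static void cancel_new(int owner)
{
	for (int i = 0; i < 2; i++)
		if (reqs[i].on_list && reqs[i].owner == owner)
			reqs[i].cancelled = true;
}

static void reset(void)
{
	reqs[0] = (struct req){ .owner = 0, .on_list = true };	/* process 0's req */
	reqs[1] = (struct req){ .owner = 1, .on_list = true };	/* process 1's req */
}

int main(void)
{
	reset();
	cancel_old(1);	/* process 1 exits first         */
	cancel_old(0);	/* process 0 is killed later...  */
	printf("old: process 0's req cancelled? %d\n", reqs[0].cancelled);	/* 0: leaked */

	reset();
	cancel_new(1);
	cancel_new(0);
	printf("new: process 0's req cancelled? %d\n", reqs[0].cancelled);	/* 1: cancelled */
	return 0;
}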
From MAILER-DAEMON Wed Oct 7 14:39:43 2020
From: Muchun Song <songmuchun@xxxxxxxxxxxxx>
To: axboe@xxxxxxxxx, viro@xxxxxxxxxxxxxxxxxx
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx,
 linux-kernel@xxxxxxxxxxxxxxx, zhuyinyin@xxxxxxxxxxxxx,
 Muchun Song <songmuchun@xxxxxxxxxxxxx>,
 Jiachen Zhang <zhangjiachen.jaycee@xxxxxxxxxxxxx>
Subject: [PATCH v3 4/4] io_uring: Fix double list add in io_queue_async_work()
Date: Wed, 07 Oct 2020 11:16:35 +0800
Message-Id: <20201007031635.65295-5-songmuchun@xxxxxxxxxxxxx>
In-Reply-To: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
References: <20201007031635.65295-1-songmuchun@xxxxxxxxxxxxx>
List-ID: <linux-block.vger.kernel.org>
X-Mailing-List: linux-block@xxxxxxxxxxxxxxx
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

If we queue work in io_poll_wake(), it leads to a double list add. So we
should only add the req to the task_list when the work callback is
io_sq_wq_submit_work().

The following oops was seen:

  list_add double add: new=ffff9ca6a8f1b0e0, prev=ffff9ca62001cee8,
  next=ffff9ca6a8f1b0e0.
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:31!
  Call Trace:
   <IRQ>
   io_poll_wake+0xf3/0x230
   __wake_up_common+0x91/0x170
   __wake_up_common_lock+0x7a/0xc0
   io_commit_cqring+0xea/0x280
   ? blkcg_iolatency_done_bio+0x2b/0x610
   io_cqring_add_event+0x3e/0x60
   io_complete_rw+0x58/0x80
   dio_complete+0x106/0x250
   blk_update_request+0xa0/0x3b0
   blk_mq_end_request+0x1a/0x110
   blk_mq_complete_request+0xd0/0xe0
   nvme_irq+0x129/0x270 [nvme]
   __handle_irq_event_percpu+0x7b/0x190
   handle_irq_event_percpu+0x30/0x80
   handle_irq_event+0x3c/0x60
   handle_edge_irq+0x91/0x1e0
   do_IRQ+0x4d/0xd0
   common_interrupt+0xf/0xf

Fixes: 1c4404efcf2c ("io_uring: make sure async workqueue is canceled on exit")
Reported-by: Jiachen Zhang <zhangjiachen.jaycee@xxxxxxxxxxxxx>
Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
 fs/io_uring.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c65f78f395655..a7cfe976480d8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -513,12 +513,14 @@ static inline void io_queue_async_work(struct io_ring_ctx *ctx,
 		}
 	}
 
-	req->files = current->files;
+	if (req->work.func == io_sq_wq_submit_work) {
+		req->files = current->files;
 
-	spin_lock_irqsave(&ctx->task_lock, flags);
-	list_add(&req->task_list, &ctx->task_list);
-	req->work_task = NULL;
-	spin_unlock_irqrestore(&ctx->task_lock, flags);
+		spin_lock_irqsave(&ctx->task_lock, flags);
+		list_add(&req->task_list, &ctx->task_list);
+		req->work_task = NULL;
+		spin_unlock_irqrestore(&ctx->task_lock, flags);
+	}
 
 	queue_work(ctx->sqo_wq[rw], &req->work);
 }
@@ -667,6 +669,7 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
 		state->cur_req++;
 	}
 
+	INIT_LIST_HEAD(&req->task_list);
 	req->file = NULL;
 	req->ctx = ctx;
 	req->flags = 0;
-- 
2.11.0
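[Editor's note, not part of the series: a self-contained sketch of the list mechanics behind the oops and the INIT_LIST_HEAD() hunk, using a miniature copy of the kernel's intrusive list rather than the real <linux/list.h>. Adding a node that is already linked at that position is what CONFIG_DEBUG_LIST reports as "list_add double add", and initializing the node up front is what makes an unconditional list_del_init() harmless for requests that were never put on task_list.]

/*
 * Illustration only: miniature intrusive list showing why a node must
 * not be added twice and why INIT_LIST_HEAD() makes list_del_init()
 * safe for nodes that were never added.
 */
#include <assert.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head, *next = head->next;

	/* CONFIG_DEBUG_LIST reports "list_add double add" for this case. */
	assert(new != prev && new != next && "list_add double add");
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);		/* entry points back at itself again */
}

int main(void)
{
	struct list_head task_list, req;

	INIT_LIST_HEAD(&task_list);
	INIT_LIST_HEAD(&req);

	list_del_init(&req);		/* never added: harmless thanks to init */

	list_add(&req, &task_list);	/* first add: fine */
	/* list_add(&req, &task_list);	   a second add would trip the check */
	list_del_init(&req);
	printf("ok\n");
	return 0;
}

This is why the patch both guards the list_add() with the work.func check and initializes req->task_list in io_get_req().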