Re: io_uring process termination/killing is not working

On 8/12/20 12:28 PM, Pavel Begunkov wrote:
> On 12/08/2020 21:22, Pavel Begunkov wrote:
>> On 12/08/2020 21:20, Pavel Begunkov wrote:
>>> On 12/08/2020 21:05, Jens Axboe wrote:
>>>> On 8/12/20 11:58 AM, Josef wrote:
>>>>> Hi,
>>>>>
>>>>> I have a weird issue on kernel 5.8.0/5.8.1: SIGINT and even SIGKILL
>>>>> don't kill this process (it's always in state D or D+). I literally
>>>>> have to terminate my VM because even the kernel can't kill the process.
>>>>> There is no issue on 5.7.12-201, and if IOSQE_IO_LINK is not set, it
>>>>> works.
>>>>>
>>>>> I've attached a file to reproduce it
>>>>> or here
>>>>> https://gist.github.com/1Jo1/15cb3c63439d0c08e3589cfa98418b2c
>>>>
>>>> Thanks, I'll take a look at this. It's stuck in uninterruptible
>>>> state, which is why you can't kill it.
>>>
>>> It looks like one of the hangs I was talking about a few days ago:
>>> an accept is inflight but can't be found by cancel_files() because it's
>>> in a link.
>>
>> BTW, I described it a month ago, there were more details.
> 
> https://lore.kernel.org/io-uring/34eb5e5a-8d37-0cae-be6c-c6ac4d85b5d4@xxxxxxxxx

Yeah, I think you're right. How about something like the below? That'll
potentially cancel more than just the one we're looking for, but it seems
kind of silly to only cancel from the request holding the file table to
the end of the link chain.
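
To make the idea concrete, here is a minimal userspace sketch of the lookup
the patch below is aiming for: given a request that holds the files, find
the pending link head whose chain contains it and cancel that head. It is
only a model, not kernel code. `struct request`, `chain_contains()` and
`cancel_link_parent()` are hypothetical names invented for illustration, a
plain array stands in for the cancel hash, and the sketch assumes the whole
chain should be searched.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct io_kiocb: only what the sketch needs. */
struct request {
	bool is_link_head;      /* models REQ_F_LINK_HEAD */
	bool cancelled;         /* set once the request has been cancelled */
	struct request *next;   /* next request in the link chain, NULL at end */
};

/* Return true if 'req' appears anywhere in the chain hanging off 'head'. */
static bool chain_contains(const struct request *head,
			   const struct request *req)
{
	const struct request *link;

	for (link = head->next; link; link = link->next) {
		if (link == req)
			return true;
	}
	return false;
}

/*
 * Scan the pending requests (an array here, standing in for the cancel
 * hash). If one of them is a link head whose chain holds 'req', cancel
 * that head and return it. Return NULL if 'req' is not linked to any
 * pending head.
 */
static struct request *cancel_link_parent(struct request **pending, size_t n,
					  struct request *req)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct request *head = pending[i];

		if (!head || !head->is_link_head)
			continue;
		if (chain_contains(head, req)) {
			head->cancelled = true;
			return head;
		}
	}
	return NULL;
}
```

In this model, calling cancel_link_parent() with a request that sits second
or third in a chain still finds and cancels the head, while a request that
is not part of any chain leaves every pending entry untouched.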


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8a2afd8c33c9..0630a9622baa 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4937,6 +5003,7 @@ static bool io_poll_remove_one(struct io_kiocb *req)
 		io_cqring_fill_event(req, -ECANCELED);
 		io_commit_cqring(req->ctx);
 		req->flags |= REQ_F_COMP_LOCKED;
+		req_set_fail_links(req);
 		io_put_req(req);
 	}
 
@@ -7935,6 +8002,47 @@ static bool io_wq_files_match(struct io_wq_work *work, void *data)
 	return work->files == files;
 }
 
+static bool __io_poll_remove_link(struct io_kiocb *preq, struct io_kiocb *req)
+{
+	struct io_kiocb *link;
+
+	if (!(preq->flags & REQ_F_LINK_HEAD))
+		return false;
+
+	list_for_each_entry(link, &preq->link_list, link_list) {
+		if (link != req)
+			continue;
+		io_poll_remove_one(preq);
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * We're looking to cancel 'req' because it's holding on to our files, but
+ * 'req' could be a link to another request. See if it is, and cancel that
+ * parent request if so.
+ */
+static void io_poll_remove_link(struct io_ring_ctx *ctx, struct io_kiocb *req)
+{
+	struct hlist_node *tmp;
+	struct io_kiocb *preq;
+	int i;
+
+	spin_lock_irq(&ctx->completion_lock);
+	for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
+		struct hlist_head *list;
+
+		list = &ctx->cancel_hash[i];
+		hlist_for_each_entry_safe(preq, tmp, list, hash_node) {
+			if (__io_poll_remove_link(preq, req))
+				break;
+		}
+	}
+	spin_unlock_irq(&ctx->completion_lock);
+}
+
 static void io_uring_cancel_files(struct io_ring_ctx *ctx,
 				  struct files_struct *files)
 {
@@ -7989,6 +8097,8 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
 			}
 		} else {
 			io_wq_cancel_work(ctx->io_wq, &cancel_req->work);
+			/* could be a link, check and remove if it is */
+			io_poll_remove_link(ctx, cancel_req);
 			io_put_req(cancel_req);
 		}
 

-- 
Jens Axboe



