Re: io_uring process termination/killing is not working

On 13/08/2020 02:32, Jens Axboe wrote:
> On 8/12/20 12:28 PM, Pavel Begunkov wrote:
>> On 12/08/2020 21:22, Pavel Begunkov wrote:
>>> On 12/08/2020 21:20, Pavel Begunkov wrote:
>>>> On 12/08/2020 21:05, Jens Axboe wrote:
>>>>> On 8/12/20 11:58 AM, Josef wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a weird issue on kernel 5.8.0/5.8.1, SIGINT even SIGKILL
>>>>>> doesn't work to kill this process(always state D or D+), literally I
>>>>>> have to terminate my VM because even the kernel can't kill the process
>>>>>> and no issue on 5.7.12-201, however if IOSQE_IO_LINK is not set, it
>>>>>> works
>>>>>>
>>>>>> I've attached a file to reproduce it
>>>>>> or here
>>>>>> https://gist.github.com/1Jo1/15cb3c63439d0c08e3589cfa98418b2c
>>>>>
>>>>> Thanks, I'll take a look at this. It's stuck in uninterruptible
>>>>> state, which is why you can't kill it.
>>>>
>>>> It looks like one of the hangs I've been talking about a few days ago,
>>>> an accept is inflight but can't be found by cancel_files() because it's
>>>> in a link.
>>>
>>> BTW, I described it a month ago, there were more details.
>>
>> https://lore.kernel.org/io-uring/34eb5e5a-8d37-0cae-be6c-c6ac4d85b5d4@xxxxxxxxx
> 
> Yeah I think you're right. How about something like the below? That'll
> potentially cancel more than just the one we're looking for, but seems
> kind of silly to only cancel from the file table holding request and to
> the end.

The bug is not poll/timeout related. IIRC, my test reproduces it with
read(pipe)->open(). See the previously sent link.

As mentioned, I'm going to patch that up, unless you beat me to it.
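
To make the failure mode concrete, here is a rough userspace model of it, not
kernel code. It assumes a simplified picture in which cancellation can only
look at directly tracked requests (the link heads), while the request that
pins the file table sits further down a link chain, as in read(pipe)->open().
`struct req`, `find_head_only()` and `find_via_links()` are hypothetical names
made up for this sketch; they do not exist in fs/io_uring.c.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a request. Only the two fields this sketch
 * needs are modelled; the real struct io_kiocb looks nothing like this.
 */
struct req {
	const void *files;	/* file table the request pins, NULL if none */
	struct req *next;	/* next request in the same link, NULL at the tail */
};

/*
 * Flat lookup: inspects only the link heads. A request that pins 'files'
 * but sits behind a head is never seen, so nothing gets cancelled.
 */
static struct req *find_head_only(struct req **heads, size_t n,
				  const void *files)
{
	for (size_t i = 0; i < n; i++)
		if (heads[i]->files == files)
			return heads[i];
	return NULL;
}

/*
 * Link-aware lookup: walks every chain and returns the head of the chain
 * that contains a request pinning 'files'. Cancelling that head is what
 * would let the rest of the link fail and drop its references.
 */
static struct req *find_via_links(struct req **heads, size_t n,
				  const void *files)
{
	for (size_t i = 0; i < n; i++)
		for (struct req *r = heads[i]; r; r = r->next)
			if (r->files == files)
				return heads[i];
	return NULL;
}
```

With a chain where the head is a read on a pipe that never completes and the
linked open holds the file table, the flat lookup finds nothing while the
link-aware one returns the read. This only illustrates why walking links
matters for any request type, not how the eventual fix should look.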

> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 8a2afd8c33c9..0630a9622baa 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4937,6 +5003,7 @@ static bool io_poll_remove_one(struct io_kiocb *req)
>  		io_cqring_fill_event(req, -ECANCELED);
>  		io_commit_cqring(req->ctx);
>  		req->flags |= REQ_F_COMP_LOCKED;
> +		req_set_fail_links(req);
>  		io_put_req(req);
>  	}
>  
> @@ -7935,6 +8002,47 @@ static bool io_wq_files_match(struct io_wq_work *work, void *data)
>  	return work->files == files;
>  }
>  
> +static bool __io_poll_remove_link(struct io_kiocb *preq, struct io_kiocb *req)
> +{
> +	struct io_kiocb *link;
> +
> +	if (!(preq->flags & REQ_F_LINK_HEAD))
> +		return false;
> +
> +	list_for_each_entry(link, &preq->link_list, link_list) {
> +		if (link != req)
> +			break;
> +		io_poll_remove_one(preq);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +/*
> + * We're looking to cancel 'req' because it's holding on to our files, but
> + * 'req' could be a link to another request. See if it is, and cancel that
> + * parent request if so.
> + */
> +static void io_poll_remove_link(struct io_ring_ctx *ctx, struct io_kiocb *req)
> +{
> +	struct hlist_node *tmp;
> +	struct io_kiocb *preq;
> +	int i;
> +
> +	spin_lock_irq(&ctx->completion_lock);
> +	for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
> +		struct hlist_head *list;
> +
> +		list = &ctx->cancel_hash[i];
> +		hlist_for_each_entry_safe(preq, tmp, list, hash_node) {
> +			if (__io_poll_remove_link(preq, req))
> +				break;
> +		}
> +	}
> +	spin_unlock_irq(&ctx->completion_lock);
> +}
> +
>  static void io_uring_cancel_files(struct io_ring_ctx *ctx,
>  				  struct files_struct *files)
>  {
> @@ -7989,6 +8097,8 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
>  			}
>  		} else {
>  			io_wq_cancel_work(ctx->io_wq, &cancel_req->work);
> +			/* could be a link, check and remove if it is */
> +			io_poll_remove_link(ctx, cancel_req);
>  			io_put_req(cancel_req);
>  		}
>  
> 

-- 
Pavel Begunkov


