Re: [PATCH v2] io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval

Jens Axboe <axboe@xxxxxxxxx> writes:

> A previous commit added a trylock for getting the SQPOLL thread info via
> fdinfo, but this introduced a regression where we often fail to get it if
> the thread is busy. For that case, we end up not printing the current CPU
> and PID info.
>
> Rather than rely on this lock, just print the pid we already stored in
> the io_sq_data struct, and ensure we update the current CPU every time we
> are going to sleep. The latter potentially won't be 100% accurate, but
> that wasn't the case before either, as the task can get migrated at any
> time unless it was pinned to a CPU at creation time.
>
> We retain the io_sq_data dereference inside ctx->uring_lock, as it has
> always been, since destruction of the thread and data happens below that.
> We could make this RCU safe, but there's little point in doing so.
>
> With this, we always print the last valid information we had, rather than
> have spurious outputs with missing information.
>
> Fixes: 7644b1a1c9a7 ("io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid")
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> ---
>
> v2: actually remember to use the cached values... also update ->sq_cpu
>     when we initially set it up, if it's not pinned to a given CPU.
>
> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
> index f04a43044d91..976e9500f651 100644
> --- a/io_uring/fdinfo.c
> +++ b/io_uring/fdinfo.c
> @@ -145,13 +145,8 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
>  	if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
>  		struct io_sq_data *sq = ctx->sq_data;
>  
> -		if (mutex_trylock(&sq->lock)) {
> -			if (sq->thread) {
> -				sq_pid = task_pid_nr(sq->thread);
> -				sq_cpu = task_cpu(sq->thread);
> -			}
> -			mutex_unlock(&sq->lock);
> -		}
> +		sq_pid = sq->task_pid;
> +		sq_cpu = sq->sq_cpu;
>  	}
>  
>  	seq_printf(m, "SqThread:\t%d\n", sq_pid);
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index bd6c2c7959a5..ecb00322a4e5 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -229,10 +229,12 @@ static int io_sq_thread(void *data)
>  	snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
>  	set_task_comm(current, buf);
>  
> -	if (sqd->sq_cpu != -1)
> +	if (sqd->sq_cpu != -1) {
>  		set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
> -	else
> +	} else {
>  		set_cpus_allowed_ptr(current, cpu_online_mask);
> +		sqd->sq_cpu = task_cpu(current);
> +	}
>  
>  	mutex_lock(&sqd->lock);
>  	while (1) {
> @@ -291,6 +293,7 @@ static int io_sq_thread(void *data)
>  			}
>  
>  			if (needs_sched) {
> +				sqd->sq_cpu = task_cpu(current);

Don't you also need to update sqd->sq_cpu in io_sqd_handle_event before
releasing the lock?  sqpoll might get migrated after the following
schedule and then parked, in which case sqd->sq_cpu will not be up to
date for a while.

Other than that, I think the patch is fine.

>  				mutex_unlock(&sqd->lock);
>  				schedule();
>  				mutex_lock(&sqd->lock);
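For concreteness, the update I'm suggesting would presumably be a one-liner
in io_sqd_handle_event(), recording the CPU before the lock is dropped for
the park/signal window. Something along these lines (context lines are
approximate and this is untested):

```diff
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ static bool io_sqd_handle_event(struct io_sq_data *sqd)
 	if (test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) ||
 	    signal_pending(current)) {
+		sqd->sq_cpu = task_cpu(current);
 		mutex_unlock(&sqd->lock);
```

That way fdinfo reports the CPU the thread last ran on even while it sits
parked.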

-- 
Gabriel Krisman Bertazi



