Re: [PATCH 8/8] io_uring: enable IORING_SETUP_ATTACH_WQ to attach to SQPOLL thread too

On 9/7/20 8:00 AM, Pavel Begunkov wrote:
> On 07/09/2020 11:56, Xiaoguang Wang wrote:
>> hi,
>>
>> First, thanks for this patchset :) I have some questions below:
>>
>> 1. Is a uring instance which has SQPOLL enabled, but does not specify
>> a valid cpu, meaningful in a real business product?
>> IMHO, in a real business product, cpu usage should be under strict
>> management. Take a uring instance which has SQPOLL enabled but is not
>> bound to a fixed cpu: its corresponding io_sq_thread could be scheduled
>> to any cpu (and re-scheduled many times), which may greatly impact
>> other applications through cpu contention, especially since the
>> io_sq_thread just busy-loops during its sq_thread_idle period. So in a
>> real business product, I wonder whether SQPOLL is only meaningful if
>> the user specifies a fixed cpu core.
> 
> It is meaningful for a part of the cases; for the rest you can set
> process affinities and pin each SQPOLL thread to a specific CPU, as
> you'd do even without this series. And that gives you more flexibility,
> IMHO.

Right, and once we have "any ctx can use this poll thread", then percpu
poll threads could be a subset of that. So there's no reason that kind
of functionality couldn't be layered on top of it.

You'd probably just need an extra SETUP flag for this. Xiaoguang, any
interest in attempting to layer that on top of the current code?
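
For reference, the pinning Pavel mentions is already expressible at
setup time. A minimal liburing sketch (error handling elided; the
function name, CPU number, and idle timeout are arbitrary example
values):

#include <liburing.h>

int setup_pinned_sqpoll_ring(struct io_uring *ring)
{
	struct io_uring_params p = { 0 };

	p.flags = IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF;
	p.sq_thread_cpu = 3;		/* bind io_sq_thread to CPU 3 */
	p.sq_thread_idle = 2000;	/* busy-poll for 2s before sleeping */

	/* 8 sq entries; the kernel sizes the cq ring accordingly */
	return io_uring_queue_init_params(8, ring, &p);
}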

>> 2. Is IORING_SETUP_ATTACH_WQ really convenient?
>> IORING_SETUP_ATTACH_WQ always needs to ensure a parent uring instance
>> exists, which means it also requires the app to regulate the creation
>> order of uring instances, which may not be convenient. Just imagine how
>> to integrate these new SQPOLL improvements into the fio utility.
> 
> It may not be so convenient, but it's flexible enough and prevents
> isolation-breaking issues. We've discussed that in the thread with
> your patches.

I think there's one minor issue there, and that is matching on just the
fd. That could cause rings from different processes to attach to the
wrong one. I think we need this on top, to match the file as well. This
is a bit different on the SQPOLL side vs the io-wq workqueue.
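
For context, the userspace side of the attach flow looks roughly like
this (a liburing sketch, error handling elided; the function name is
illustrative). Note that wq_fd is a plain integer, resolved in the
attaching process's own file table, which is why matching on the fd
value alone is ambiguous across processes:

#include <liburing.h>

int attach_to_ring(struct io_uring *parent, struct io_uring *child)
{
	struct io_uring_params p = { 0 };

	p.flags = IORING_SETUP_SQPOLL | IORING_SETUP_ATTACH_WQ;
	p.wq_fd = parent->ring_fd;	/* parent ring must already exist */

	return io_uring_queue_init_params(8, child, &p);
}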

Something like the below incremental should do it. It makes it clear
that we need to be able to get at this file, and doesn't risk falsely
attaching globally. As a plus, it gets rid of that global list too, and
makes it consistent with io-wq for ATTACH_WQ.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0490edfcdd88..4bd18e01ae89 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -232,10 +232,6 @@ struct io_restriction {
 struct io_sq_data {
 	refcount_t		refs;
 
-	/* global sqd lookup */
-	struct list_head	all_sqd_list;
-	int			attach_fd;
-
 	/* ctx's that are using this sqd */
 	struct list_head	ctx_list;
 	struct list_head	ctx_new_list;
@@ -245,9 +241,6 @@ struct io_sq_data {
 	struct wait_queue_head	wait;
 };
 
-static LIST_HEAD(sqd_list);
-static DEFINE_MUTEX(sqd_lock);
-
 struct io_ring_ctx {
 	struct {
 		struct percpu_ref	refs;
@@ -7034,32 +7027,37 @@ static void io_put_sq_data(struct io_sq_data *sqd)
 			kthread_stop(sqd->thread);
 		}
 
-		mutex_lock(&sqd_lock);
-		list_del(&sqd->all_sqd_list);
-		mutex_unlock(&sqd_lock);
-
 		kfree(sqd);
 	}
 }
 
 static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
 {
-	struct io_sq_data *sqd, *ret = ERR_PTR(-ENXIO);
+	struct io_ring_ctx *ctx_attach;
+	struct io_sq_data *sqd;
+	struct fd f;
 
-	mutex_lock(&sqd_lock);
-	list_for_each_entry(sqd, &sqd_list, all_sqd_list) {
-		if (sqd->attach_fd == p->wq_fd) {
-			refcount_inc(&sqd->refs);
-			ret = sqd;
-			break;
-		}
+	f = fdget(p->wq_fd);
+	if (!f.file)
+		return ERR_PTR(-ENXIO);
+	if (f.file->f_op != &io_uring_fops) {
+		fdput(f);
+		return ERR_PTR(-EINVAL);
 	}
-	mutex_unlock(&sqd_lock);
 
-	return ret;
+	ctx_attach = f.file->private_data;
+	sqd = ctx_attach->sq_data;
+	if (!sqd) {
+		fdput(f);
+		return ERR_PTR(-EINVAL);
+	}
+
+	refcount_inc(&sqd->refs);
+	fdput(f);
+	return sqd;
 }
 
-static struct io_sq_data *io_get_sq_data(struct io_uring_params *p, int ring_fd)
+static struct io_sq_data *io_get_sq_data(struct io_uring_params *p)
 {
 	struct io_sq_data *sqd;
 
@@ -7071,15 +7069,11 @@ static struct io_sq_data *io_get_sq_data(struct io_uring_params *p, int ring_fd)
 		return ERR_PTR(-ENOMEM);
 
 	refcount_set(&sqd->refs, 1);
-	sqd->attach_fd = ring_fd;
 	INIT_LIST_HEAD(&sqd->ctx_list);
 	INIT_LIST_HEAD(&sqd->ctx_new_list);
 	mutex_init(&sqd->ctx_lock);
 	init_waitqueue_head(&sqd->wait);
 
-	mutex_lock(&sqd_lock);
-	list_add_tail(&sqd->all_sqd_list, &sqd_list);
-	mutex_unlock(&sqd_lock);
 	return sqd;
 }
 
@@ -7746,7 +7740,7 @@ static int io_init_wq_offload(struct io_ring_ctx *ctx,
 }
 
 static int io_sq_offload_create(struct io_ring_ctx *ctx,
-				struct io_uring_params *p, int ring_fd)
+				struct io_uring_params *p)
 {
 	int ret;
 
@@ -7757,7 +7751,7 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 		if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
 			goto err;
 
-		sqd = io_get_sq_data(p, ring_fd);
+		sqd = io_get_sq_data(p);
 		if (IS_ERR(sqd)) {
 			ret = PTR_ERR(sqd);
 			goto err;
@@ -8972,7 +8966,7 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
 		goto err;
 	}
 
-	ret = io_sq_offload_create(ctx, p, fd);
+	ret = io_sq_offload_create(ctx, p);
 	if (ret)
 		goto err;
 

-- 
Jens Axboe



