On 31/03/2021 16:59, Xiaoguang Wang wrote:
> We have already supported multiple rings to share one same poll thread
> by passing IORING_SETUP_ATTACH_WQ, but it's not that convenient to use.
> IORING_SETUP_ATTACH_WQ needs users to ensure that a parent ring instance
> has been created firstly, that means it will require app to regulate the
> creation order between uring instances.
>
> Currently we can make this a bit simpler, for those rings which will
> have SQPOLL enabled and are willing to be bound to one same cpu, add a
> capability that these rings can share one poll thread by specifying
> a new IORING_SETUP_SQPOLL_PERCPU flag, then we have 3 cases
>   1, IORING_SETUP_ATTACH_WQ: if user specifies this flag, we'll always
> try to attach this ring to an existing ring's corresponding poll thread,
> no matter whether IORING_SETUP_SQ_AFF or IORING_SETUP_SQPOLL_PERCPU is
> set.
>   2, IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both enabled,
> for this case, we'll create a single poll thread to be shared by
> rings which have same sq_thread_cpu.
>   3, for any other cases, we'll just create one new poll thread for the
> corresponding ring.
>
> And for case 2, don't need to regulate creation order of multiple uring
> instances, we use a mutex to synchronize creation, for example, say five
> rings which all have IORING_SETUP_SQ_AFF & IORING_SETUP_SQPOLL_PERCPU
> enabled, and are willing to be bound same cpu, one ring that gets the
> mutex lock will create one poll thread, the other four rings will just
> attach themselves to the previous created poll thread once they get lock
> successfully.
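
The three cases above boil down to a precedence check on the setup flags.
A rough sketch of that ordering, assuming stand-in flag bits (the SKETCH_*
constants are hypothetical and are not the real uapi values;
IORING_SETUP_SQPOLL_PERCPU exists only in this patch) and a hypothetical
pick_policy() helper:

```cpp
#include <cstdint>

// Stand-in bits for illustration only, not the values from <linux/io_uring.h>.
constexpr std::uint32_t SKETCH_ATTACH_WQ     = 1u << 0;
constexpr std::uint32_t SKETCH_SQ_AFF        = 1u << 1;
constexpr std::uint32_t SKETCH_SQPOLL_PERCPU = 1u << 2;

enum class SqThreadPolicy { AttachToExisting, SharePerCpu, CreateNew };

// Hypothetical helper: maps setup flags to one of the three cases.
SqThreadPolicy pick_policy(std::uint32_t flags)
{
    // Case 1: ATTACH_WQ always wins, whatever else is set.
    if (flags & SKETCH_ATTACH_WQ)
        return SqThreadPolicy::AttachToExisting;

    // Case 2: both SQ_AFF and SQPOLL_PERCPU are required for per-cpu sharing.
    constexpr std::uint32_t percpu = SKETCH_SQ_AFF | SKETCH_SQPOLL_PERCPU;
    if ((flags & percpu) == percpu)
        return SqThreadPolicy::SharePerCpu;

    // Case 3: anything else gets its own poll thread.
    return SqThreadPolicy::CreateNew;
}
```

Only the ordering matters here: setting one of SQ_AFF or SQPOLL_PERCPU
without the other falls through to case 3.
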
>
> To implement above function, define below data structs:
>   struct percpu_sqd_entry {
>         struct list_head        node;
>         struct io_sq_data       *sqd;
>         pid_t                   tgid;
>   };
>
>   struct percpu_sqd_list {
>         struct list_head        head;
>         struct mutex            lock;
>   };
>
>   static struct percpu_sqd_list __percpu *percpu_sqd_list;
>
> sqthreads that have same sq_thread_cpu will be linked together in a percpu
> percpu_sqd_list's head. When IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU
> are both enabled, we will use struct io_uring_params' sq_thread_cpu and
> current-tgid locate corresponding sqd.

I can't help but wonder: why not do this in userspace, with something like
the pseudo-coded snippet below?

BTW, I don't think "pid_t tgid" will work with namespaces/cgroups.

#include <cerrno>
#include <mutex>
#include <set>
#include <vector>
#include <liburing.h>

/* rings that share an SQPOLL thread, indexed by sq_thread_cpu */
static std::vector<std::set<struct io_uring *>> percpu_rings;
static std::mutex lock;

int io_uring_queue_init_params_percpu(unsigned entries, struct io_uring *ring,
				      struct io_uring_params *p)
{
	unsigned int cpu = p->sq_thread_cpu;
	std::unique_lock<std::mutex> guard(lock);

	if (!(p->flags & IORING_SETUP_SQPOLL))
		return -EINVAL;

	if (percpu_rings.size() <= cpu)
		percpu_rings.resize(cpu + 1);

	/* ATTACH_WQ is decided here, not by the caller */
	p->flags &= ~IORING_SETUP_ATTACH_WQ;
	if (!percpu_rings[cpu].empty()) {
		/* attach to the poll thread of a ring already on this cpu */
		struct io_uring *shared_ring = *percpu_rings[cpu].begin();

		p->wq_fd = shared_ring->ring_fd;
		p->flags |= IORING_SETUP_ATTACH_WQ;
	}

	int ret = io_uring_queue_init_params(entries, ring, p);
	if (!ret)
		percpu_rings[cpu].insert(ring);
	return ret;
}

void io_uring_queue_exit_percpu(struct io_uring *ring)
{
	std::unique_lock<std::mutex> guard(lock);

	for (auto &cpu_set : percpu_rings)
		if (cpu_set.erase(ring))
			break;

	/* drop the lock before tearing the ring down */
	guard.unlock();
	io_uring_queue_exit(ring);
}

-- 
Pavel Begunkov