On 2024-12-06 08:05, Simon Horman wrote:
> On Wed, Dec 04, 2024 at 09:21:48AM -0800, David Wei wrote:
>> From: David Wei <davidhwei@xxxxxxxx>
>>
>> Add a new object called an interface queue (ifq) that represents a net
>> rx queue that has been configured for zero copy. Each ifq is registered
>> using a new registration opcode IORING_REGISTER_ZCRX_IFQ.
>>
>> The refill queue is allocated by the kernel and mapped by userspace
>> using a new offset IORING_OFF_RQ_RING, in a similar fashion to the main
>> SQ/CQ. It is used by userspace to return buffers that it is done with,
>> which will then be re-used by the netdev again.
>>
>> The main CQ ring is used to notify userspace of received data by using
>> the upper 16 bytes of a big CQE as a new struct io_uring_zcrx_cqe. Each
>> entry contains the offset + len to the data.
>>
>> For now, each io_uring instance only has a single ifq.
>>
>> Signed-off-by: David Wei <dw@xxxxxxxxxxx>
>
> ...
>
>> diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
>
> ...
>
>> +int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
>> +			  struct io_uring_zcrx_ifq_reg __user *arg)
>> +{
>> +	struct io_uring_zcrx_ifq_reg reg;
>> +	struct io_uring_region_desc rd;
>> +	struct io_zcrx_ifq *ifq;
>> +	size_t ring_sz, rqes_sz;
>> +	int ret;
>> +
>> +	/*
>> +	 * 1. Interface queue allocation.
>> +	 * 2. It can observe data destined for sockets of other tasks.
>> +	 */
>> +	if (!capable(CAP_NET_ADMIN))
>> +		return -EPERM;
>> +
>> +	/* mandatory io_uring features for zc rx */
>> +	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN &&
>> +	      ctx->flags & IORING_SETUP_CQE32))
>> +		return -EINVAL;
>> +	if (ctx->ifq)
>> +		return -EBUSY;
>> +	if (copy_from_user(&reg, arg, sizeof(reg)))
>> +		return -EFAULT;
>> +	if (copy_from_user(&rd, u64_to_user_ptr(reg.region_ptr), sizeof(rd)))
>> +		return -EFAULT;
>> +	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
>> +		return -EINVAL;
>> +	if (reg.if_rxq == -1 || !reg.rq_entries || reg.flags)
>> +		return -EINVAL;
>> +	if (reg.rq_entries > IO_RQ_MAX_ENTRIES) {
>> +		if (!(ctx->flags & IORING_SETUP_CLAMP))
>> +			return -EINVAL;
>> +		reg.rq_entries = IO_RQ_MAX_ENTRIES;
>> +	}
>> +	reg.rq_entries = roundup_pow_of_two(reg.rq_entries);
>> +
>> +	if (!reg.area_ptr)
>> +		return -EFAULT;
>> +
>> +	ifq = io_zcrx_ifq_alloc(ctx);
>> +	if (!ifq)
>> +		return -ENOMEM;
>> +
>> +	ret = io_allocate_rbuf_ring(ifq, &reg, &rd);
>> +	if (ret)
>> +		goto err;
>> +
>> +	ifq->rq_entries = reg.rq_entries;
>> +	ifq->if_rxq = reg.if_rxq;
>> +
>> +	ring_sz = sizeof(struct io_uring);
>> +	rqes_sz = sizeof(struct io_uring_zcrx_rqe) * ifq->rq_entries;
>
> Hi David,
>
> A minor nit from my side: rqes_sz is set but otherwise unused in this
> function. Perhaps it can be removed?
>
> Flagged by W=1 builds.

Hi Simon, thanks for flagging this, I'll remove it in the next version.
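
For anyone following the quoted hunk, the rq_entries handling (reject zero,
clamp or reject oversized requests, then round up to a power of two) can be
sketched in plain userspace C roughly as below. This is an illustration only,
not code from the patch: normalize_rq_entries() and roundup_pow2_u32() are
hypothetical helpers, the `clamp` argument stands in for the
IORING_SETUP_CLAMP check, and the 32768 cap is an assumed value for
IO_RQ_MAX_ENTRIES.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cap for illustration; the real IO_RQ_MAX_ENTRIES lives in the patch. */
#define RQ_MAX_ENTRIES_ASSUMED 32768u

/* Userspace stand-in for the kernel's roundup_pow_of_two(); n must be nonzero. */
static uint32_t roundup_pow2_u32(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper mirroring the order of checks in the quoted hunk:
 * zero is rejected, an oversized request is either clamped (when the
 * caller opted in) or rejected, and the result is rounded up to a power
 * of two. Returns 0 on success or -EINVAL.
 */
static int normalize_rq_entries(uint32_t *entries, bool clamp)
{
	if (!*entries)
		return -EINVAL;
	if (*entries > RQ_MAX_ENTRIES_ASSUMED) {
		if (!clamp)
			return -EINVAL;
		*entries = RQ_MAX_ENTRIES_ASSUMED;
	}
	*entries = roundup_pow2_u32(*entries);
	return 0;
}
```

Under these assumptions a request for 100 entries would become 128, and a
request above the cap would fail unless clamping was requested.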