On 6/18/22 7:09 AM, Hao Xu wrote: > On 6/18/22 20:50, Jens Axboe wrote: >> On 6/18/22 6:47 AM, Hao Xu wrote: >>> On 6/18/22 19:34, Jens Axboe wrote: >>>> On 6/18/22 5:02 AM, Hao Xu wrote: >>>>> On 6/17/22 21:45, Jens Axboe wrote: >>>>>> With IORING_OP_MSG_RING, one ring can send a message to another ring. >>>>>> Extend that support to also allow sending a fixed file descriptor to >>>>>> that ring, enabling one ring to pass a registered descriptor to another >>>>>> one. >>>>>> >>>>>> Arguments are extended to pass in: >>>>>> >>>>>> sqe->addr3 fixed file slot in source ring >>>>>> sqe->file_index fixed file slot in destination ring >>>>>> >>>>>> IORING_OP_MSG_RING is extended to take a command argument in sqe->addr. >>>>>> If set to zero (or IORING_MSG_DATA), it sends just a message like before. >>>>>> If set to IORING_MSG_SEND_FD, a fixed file descriptor is sent according >>>>>> to the above arguments. >>>>>> >>>>>> Undecided: >>>>>> - Should we post a cqe with the send, or require that the sender >>>>>> just link a separate IORING_OP_MSG_RING? This makes error >>>>>> handling easier, as we cannot easily retract the installed >>>>>> file descriptor if the target CQ ring is full. Right now we do >>>>>> fill a CQE. If the request completes with -EOVERFLOW, then the >>>>>> sender must re-send a CQE if the target must get notified. >>>>> >>>>> Hi Jens, >>>>> Since we are have open/accept direct feature, this may be useful. But I >>>>> just can't think of a real case that people use two rings and need to do >>>>> operations to same fd. >>>> >>>> The two cases that people bring up as missing for direct descriptors >>>> that you can currently do with a real fd is: >>>> >>>> 1) Server needs to be shutdown or restarted, pass file descriptors to >>>> another onei >>>> >>>> 2) Backend is split, and one accepts connections, while others then get >>>> the fd passed and handle the actual connection. >>>> >>>> Both of those are classic SCM_RIGHTS use cases, and it's not possible to >>>> support them with direct descriptors today. >>> >>> I see, thanks for detail explanation. >> >> I should put that in the commit message in fact. Will do so. >> >>>>> Assume there are real cases, then filling a cqe is necessary since users >>>>> need to first make sure the desired fd is registered before doing >>>>> something to it. >>>> >>>> Right, my quesion here was really whether it should be bundled with the >>>> IORING_MSG_SEND_FD operation, or whether the issuer of that should also >>>> be responsible for then posting a "normal" IORING_OP_MSG_SEND to the >>>> target ring to notify it if the fact that an fd has been sent to it. >>>> >>>> If the operation is split like the latter, then it makes the error >>>> handling a bit easier as we eliminate one failing part of the existing >>>> MSG_SEND_FD. >>>> >>>> You could then also pass a number of descriptors and then post a single >>>> OP_MSG_SEND with some data that tells you which descriptors were passed. > > [1] > >>>> >>>> For the basic use case of just passing a single descriptor, what the >>>> code currently does is probably the sanest approach - send the fd, post >>>> a cqe. > > I think it's fine to keep it like this, since we can achieve [1] by a > GROUP_DELIVER flag and set cqe_skip flag for send msg request when it > turns out [1] is indeed necessary. The expected use case is probably CQE_SKIP for using this, as the sender doesn't care about being notified about a successful send. But for the target CQE, we'd then need to either have CQE_SKIP implying that we should skip CQE delivery there too, or we'd need to add an IORING_OP_MSG_RING flag for that. I think the latter is the cleaner approach, and it would indeed then allow both use cases. If you're sending a bunch of fds and would prefer to notify with a single OP_MSG_RING when they are done, then you'd set that OP_MSG_RING flag that says "don't post a CQE to the target". Hence my proposal would be to keep the CQE delivery by default as it stands in the patch, and add a flag for controlling whether or not OP_MSG_RING with MSG_SEND posts a CQE to the target ring or not. Agree? -- Jens Axboe