On 23/02/2021 08:14, Stefan Metzmacher wrote:
> Am 22.02.21 um 21:14 schrieb Jens Axboe:
>> On 2/22/21 1:04 PM, Stefan Metzmacher wrote:
>> For example, you start the IO operation and you'll get a notification (eg IRQ) later on which allows
>> you to complete it.
>
> Yes, it's up to the implementation of uring_cmd() to do the processing and waiting
> in the background, based on timers, hardware events or whatever and finally call
> io_uring_cmd_done().
>
> But with this:
>
>         ret = file->f_op->uring_cmd(&req->uring_cmd, issue_flags);
>         /* queued async, consumer will call io_uring_cmd_done() when complete */
>         if (ret == -EIOCBQUEUED)
>                 return 0;
>         io_uring_cmd_done(&req->uring_cmd, ret);
>         return 0;
>
> I don't see where -EAGAIN would trigger a retry in a io-wq worker context.
> Isn't -EAGAIN exposed to the cqe. Similar to ret == -EAGAIN && req->flags & REQ_F_NOWAIT.

        if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
                return -EAGAIN;

Yes, something like that. Apparently it was just forgotten.

>>> It's also not clear if IOSQE_ASYNC should have any impact.
>>
>> Handler doesn't need to care about that, it'll just mean that the
>> initial queue attempt will not have IO_URING_F_NONBLOCK set.
>
> Ok, because it's done from the io-wq worker, correct?

Currently, the only case where IO_URING_F_NONBLOCK isn't set is execution
from io-wq, though that may change for any reason. Actually, an ASYNC request
may still get executed with IO_URING_F_NONBLOCK, but handlers should behave
sanely either way.

>> So tldr here is that 1+2 is already there, and 3 not being fixed leaves
>> us no different than the existing support for cancelation. IOW, I don't
>> think this is an initial requirement, support can always be expanded
>> later.
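
To make the return-code handling discussed here concrete, below is a rough
user-space mock, not kernel code. Everything prefixed mock_/MOCK_ is a
hypothetical stand-in (including the numeric values), and the two handlers
are made up for illustration. It only models the three outcomes: punt on
-EAGAIN under nonblocking issue, leave the request alone on -EIOCBQUEUED,
and complete inline otherwise.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for kernel definitions; names and values are illustrative only. */
#define MOCK_EIOCBQUEUED 529 /* kernel-internal code, not in userspace <errno.h> */
#define MOCK_F_NONBLOCK  1u  /* plays the role of IO_URING_F_NONBLOCK */

struct mock_cmd {
	int (*handler)(struct mock_cmd *cmd, unsigned int issue_flags);
	int completed; /* set once a completion has been posted */
	int result;    /* value that would end up in the CQE */
};

/* Plays the role of io_uring_cmd_done(): post exactly one completion. */
void mock_cmd_done(struct mock_cmd *cmd, int ret)
{
	assert(!cmd->completed);
	cmd->completed = 1;
	cmd->result = ret;
}

/* Issue path: the -EAGAIN branch is the one that was missing above. */
int mock_issue(struct mock_cmd *cmd, unsigned int issue_flags)
{
	int ret = cmd->handler(cmd, issue_flags);

	/* nonblocking attempt failed: let the caller retry from a worker */
	if (ret == -EAGAIN && (issue_flags & MOCK_F_NONBLOCK))
		return -EAGAIN;
	/* queued async: the handler will call mock_cmd_done() later */
	if (ret == -MOCK_EIOCBQUEUED)
		return 0;
	mock_cmd_done(cmd, ret);
	return 0;
}

/* Hypothetical handler: refuses to block, succeeds when blocking is allowed. */
int would_block_handler(struct mock_cmd *cmd, unsigned int issue_flags)
{
	(void)cmd;
	return (issue_flags & MOCK_F_NONBLOCK) ? -EAGAIN : 42;
}

/* Hypothetical handler: always defers completion to a later event. */
int queued_handler(struct mock_cmd *cmd, unsigned int issue_flags)
{
	(void)cmd;
	(void)issue_flags;
	return -MOCK_EIOCBQUEUED;
}
```

With would_block_handler, a first mock_issue() with MOCK_F_NONBLOCK returns
-EAGAIN without posting a completion, and a second call with flags 0
completes the command with 42. Note that an -EAGAIN returned from a blocking
attempt is simply completed with that error in this sketch.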
>
> Given that you list 2) here again, I get the impression that the logic should be:
>
>         ret = file->f_op->uring_cmd(&req->uring_cmd, issue_flags);
>         /* reschedule in io-wq worker again */
>         if (ret == -EAGAIN)
>                 return ret;

Yes, kind of

>         /* queued async, consumer will call io_uring_cmd_done() when complete */
>         if (ret == -EIOCBQUEUED)
>                 return 0;
>         io_uring_cmd_done(&req->uring_cmd, ret);
>         return 0;
>
> With that the above would make sense and seems to make the whole design more flexible
> for the uring_cmd implementers.
>
> However my primary use case would be using the -EIOCBQUEUED logic.
> And I think it would be good to have IORING_OP_ASYNC_CANCEL logic in place for that,
> as it would simplify the userspace logic to single io_uring_opcode_supported(IO_OP_URING_CMD).
>
> I also noticed that some sendmsg/recvmsg implementations support -EIOCBQUEUED, e.g. _aead_recvmsg(),
> I guess it would be nice to support that for IORING_OP_SENDMSG and IORING_OP_RECVMSG as well.
> It uses struct kiocb and iocb->ki_complete().

It's just crypto stuff. IMHO, unless something more useful like TCP/UDP
sockets starts supporting it, it isn't worth it.

>
> Would it make sense to also use struct kiocb and iocb->ki_complete() instead of
> a custom io_uring_cmd_done()?
>
> Maybe it would be possible to also have a common way to cancel an struct kiocb request...
>
>>>> Since we just need that one branch in req init, I do think that your
>>>> suggestion of just modifying io_uring_sqe is the way to go. So that's
>>>> what the above branch does.
>>>
>>> Thanks! I think it's much easier to handle the personality logic in
>>> the core only.
>>>
>>> For fixed files or fixed buffers I think helper functions like this:
>>>
>>> struct file *io_uring_cmd_get_file(struct io_uring_cmd *cmd, int fd, bool fixed);
>>>
>>> And similar functions for io_buffer_select or io_import_fixed.
>>
>> I did end up retaining that, at least in its current state it's like you
>> proposed.
>> Only change is some packing on that very union, which should
>> not be necessary, but due to fun arm reasons it is.
>
> I noticed that thanks!
>
> Do you also think a io_uring_cmd_get_file() would be useful?
> My uring_cmd() implementation would need a 2nd struct file in order to
> do something similar to a splice operation. And it might be good to
> allow also fixed files to be used.
>
> Referencing fixed buffer may also be useful, I'm not 100% sure I'll need them,
> but it would be good to be flexible and prototype various solutions.

Right, there should be a set of APIs for different purposes, including
getting resources. It should also be better integrated into the
prep/cleanup/etc. bits.

Even more, I think the approach should be modernised into a two-step scheme:

1. register your fd (or classes of fds) for further uring_cmd() in io_uring ctx,
2. use those pre-registered fds/state for actually issuing a command.

That would open the way for pre-allocating memory in advance, pre-backing some
stuff like iommu mappings, iovec/bvecs in specific cases, pinning pages, and so on.
It would also get rid of virtual call chaining, e.g.

->uring_cmd() { blk_mq->uring_blk_mq() { ... } }

We will just get the final callback in the state.
Hopefully, I'll get around to describing it in detail or even hacking it up.

-- 
Pavel Begunkov
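
For what it's worth, the two-step idea above could look roughly like the
following user-space mock. This is only a sketch under assumptions: none of
these names exist in io_uring, the slot table and its size are invented, and
a plain calloc() buffer stands in for whatever would really be pre-allocated
or pinned. The point it illustrates is that registration stores the final
callback together with its state, so issuing a command is a single indirect
call rather than a chain of them.

```c
#include <stddef.h>
#include <stdlib.h>

#define MOCK_MAX_REGISTERED 8 /* arbitrary table size for the sketch */

/* The "final callback": what would otherwise sit at the end of the call chain. */
typedef int (*mock_final_cb)(void *prealloc, size_t len, int arg);

struct mock_cmd_state {
	mock_final_cb cb; /* resolved once, at registration time */
	void *prealloc;   /* memory set up in advance */
	size_t len;
};

struct mock_ctx {
	struct mock_cmd_state slots[MOCK_MAX_REGISTERED];
	int used[MOCK_MAX_REGISTERED];
};

/* Step 1: register callback plus pre-allocated state; returns a slot or -1. */
int mock_register(struct mock_ctx *ctx, mock_final_cb cb, size_t len)
{
	if (!cb || len == 0)
		return -1;
	for (int i = 0; i < MOCK_MAX_REGISTERED; i++) {
		if (ctx->used[i])
			continue;
		void *buf = calloc(1, len);
		if (!buf)
			return -1;
		ctx->slots[i] = (struct mock_cmd_state){ .cb = cb, .prealloc = buf, .len = len };
		ctx->used[i] = 1;
		return i;
	}
	return -1; /* table full */
}

/* Step 2: issue against a registered slot; one indirect call, no allocation. */
int mock_issue_registered(struct mock_ctx *ctx, int slot, int arg)
{
	if (slot < 0 || slot >= MOCK_MAX_REGISTERED || !ctx->used[slot])
		return -1;
	struct mock_cmd_state *st = &ctx->slots[slot];
	return st->cb(st->prealloc, st->len, arg);
}

/* Tear down a slot and release its pre-allocated memory. */
void mock_unregister(struct mock_ctx *ctx, int slot)
{
	if (slot < 0 || slot >= MOCK_MAX_REGISTERED || !ctx->used[slot])
		return;
	free(ctx->slots[slot].prealloc);
	ctx->used[slot] = 0;
}

/* Hypothetical final callback, only here to make the sketch observable. */
int add_len_cb(void *prealloc, size_t len, int arg)
{
	(void)prealloc;
	return arg + (int)len;
}
```

Usage would be: zero-initialise a struct mock_ctx, call mock_register() once,
then call mock_issue_registered() with the returned slot for each command.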