On 8/13/21 8:00 PM, Josh Triplett wrote: > On Fri, Aug 13, 2021 at 05:43:09PM +0100, Pavel Begunkov wrote: >> Add an optional feature to open/accept directly into io_uring's fixed >> file table bypassing the normal file table. Same behaviour if as the >> snippet below, but in one operation: >> >> sqe = prep_[open,accept](...); >> cqe = submit_and_wait(sqe); >> // error handling >> io_uring_register_files_update(uring_idx, (fd = cqe->res)); >> // optionally >> close((fd = cqe->res)); >> >> The idea in pretty old, and was brough up and implemented a year ago >> by Josh Triplett, though haven't sought the light for some reasons. > > Thank you for working to get this over the finish line! > >> Tested on basic cases, will be sent out as liburing patches later. >> >> A copy paste from 2/2 describing user API and some notes: >> >> The behaviour is controlled by setting sqe->file_index, where 0 implies >> the old behaviour. If non-zero value is specified, then it will behave >> as described and place the file into a fixed file slot >> sqe->file_index - 1. A file table should be already created, the slot >> should be valid and empty, otherwise the operation will fail. >> >> Note 1: we can't use IOSQE_FIXED_FILE to switch between modes, because >> accept takes a file, and it already uses the flag with a different >> meaning. >> >> Note 2: it's u16, where in theory the limit for fixed file tables might >> get increased in the future. If would ever happen so, we'll better >> workaround later, e.g. by making ioprio to represent upper bits 16 bits. >> The layout for open is tight already enough. > > Rather than using sqe->file_index - 1, which feels like an error-prone > interface, I think it makes sense to use a dedicated flag for this, like > IOSQE_OPEN_FIXED. That flag could work for any open-like operation, > including open, accept, and in the future many other operations such as > memfd_create. (Imagine using a single ring submission to open a memfd, > write a buffer into it, seal it, send it over a UNIX socket, and then > close it.) > > The only downside is that you'll need to reject that flag in all > non-open operations. One way to unify that code might be to add a flag > in io_op_def for open-like operations, and then check in common code for > the case of non-open-like operations passing IOSQE_OPEN_FIXED. io_uring is really thin, and so I absolutely don't want any extra overhead in the generic path, IOW anything affecting reads/writes/sends/recvs. The other reason is that there are only 2 bits left in sqe->flags, and we may use them for something better, considering that it's only open/accept and not much as this. I agree that it feels error-prone, but at least it can be wrapped nicely enough in liburing, e.g. void io_uring_prep_openat_direct(struct io_uring_sqe *sqe, int dfd, const char *path, int flags, mode_t mode, int slot_idx); > Also, rather than using a 16-bit index for the fixed file table and > potentially requiring expansion into a different field in the future, > what about overlapping it with the nofile field in the open and accept > requests? If they're not opening a normal file descriptor, they don't > need nofile. And in the original sqe, you can then overlap it with a > 32-bit field like splice_fd_in. There is no nofile in SQEs, though req->open.nofile = rlimit(RLIMIT_NOFILE); > EEXIST seems like the wrong error-code to use if the index is already in > use; open can already return EEXIST if you pass O_EXCL. How about EBADF, > or better yet EBADSLT which is unlikely to be returned for any other > reason? Sure, sounds better indeed! -- Pavel Begunkov