On Mon, 24 Jul 2023 at 11:05, Jens Axboe <axboe@xxxxxxxxx> wrote: > > io_uring never does that isn't the original user space creator task, or > from the io-wq workers that it may create. Those are _always_ normal > threads. There's no workqueue/kthread usage for IO or file > getting/putting/installing/removing etc. That's what I thought. But Christian's comments about the io_uring getdents work made me worry. If io_uring does everything right, then the old "file_count(file) > 1" test in __fdget_pos() - now sadly removed - should work just fine for any io_uring directory handling. It may not be obvious when you look at just that test in a vacuum, but it happens after __fdget_pos() has done the unsigned long v = __fdget(fd); struct file *file = (struct file *)(v & ~3); and if it's a threaded app - where io_uring threads count the same - then the __fdget() in that sequence will have incremented the file count. So when it then used to do that if (file_count(file) > 1) { that "> 1" included not just all the references from dup() etc, but for any threaded use where we have multiple references to the file table it also that reference we get from __fdget() -> __fget() -> __fget_files() -> __fget_files_rcu() -> get_file_rcu() -> atomic_long_inc_not_zero(&(x)->f_count)). And yes, that's a fairly long chain, through some subtle code. Doing "fdget()" (and variations) is not trivial. Of course, with directory handling, one of the things people have wanted is to have a "pread()" like model, where you have a position that is given externally and not tied to the 'struct file'. We don't have such a thing, and I don't think we can have one, because I think filesystems also end up having possibly other private data in the 'struct file' (trivially: the directory cursor for dcache_dir_open()). And other filesystems have it per-inode, which is why we have that "iterate_shared" vs "iterate" horror. So any dentry thing that wants to do things in parallel needs to open their own private 'file' just for *that* reason. And even then they will fail if the filesystem doesn't have that shared directory entry iterator. Linus