On Thu, Dec 05, 2024 at 03:43:41PM +0100, Mateusz Guzik wrote:
> On Thu, Dec 5, 2024 at 3:18 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Dec 05, 2024 at 01:03:32PM +0100, Mateusz Guzik wrote:
> > >  void fd_install(unsigned int fd, struct file *file)
> > >  {
> > > -	struct files_struct *files = current->files;
> > > +	struct files_struct *files;
> > >  	struct fdtable *fdt;
> > >
> > >  	if (WARN_ON_ONCE(unlikely(file->f_mode & FMODE_BACKING)))
> > >  		return;
> > >
> > > +	/*
> > > +	 * Synchronized with expand_fdtable(), see that routine for an
> > > +	 * explanation.
> > > +	 */
> > >  	rcu_read_lock_sched();
> > > +	files = READ_ONCE(current->files);
> >
> > What are you trying to do with that READ_ONCE()? current->files
> > itself is *not* changed by any of that code; current->files->fdtab is.
>
> To my understanding this is the idiomatic way of spelling out the
> non-existent in Linux smp_consume_load, for the resize_in_progress
> flag.

In Linux, "smp_consume_load()" is named rcu_dereference().

> Anyway to elaborate I'm gunning for a setup where the code is
> semantically equivalent to having a lock around the work.

Except that rcu_read_lock_sched() provides mutual-exclusion guarantees
only with later RCU grace periods, such as those implemented by
synchronize_rcu().

> Pretend ->resize_lock exists, then:
> fd_install:
> files = current->files;
> read_lock(files->resize_lock);
> fdt = rcu_dereference_sched(files->fdt);
> rcu_assign_pointer(fdt->fd[fd], file);
> read_unlock(files->resize_lock);
>
> expand_fdtable:
> write_lock(files->resize_lock);
> [snip]
> rcu_assign_pointer(files->fdt, new_fdt);
> write_unlock(files->resize_lock);
>
> Except rcu_read_lock_sched + appropriately fenced resize_in_progress +
> synchronize_rcu do it.

OK, good, you did get the grace-period part of the puzzle.

However, please keep in mind that synchronize_rcu() has significant
latency by design. There is a tradeoff between CPU consumption and
latency, and synchronize_rcu() therefore has latencies ranging upwards
of several milliseconds (not microseconds or nanoseconds). I would be
very surprised if expand_fdtable() users would be happy with such a
long delay. Or are you using some trick to hide this delay?