On Thu, Mar 13, 2025 at 1:32 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>
> ... except when the table is known to be only used by one thread.
>
> A file pointer can get installed at any moment despite the ->file_lock
> being held since the following:
> 8a81252b774b53e6 ("fs/file.c: don't acquire files->file_lock in fd_install()")
>
> Accesses subject to such a race can in principle suffer load tearing.
>
> While here redo the comment in dup_fd() as it only covered a race against
> files showing up, still assuming fd_install() takes the lock.
>
> Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
> ---
>
> I confirmed the possibility of the problem with this:
> https://lwn.net/Articles/793253/#Load%20Tearing
>
> Granted, the article being 6 years old might mean some magic was added
> by now to prevent this particular problem.
>
> While technically this classifies as a bugfix, given that nothing blew
> up and this is more of a "just in case" change, I don't think this
> warrants any backports. Thus I'm not adding a Fixes: tag to prevent this
> from being picked by autosel.
>
>  fs/file.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/fs/file.c b/fs/file.c
> index 6c159ede55f1..52010ecb27b8 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -423,17 +423,25 @@ struct files_struct *dup_fd(struct files_struct *oldf, struct fd_range *punch_ho
>         old_fds = old_fdt->fd;
>         new_fds = new_fdt->fd;
>
> +       /*
> +        * We may be racing against fd allocation from other threads using this
> +        * files_struct, despite holding ->file_lock.
> +        *
> +        * alloc_fd() might have already claimed a slot, while fd_install()
> +        * did not populate it yet. Note the latter operates locklessly, so
> +        * the file can show up as we are walking the array below.
> +        *
> +        * At the same time we know no files will disappear as all other
> +        * operations take the lock.
> +        *
> +        * Instead of trying to placate userspace racing with itself, we
> +        * ref the file if we see it and mark the fd slot as unused otherwise.
> +        */
>         for (i = open_files; i != 0; i--) {
> -               struct file *f = *old_fds++;
> +               struct file *f = rcu_access_pointer(*old_fds++);

sigh, that happens to work but is technically bogus -- I thought I did
rcu_dereference, but instead had rcu_access_pointer in my fingers from
the assert thing. Thanks to Mathieu for noticing.

That is to say the patch has to s/rcu_access_pointer/rcu_dereference/.

However, willy suggested also adding the check. So perhaps this can
instead use the _check variant with lockdep_is_held(&files->file_lock)
as the argument.

I don't have an opinion on this bit -- the accesses are next to the
lock acquire, so perhaps this only serves as an uglifier.

That said, if you want the assert, I'll post a v2. Otherwise please
run the sed :->

>                 if (f) {
>                         get_file(f);
>                 } else {
> -                       /*
> -                        * The fd may be claimed in the fd bitmap but not yet
> -                        * instantiated in the files array if a sibling thread
> -                        * is partway through open(). So make sure that this
> -                        * fd is available to the new process.
> -                        */
>                         __clear_open_fd(open_files - i, new_fdt);
>                 }
>                 rcu_assign_pointer(*new_fds++, f);
> @@ -684,7 +692,7 @@ struct file *file_close_fd_locked(struct files_struct *files, unsigned fd)
>                 return NULL;
>
>         fd = array_index_nospec(fd, fdt->max_fds);
> -       file = fdt->fd[fd];
> +       file = rcu_access_pointer(fdt->fd[fd]);
>         if (file) {
>                 rcu_assign_pointer(fdt->fd[fd], NULL);
>                 __put_unused_fd(files, fd);
> @@ -1252,7 +1260,7 @@ __releases(&files->file_lock)
>          */
>         fdt = files_fdtable(files);
>         fd = array_index_nospec(fd, fdt->max_fds);
> -       tofree = fdt->fd[fd];
> +       tofree = rcu_access_pointer(fdt->fd[fd]);
>         if (!tofree && fd_is_open(fd, fdt))
>                 goto Ebusy;
>         get_file(file);
> --
> 2.43.0
>

--
Mateusz Guzik <mjguzik gmail.com>
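
For illustration only, here is a rough user-space sketch of what the _check
variant would buy: a single untorn load of the slot plus an assertion that
the caller holds the lock. Nothing below is kernel code. struct table,
slot_deref_check(), slot_install() and copy_slots() are hypothetical
stand-ins, the lock_held flag merely imitates what lockdep_is_held() would
report, and C11 acquire/release atomics are swapped in for the kernel's
rcu_dereference()/rcu_assign_pointer().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file; only the refcount matters here. */
struct file {
	int refcount;
};

/* Hypothetical stand-in for the fd table plus its lock state. */
struct table {
	bool lock_held;			/* imitates lockdep_is_held() */
	_Atomic(struct file *) fd[4];	/* slots a lockless installer may fill */
};

/*
 * Assumed analogue of rcu_dereference_check(): one atomic (untorn) load
 * of the slot, plus a check that the caller holds the lock.
 */
static struct file *slot_deref_check(struct table *t, size_t i)
{
	assert(t->lock_held);
	return atomic_load_explicit(&t->fd[i], memory_order_acquire);
}

/* Assumed analogue of the lockless fd_install(): publish the pointer. */
static void slot_install(struct table *t, size_t i, struct file *f)
{
	atomic_store_explicit(&t->fd[i], f, memory_order_release);
}

/*
 * Mirrors the shape of the dup_fd() loop: take a reference on every file
 * that is visible, leave the slot empty otherwise. Returns the number of
 * populated slots. The plain increment is fine for this single-threaded
 * demo only; the kernel's get_file() uses an atomic count.
 */
static size_t copy_slots(struct table *old, struct file **new_fds, size_t n)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		struct file *f = slot_deref_check(old, i);

		if (f) {
			f->refcount++;
			used++;
		}
		new_fds[i] = f;
	}
	return used;
}
```

Calling copy_slots() on a table whose lock_held flag is false trips the
assertion, which is the whole point of the _check variant: the load itself
is the same, the extra argument only documents and verifies the locking
assumption.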