On Sun, Sep 24, 2023 at 12:28 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > [adding some folks from the multigrain ctime discussion] > > On Tue, Sep 12, 2023 at 8:36 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > ovl_copyattr() may be called concurrently from aio completion context > > without any lock and that could lead to overlay inode attributes getting > > permanently out of sync with real inode attributes. > > > > Similarly, ovl_file_accessed() is always called without any lock to do > > "compare & copy" of mtime/ctime from realinode to inode. > > > > Use ovl inode spinlock to protect those two helpers. > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > --- > > fs/overlayfs/file.c | 2 ++ > > fs/overlayfs/util.c | 2 ++ > > 2 files changed, 4 insertions(+) > > > > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c > > index 4193633c4c7a..c6ad84cf9246 100644 > > --- a/fs/overlayfs/file.c > > +++ b/fs/overlayfs/file.c > > @@ -249,6 +249,7 @@ static void ovl_file_accessed(struct file *file) > > if (!upperinode) > > return; > > > > + spin_lock(&inode->i_lock); > > ctime = inode_get_ctime(inode); > > uctime = inode_get_ctime(upperinode); > > if ((!timespec64_equal(&inode->i_mtime, &upperinode->i_mtime) || > > !timespec64_equal(&ctime, &uctime))) { > > inode->i_mtime = upperinode->i_mtime; > > inode_set_ctime_to_ts(inode, uctime); > > } > > + spin_unlock(&inode->i_lock); > > > > [patch manually edited to add missing line to context] > > Miklos, > > I am having latent concerns about this patch, which is currently > in overlayfs-next. > > What was your reason for the "compare & copy" optimization? > I assume it was to avoid cache line bouncing. yes? > Would the added spinlock make this optimization moot? > I am asking since I calculated that on X86-64 and with > CONFIG_FS_POSIX_ACL && CONFIG_SECURITY > i_ctime.tv_nsec and i_lock are on the same cache line. > > I should note for the non-overlayfs developers, that we do > not care to worry about changes to the upperinode that are > done behind the back of overlayfs. > > Ovrerlayfs assumes that it is the only one to make changes > to the upperinode, otherwise, behavior is undefined. > > This is the justification for only taking the overlayfs inode > i_lock for synchronization of this "compare & copy" routine. > > Also, I think we only need to care that the overlayfs inode > timestamps are eventually consistent, after the last > ovl_file_accessed() call. > > The decraled reason (in commit message) for adding the lock > here is to protect from races in the ovl aio code path, which was > not around when the ovl_file_accessed() helper was written. > > But now I am wondering: > - Is the lock needed in all the sync calls? > - Is it needed even if there was no aio at all? > > I think the answer is yes to both questions, so the patch > can remain in its current form, but I'm not 100% sure. > > > touch_atime(&file->f_path); > > } > > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c > > index 89e0d60d35b6..b7922862ece3 100644 > > --- a/fs/overlayfs/util.c > > +++ b/fs/overlayfs/util.c > > @@ -1403,6 +1403,7 @@ void ovl_copyattr(struct inode *inode) > > realinode = ovl_i_path_real(inode, &realpath); > > real_idmap = mnt_idmap(realpath.mnt); > > > > + spin_lock(&inode->i_lock); > > vfsuid = i_uid_into_vfsuid(real_idmap, realinode); > > vfsgid = i_gid_into_vfsgid(real_idmap, realinode); > > > > inode->i_uid = vfsuid_into_kuid(vfsuid); > > inode->i_gid = vfsgid_into_kgid(vfsgid); > > inode->i_mode = realinode->i_mode; > > inode->i_mtime = realinode->i_mtime; > > inode_set_ctime_to_ts(inode, inode_get_ctime(realinode)); > > i_size_write(inode, i_size_read(realinode)); > > + spin_unlock(&inode->i_lock); > > My concerns about this part of the patch is that AFAIK, > all the calls of ovl_copyattr(), except for the one in aio completion, > are called with overlayfs inode mutex lock held. > > So this lock is strictly only needed because of the aio write case, > but I think we need to have the lock in place for all the other > cases to protect them against racing with aio completion? > > I guess the overhead of the spinlock is not a worry if the > mutex is already held, even though we do call ovl_copyattr() > twice (before and after) in some operations. > > Anyway, I just want to make sure that I did not make any > mistakes in my analysis of the problem and the fix. > FYI, for now I dropped this patch from overlayfs-next, because I was warned that ovl_copyattr() could be called from interrupt context in aio completion. I will bring it back after I sort out the ovl aio completion context. Thanks, Amir.