On Thu, Jan 31, 2019 at 10:54 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > On Thu, Jan 31, 2019 at 6:24 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > > > There's some progress: it makes it through xfstests without crashing > > or deadlocking. There are still some regressing test cases, but > > things definitely look better. > > > > They do! > > kmemleak detected leaking ovl_readpage_work in ovl_readpage_work_fn() > and creator_cred in ovl_direct_IO(). > I pushed your patch with the leak fixes to ovl-aops-wip as a testing point. Great, thanks for testing. > > What remains is ctime/mtime support. > > > > Also fallocate/copyfile/reflink/etc will need some thought as we need > > to not only flush out dirty pages, but in some cases prevent new > > faults while the operation is in progress. > > Well, generic/503 which exercises these corners hangs > (503.dmesg attached) That was just a stupid bug in truncate (truncating ovl page with upper inode lock held -> ABBA deadlock). Attached the fix for that one. > > > > And after that it needs to be optimized (->readpages() and > > ->writepages()), but I think that's the easy part. > > > > And we also need to flush pages on non direct IO cases > of ->writepages() and then we can revert: > e8d4bfe3a715 ovl: Sync upper dirty data when syncing overlayfs > and swoop the grand prize :) Yeah. I think the behavior should be tunable (ovl page cache vs. upper page cache), at least initially, so it can be benchmarked in various situations to see if there are no slowdowns with the new code. Thanks, Miklos
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index aecd288f7fe2..f954f1425186 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -853,7 +853,7 @@ static int ovl_real_readpage(struct file *realfile, struct page *page) iov_iter_bvec(&iter, READ, &bvec, 1, PAGE_SIZE); - ret = vfs_iter_read(realfile, &iter, &pos, 0); + ret = vfs_iter_read(realfile, &iter, &pos, RWF_DIRECT); if (ret >= 0 && ret < PAGE_SIZE) zero_user_segment(page, ret, PAGE_SIZE); diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index e139f92b64ba..dabcc92dd178 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -76,15 +76,15 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr) old_cred = ovl_override_creds(dentry->d_sb); err = notify_change(upperdentry, attr, NULL); revert_creds(old_cred); - if (!err) { - if (is_truncate) - truncate_setsize(inode, i_size_read(upperinode)); + if (!err) ovl_copyattr(upperinode, inode); - } inode_unlock(upperinode); - if (is_truncate) + if (is_truncate) { + if (!err) + truncate_setsize(inode, i_size_read(upperinode)); put_write_access(upperinode); + } out_drop_write: ovl_drop_write(dentry); out: