On Mon, Nov 9, 2020 at 5:34 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > ---- 在 星期日, 2020-11-08 22:03:06 Chengguang Xu <cgxu519@xxxxxxxxxxxx> 撰写 ---- > > Introduce a helper ovl_wait_wb_inodes() to wait until all > > target upper inodes finish writeback. > > > > Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxxxx> > > --- > > fs/overlayfs/super.c | 30 ++++++++++++++++++++++++++++++ > > 1 file changed, 30 insertions(+) > > > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c > > index e5607a908d82..9a535fc11221 100644 > > --- a/fs/overlayfs/super.c > > +++ b/fs/overlayfs/super.c > > @@ -255,6 +255,36 @@ static void ovl_put_super(struct super_block *sb) > > ovl_free_fs(ofs); > > } > > > > +void ovl_wait_wb_inodes(struct ovl_fs *ofs) > > +{ > > + LIST_HEAD(tmp_list); > > + struct ovl_inode *oi; > > + struct inode *upper; > > + > > + spin_lock(&ofs->syncfs_wait_list_lock); > > + list_splice_init(&ofs->syncfs_wait_list, &tmp_list); > > + > > + while (!list_empty(&tmp_list)) { > > + oi = list_first_entry(&tmp_list, struct ovl_inode, wait_list); > > + list_del_init(&oi->wait_list); > > + ihold(&oi->vfs_inode); > > Maybe I overlooked race condition with inode eviction, so still need to introduce > OVL_EVICT_PENDING flag just like we did in old syncfs efficiency patch series. > I am not sure why you added the ovl wait list. I think you misunderstood Jan's suggestion. I think what Jan meant is that ovl_sync_fs() should call wait_sb_inodes(upper_sb) to wait for writeback of ALL upper inodes after sync_filesystem() started writeback only on this ovl instance upper inodes. I am not sure if this is acceptable or not - it is certainly an improvement over current situation, but I have a feeling that on a large scale (many containers) it won't be enough. The idea was to keep it simple without over optimizing, since anyway you are going for the "correct" solution long term (ovl inode aops), so I wouldn't add the wait list. As long as the upper inode is still dirty, we can keep the ovl inode in cache, so the worst outcome is that drop_caches needs to get called twice before the ovl inode can be evicted, no? Thanks, Amir.