---- On Monday, 2020-11-09 18:07:18 Amir Goldstein <amir73il@xxxxxxxxx> wrote ----
> On Mon, Nov 9, 2020 at 10:34 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
> >
> > ---- On Monday, 2020-11-09 15:07:18 Amir Goldstein <amir73il@xxxxxxxxx> wrote ----
> > > On Mon, Nov 9, 2020 at 5:34 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
> > > >
> > > > ---- On Sunday, 2020-11-08 22:03:06 Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote ----
> > > > > Introduce a helper ovl_wait_wb_inodes() to wait until all
> > > > > target upper inodes finish writeback.
> > > > >
> > > > > Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxxxx>
> > > > > ---
> > > > >  fs/overlayfs/super.c | 30 ++++++++++++++++++++++++++++++
> > > > >  1 file changed, 30 insertions(+)
> > > > >
> > > > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > > > > index e5607a908d82..9a535fc11221 100644
> > > > > --- a/fs/overlayfs/super.c
> > > > > +++ b/fs/overlayfs/super.c
> > > > > @@ -255,6 +255,36 @@ static void ovl_put_super(struct super_block *sb)
> > > > >          ovl_free_fs(ofs);
> > > > >  }
> > > > >
> > > > > +void ovl_wait_wb_inodes(struct ovl_fs *ofs)
> > > > > +{
> > > > > +        LIST_HEAD(tmp_list);
> > > > > +        struct ovl_inode *oi;
> > > > > +        struct inode *upper;
> > > > > +
> > > > > +        spin_lock(&ofs->syncfs_wait_list_lock);
> > > > > +        list_splice_init(&ofs->syncfs_wait_list, &tmp_list);
> > > > > +
> > > > > +        while (!list_empty(&tmp_list)) {
> > > > > +                oi = list_first_entry(&tmp_list, struct ovl_inode, wait_list);
> > > > > +                list_del_init(&oi->wait_list);
> > > > > +                ihold(&oi->vfs_inode);
> > > >
> > > > Maybe I overlooked a race condition with inode eviction, so we may still need to
> > > > introduce an OVL_EVICT_PENDING flag, just like we did in the old syncfs efficiency
> > > > patch series.
> > > >
> > >
> > > I am not sure why you added the ovl wait list.
> > >
> > > I think you misunderstood Jan's suggestion.
> > > I think what Jan meant is that ovl_sync_fs() should call
> > > wait_sb_inodes(upper_sb) to wait for writeback of ALL upper inodes
> > > after sync_filesystem() started writeback only on this ovl
> > > instance's upper inodes.
> > >
> >
> > Maybe you are right. The wait list is just for accuracy, so that we can completely
> > avoid interference between ovl instances; otherwise we may end up waiting on
> > other instances' writeback in high-density environments.
> >
> >
> > > I am not sure if this is acceptable or not - it is certainly an improvement over
> > > the current situation, but I have a feeling that on a large scale (many
> > > containers) it won't be enough.
> > >
> >
> > Same thought here.
> >
> >
> > > The idea was to keep it simple without over-optimizing, since anyway
> > > you are going for the "correct" solution long term (ovl inode aops),
> > > so I wouldn't add the wait list.
> > >
> >
> > Maybe. I think it depends on how the ovl page cache gets implemented, so at the
> > current stage I have no firm opinion on the wait list.
> >
> >
> > > As long as the upper inode is still dirty, we can keep the ovl inode in cache,
> > > so the worst outcome is that drop_caches needs to get called twice before the
> > > ovl inode can be evicted, no?
> > >
> >
> > IIUC, since ovl currently does not have its own page cache, there is no effect on
> > page-cache reclaim, and there is no ovl shrinker to reclaim slab because we drop
> > the ovl inode directly after the final iput.
> > So should we add a shrinker in this series?
> >
>
> Would that add a lot of complexity?

Sorry, we don't actually need another shrinker, because inodes and dentries are already handled by the common VFS shrinker.
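For the record, here is a rough sketch (not an actual patch) of how I read Jan's suggestion, i.e. how ovl_sync_fs() might look if we only start writeback for this instance's upper inodes and then wait on the whole upper sb. ovl_sync_upper_inodes() is a hypothetical helper standing in for the per-instance writeback kick, and wait_sb_inodes() is currently static in fs/fs-writeback.c, so it would need to be exported (or wrapped) for overlayfs to call it:

/*
 * Sketch only, not the real patch.  Assumes wait_sb_inodes() gets
 * exported and that ovl_sync_upper_inodes() is a hypothetical helper
 * that starts writeback for this overlay instance's upper inodes only.
 */
static int ovl_sync_fs(struct super_block *sb, int wait)
{
        struct ovl_fs *ofs = sb->s_fs_info;
        struct super_block *upper_sb;
        int ret = 0;

        if (!ovl_upper_mnt(ofs))
                return 0;

        /* Only the wait == 1 pass of syncfs()/umount is interesting here. */
        if (!wait)
                return 0;

        upper_sb = ovl_upper_mnt(ofs)->mnt_sb;

        down_read(&upper_sb->s_umount);
        /* Start writeback only for this instance's upper inodes (hypothetical). */
        ovl_sync_upper_inodes(ofs);
        /* Wait for writeback of ALL upper inodes on the upper sb. */
        wait_sb_inodes(upper_sb);
        /* Let the upper fs sync its own metadata/journal (bdev flush omitted here). */
        if (upper_sb->s_op->sync_fs)
                ret = upper_sb->s_op->sync_fs(upper_sb, wait);
        up_read(&upper_sb->s_umount);

        return ret;
}

The obvious trade-off is the one discussed above: wait_sb_inodes() would also wait for dirty upper inodes that belong to other ovl instances sharing the same upper sb.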
> Thinking out loud: maybe we follow Jan's suggestion and fix the remaining
> performance issues in a follow-up series?
>

Okay, so let's leave it as homework.

Thanks,
Chengguang