Re: [RFC PATCH v2 5/8] ovl: mark overlayfs' inode dirty on shared writable mmap

 ---- On Thursday, 2020-11-05 23:54:34 Jan Kara <jack@xxxxxxx> wrote ----
 > On Thu 05-11-20 16:21:27, Amir Goldstein wrote:
 > > On Thu, Nov 5, 2020 at 4:03 PM Jan Kara <jack@xxxxxxx> wrote:
 > > >
 > > > On Wed 04-11-20 19:54:03, Chengguang Xu wrote:
 > > > >  ---- On Tuesday, 2020-11-03 01:30:52 Jan Kara <jack@xxxxxxx> wrote ----
 > > > >  > On Sun 25-10-20 11:41:14, Chengguang Xu wrote:
 > > > >  > > Overlayfs cannot be notified when mmapped area gets dirty,
 > > > >  > > so we need to proactively mark inode dirty in ->mmap operation.
 > > > >  > >
 > > > >  > > Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxxxx>
 > > > >  > > ---
 > > > >  > >  fs/overlayfs/file.c | 4 ++++
 > > > >  > >  1 file changed, 4 insertions(+)
 > > > >  > >
 > > > >  > > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
 > > > >  > > index efccb7c1f9bc..cd6fcdfd81a9 100644
 > > > >  > > --- a/fs/overlayfs/file.c
 > > > >  > > +++ b/fs/overlayfs/file.c
 > > > >  > > @@ -486,6 +486,10 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 > > > >  > >          /* Drop reference count from new vm_file value */
 > > > >  > >          fput(realfile);
 > > > >  > >      } else {
 > > > >  > > +        if (vma->vm_flags & (VM_SHARED|VM_MAYSHARE) &&
 > > > >  > > +            vma->vm_flags & (VM_WRITE|VM_MAYWRITE))
 > > > >  > > +            ovl_mark_inode_dirty(file_inode(file));
 > > > >  > > +
 > > > >  >
 > > > >  > But does this work reliably? I mean once writeback runs, your inode (as
 > > > >  > well as upper inode) is cleaned. Then a page fault comes so file has dirty
 > > > >  > pages again and would need flushing but overlayfs inode stays clean? Am I
 > > > >  > missing something?
 > > > >  >
 > > > >
 > > > > Yeah, this is key point of this approach, in order to  fix the issue I
 > > > > explicitly set I_DIRTY_SYNC flag in ovl_mark_inode_dirty(), so what i
 > > > > mean is during writeback we will call into ->write_inode() by this
 > > > > flag(I_DIRTY_SYNC) and at that place we get chance to check mapping and
 > > > > re-dirty overlay's inode. The code logic like below in ovl_write_inode().
 > > > >
 > > > >     if (mapping_writably_mapped(upper->i_mapping) ||
 > > > >          mapping_tagged(upper->i_mapping, PAGECACHE_TAG_WRITEBACK))
 > > > >                  iflag |= I_DIRTY_PAGES;
 > > >
 > > > OK, but suppose the upper mapping is clean at this moment (upper inode has
 > > > been fully written out for whatever reason, but it is still mapped) so your
 > > > overlayfs inode becomes clean as well. Then I don't see a mechanism which
 > > > would make your overlayfs inode dirty again when a write to mmap happens,
 > > > set_page_dirty() will end up marking upper inode with I_DIRTY_PAGES flag.
 > > >
 > > > Note that ovl_mmap() gets called only at mmap(2) syscall time but then
 > > > pages get faulted in, dirtied, cleaned fully at discretion of the mm
 > > > / writeback subsystem.
 > > >
 > > 
 > > Perhaps I will add some background.
 > > 
 > > What I suggested was to maintain a "suspect list" in addition to
 > > the dirty ovl inodes.
 > > 
 > > ovl inode is added to the suspect list on mmap (writable) and removed
 > > from the suspect list on release() flush() or on sync_fs() if real inode is no
 > > longer writably mapped.
 > > 
 > > There was another variant where ovl inode is added to suspect list on open
 > > for write and removed from suspect list on release() flush() or sync_fs()
 > > if real inode is not inode_is_open_for_write().
 > > 
 > > In both cases the list will have inodes whose real is not dirty, but
 > > in both cases
 > > the list shouldn't be terribly large to traverse on sync_fs().
 > > 
 > > Chengguang tried to implement the idea without an actual list by
 > > re-dirtying the "suspect" inodes on every write_inode(), but I personally have
 > > no idea if his idea works.
 > > 
 > > I think we can resort to using an actual suspect list if you say that it
 > > cannot work like this?
 > 
 > Yeah, the suspect list (i.e., additional list of inodes to check on sync)
 > you describe should work fine. 

I think this solution still has the problem we ran into in the thread below [1].
The main problem is the state combination of a clean overlayfs inode and a dirty upper inode.
 
[1] https://www.spinics.net/lists/linux-unionfs/msg07448.html
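
To make the state combination concrete, here is a rough userspace model (not
kernel code) of the suspect-list scheme as Amir described it above. Every
name in it (model_upper, model_ovl_inode, model_sync and so on) is made up
for illustration, and the removal rule is only my reading of "removed on
sync_fs() if real inode is no longer writably mapped". It only shows that a
sync pass which walks dirty overlay inodes alone skips an inode in the
"clean overlay, dirty upper" state, while a pass that also walks suspect
inodes reaches it. It does not model whatever else went wrong in [1].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the upper (real) inode. */
struct model_upper {
    bool dirty;          /* upper inode has dirty pages */
    int  writable_maps;  /* number of shared writable mappings */
};

/* Hypothetical stand-in for the overlayfs inode. */
struct model_ovl_inode {
    bool dirty;                 /* overlay inode is on the dirty list */
    bool suspect;               /* overlay inode is on the suspect list */
    struct model_upper *upper;
};

/* mmap(2) of a shared writable mapping: remember the inode as suspect. */
static void model_mmap_writable(struct model_ovl_inode *oi)
{
    oi->upper->writable_maps++;
    oi->suspect = true;
}

/* A store through the mapping dirties only the upper inode. */
static void model_mmap_store(struct model_ovl_inode *oi)
{
    oi->upper->dirty = true;
}

/*
 * One sync pass over n overlay inodes. Returns how many upper inodes
 * were flushed. With use_suspect == false only dirty overlay inodes
 * are visited.
 */
static int model_sync(struct model_ovl_inode *inodes, size_t n,
                      bool use_suspect)
{
    int flushed = 0;

    assert(inodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct model_ovl_inode *oi = &inodes[i];

        if (!oi->dirty && !(use_suspect && oi->suspect))
            continue;
        if (oi->upper->dirty) {
            oi->upper->dirty = false;
            flushed++;
        }
        oi->dirty = false;
        /* Drop from the suspect list once nothing maps it writably. */
        if (oi->upper->writable_maps == 0)
            oi->suspect = false;
    }
    return flushed;
}
```

In this model the inode stays on the suspect list for as long as a writable
mapping exists, so every sync pass has to visit it even when the upper
inode turns out to be clean.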

 > Also the "keep suspect inode dirty" idea
 > of Chengguang could work fine but we'd have to use something like
 > inode_is_open_for_write() or inode_is_writeably_mapped() (which would need
 > to be implemented but it should be easy vma_interval_tree_foreach() walk
 > checking each found VMA for vma->vm_flags & VM_WRITE) for checking whether
 > inode should be redirtied or not.
 > 

I'm curious: isn't it enough to check i_mmap_writable via mapping_writably_mapped()?
Am I missing something?
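
To spell out the two checks I am comparing, here is a small userspace model
(again not kernel code, all model_* names and flag values are made up).
mapping_writably_mapped() only tests whether i_mmap_writable is greater than
zero, and as far as I can tell that counter is bumped for every VM_SHARED
mapping, whether or not VM_WRITE is currently set on it. The VMA walk you
describe looks at VM_WRITE on each VMA, so it can be stricter. Restricting
the walk to shared VMAs is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values are arbitrary for this model, not the kernel's. */
#define MODEL_VM_WRITE  0x1u
#define MODEL_VM_SHARED 0x2u

struct model_vma {
    unsigned int flags;
    struct model_vma *next;
};

struct model_mapping {
    int i_mmap_writable;     /* count of shared mappings */
    struct model_vma *vmas;  /* stands in for the i_mmap interval tree */
};

/* Link a VMA into the mapping and account shared mappings. */
static void model_map(struct model_mapping *m, struct model_vma *vma)
{
    assert(m != NULL && vma != NULL);
    vma->next = m->vmas;
    m->vmas = vma;
    if (vma->flags & MODEL_VM_SHARED)
        m->i_mmap_writable++;
}

/* Counter-based check, modelled on mapping_writably_mapped(). */
static bool model_writably_mapped(const struct model_mapping *m)
{
    return m->i_mmap_writable > 0;
}

/* Per-VMA check: is there a shared VMA that has VM_WRITE set right now? */
static bool model_has_vm_write_vma(const struct model_mapping *m)
{
    for (const struct model_vma *v = m->vmas; v != NULL; v = v->next)
        if ((v->flags & MODEL_VM_SHARED) && (v->flags & MODEL_VM_WRITE))
            return true;
    return false;
}
```

With a single shared mapping that currently lacks VM_WRITE (for example
after mprotect(PROT_READ)), the counter check says "writably mapped" while
the walk says no. So the counter check looks conservative: it may redirty
more often than needed, but I don't see it missing a writable mapping.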


Thanks,
Chengguang



