> > > > When an inode is writably mapped via ovarlayfs, you can flag the > > > > overlay inode with "maybe-writably-mapped" and then remove > > > > it from the maybe dirty list when the underlying inode is not dirty > > > > AND its i_writecount is 0 (checked on write_inode() and release()). > > > > > > > > Actually, there is no reason to treat writably mapped inodes and > > > > other dirty inodes differently - insert to suspect list on open for > > > > write, remove from suspect list on last release() or write_inode() > > > > when inode is no longer dirty and writable. > > I have to say inserting to suspect list cannot prevent dropping, > thinking of the problem of previous approach that we write dirty upper > inode with current->flags & PF_MEMALLOC while evicting clean overlay inode. > Sorry, I don't understand what that means. > > > > > > > > > Did I miss anything? > > > > > > > > > > If we dirty overlay inode that means we have launched writeback mechanism, > > > so in this case, re-dirty overlay inode in time becomes important. > > > > > > > My idea was to use the first call to ovl_sync_fs() with 'wait' false > > to iterate the > > maybe-dirty list and re-dirty overlay inodes whose upper is dirty. > > > > I'm curious how we prevent dropping of clean overlay inode with dirty upper? > Hold another reference during iput_final operation? in the drop_inode() or something > else? No, just return 0 from ovl_drop_inode() and iput_final() will not evict(). > > > > Then in the second call to __sync_filesystem, sync_inodes_sb() will take > > care of calling ovl_write_inode() for all the re-dirty inodes. > > > > In current code we sync ALL dirty upper fs inodes and we do not overlay > > inodes with no reference in cache. > > > > The best code would sync only upper fs inodes dirtied by this overlay > > and will be able to evict overlay inodes whose upper inodes are clean. > > > > The compromise code would sync only upper fs inodes dirtied by this overlay, > > and would not evict overlay inodes as long as upper inodes are "open for write". > > I think its a fine compromise considering the alternatives. > > > > Is this workable? > > > > In your approach, the key point is how to prevent dropping overlay inode that has > dirty upper and no reference but I don't understand well how to achieve it from > your descriptions. > > Very well, I will try to explain with code: int ovl_inode_is_open_for_write(struct inode *inode) { struct inode *upper_inode = ovl_inode_upper(inode); return upper_inode && inode_is_open_for_write(upper_inode); } void ovl_maybe_mark_inode_dirty(struct inode *inode) { struct inode *upper_inode = ovl_inode_upper(inode); if (upper_inode && upper_inode->i_state & I_DIRTY_ALL) mark_inode_dirty(inode); } int ovl_open(struct inode *inode, struct file *file) { ... if (ovl_inode_is_open_for_write(file_inode(file))) ovl_add_inode_to_suspect_list(inode); file->private_data = realfile; return 0; } int ovl_release(struct inode *inode, struct file *file) { struct inode *inode = file_inode(file); if (ovl_inode_is_open_for_write(inode)) { ovl_maybe_mark_inode_dirty(inode); ovl_remove_inode_from_suspect_list(inode); } fput(file->private_data); return 0; } int ovl_drop_inode(struct inode *inode) { struct inode *upper_inode = ovl_inode_upper(inode); if (!upper_inode) return 1; if (upper_inode->i_state & I_DIRTY_ALL) return 0; return !inode_is_open_for_write(upper_inode); } static int ovl_sync_fs(struct super_block *sb, int wait) { struct ovl_fs *ofs = sb->s_fs_info; struct super_block *upper_sb; int ret; if (!ovl_upper_mnt(ofs)) return 0; /* * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC). * All the super blocks will be iterated, including upper_sb. * * If this is a syncfs(2) call, then we do need to call * sync_filesystem() on upper_sb, but enough if we do it when being * called with wait == 1. */ if (!wait) { /* mark inodes on the suspect list dirty if thier upper inode is dirty */ ovl_mark_suspect_list_inodes_dirty(); return 0; } ... The races are avoided because inode is added/removed from suspect list while overlay inode has a reference (from file) and because upper inode cannot be dirtied by overlayfs when overlay inode is not on the suspect list. Unless I am missing something. Thanks, Amir.