On Fri, Aug 28, 2020 at 05:04:14PM +0800, Li, Hao wrote: > On 2020/8/28 8:35, Dave Chinner wrote: > > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: > >> On 2020/8/27 14:37, Dave Chinner wrote: > >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > >>>> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > >>>> set from being killed, so the corresponding inode can't be evicted. If > >>>> the DAX policy of an inode is changed, we can't make policy changing > >>>> take effects unless dropping caches manually. > >>>> > >>>> This patch fixes this problem and flushes the inode to disk to prepare > >>>> for evicting it. > >>>> > >>>> Signed-off-by: Hao Li <lihao2018.fnst@xxxxxxxxxxxxxx> > >>>> --- > >>>> fs/dcache.c | 3 ++- > >>>> fs/inode.c | 2 +- > >>>> 2 files changed, 3 insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/fs/dcache.c b/fs/dcache.c > >>>> index ea0485861d93..486c7409dc82 100644 > >>>> --- a/fs/dcache.c > >>>> +++ b/fs/dcache.c > >>>> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) > >>>> */ > >>>> smp_rmb(); > >>>> d_flags = READ_ONCE(dentry->d_flags); > >>>> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; > >>>> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED > >>>> + | DCACHE_DONTCACHE; > >>> Seems reasonable, but you need to update the comment above as to > >>> how this flag fits into this code.... > >> Yes. I will change it. Thanks. > >> > >>>> /* Nothing to do? Dropping the reference was all we needed? */ > >>>> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) > >>>> diff --git a/fs/inode.c b/fs/inode.c > >>>> index 72c4c347afb7..5218a8aebd7f 100644 > >>>> --- a/fs/inode.c > >>>> +++ b/fs/inode.c > >>>> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > >>>> } > >>>> > >>>> state = inode->i_state; > >>>> - if (!drop) { > >>>> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > >>>> WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > >>>> spin_unlock(&inode->i_lock); > >>> What's this supposed to do? We'll only get here with drop set if the > >>> filesystem is mounting or unmounting. > >> The variable drop will also be set to True if I_DONTCACHE is set on > >> inode->i_state. > >> Although mounting/unmounting will set the drop variable, it won't set > >> I_DONTCACHE if I understand correctly. As a result, > >> drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > > So what does this have to do with changing the way the dcache > > treats DCACHE_DONTCACHE? > After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with > DCACHE_DONTCACHE set will be killed unconditionally even if > DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). > This inode has I_DONTCACHE flag, so op->drop_inode() will return true, > and the inode will be evicted _directly_ even though it has dirty pages. Yes. But this doesn't rely on DCACHE_DONTCACHE behaviour. Inodes that are looked up and cached by the filesystem without going through dentry cache pathwalks can have I_DONTCACHE set and then get evicted... i.e. we can get I_DONTCACHE set on inodes that do not have dentries attached to them. That's the original functionality that got pulled up from XFS - internal iteration of inodes, either via quotacheck or bulkstat.... > I think the kernel will run into error state because it doesn't writeback > dirty pages of this inode before evicting. This is why I write back this > inode here. > > According to my test, if I revert the second hunk of this patch, kernel > will hang after running the following command: > echo 123 > test.txt && xfs_io -c "chattr +x" test.txt > > The backtrace is: > > xfs_fs_destroy_inode+0x204/0x24d > destroy_inode+0x3b/0x65 > evict+0x150/0x1aa > iput+0x117/0x19a > dentry_unlink_inode+0x12b/0x12f > __dentry_kill+0xee/0x211 > dentry_kill+0x112/0x22f > dput+0x79/0x86 > __fput+0x200/0x23f > ____fput+0x25/0x28 > task_work_run+0x144/0x177 > do_exit+0x4fb/0xb94 > > This backtrace is printed with an ASSERT(0) statement in xfs_fs_destroy_inode(). Sure, that's your messenger. That doesn't mean the bug can't be triggered by other means. > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > > is unhashed, should we be writing it back? i.e. it might have been > > dropped for a different reason to I_DONTCACHE.... > This is a problem I didn't realize. You are right. If the inode has been > unlinked/unhashed and the I_DONTCACHE is also set, the appended condition > will lead to unnecessary writeback. > > I think I need to handle the inode writeback more carefully. Maybe I can > revert the second hunk and remove I_DONTCACHE from generic_drop_inode() > and then change > > if (!drop && (sb->s_flags & SB_ACTIVE)) > > to > > if (!drop && !(inode->i_state & I_DONTCACHE) && (sb->s_flags & SB_ACTIVE)) > > This approach would be more suitable. Yup, that's pretty much what I was thinking, but as a standalone patch and preceding the DCACHE_DONTCACHE behaviour change. Thanks! :) Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx