On 2020/8/28 8:35, Dave Chinner wrote: > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: >> On 2020/8/27 14:37, Dave Chinner wrote: >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: >>>> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE >>>> set from being killed, so the corresponding inode can't be evicted. If >>>> the DAX policy of an inode is changed, we can't make policy changing >>>> take effects unless dropping caches manually. >>>> >>>> This patch fixes this problem and flushes the inode to disk to prepare >>>> for evicting it. >>>> >>>> Signed-off-by: Hao Li <lihao2018.fnst@xxxxxxxxxxxxxx> >>>> --- >>>> fs/dcache.c | 3 ++- >>>> fs/inode.c | 2 +- >>>> 2 files changed, 3 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/fs/dcache.c b/fs/dcache.c >>>> index ea0485861d93..486c7409dc82 100644 >>>> --- a/fs/dcache.c >>>> +++ b/fs/dcache.c >>>> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) >>>> */ >>>> smp_rmb(); >>>> d_flags = READ_ONCE(dentry->d_flags); >>>> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; >>>> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED >>>> + | DCACHE_DONTCACHE; >>> Seems reasonable, but you need to update the comment above as to >>> how this flag fits into this code.... >> Yes. I will change it. Thanks. >> >>>> /* Nothing to do? Dropping the reference was all we needed? */ >>>> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) >>>> diff --git a/fs/inode.c b/fs/inode.c >>>> index 72c4c347afb7..5218a8aebd7f 100644 >>>> --- a/fs/inode.c >>>> +++ b/fs/inode.c >>>> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) >>>> } >>>> >>>> state = inode->i_state; >>>> - if (!drop) { >>>> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { >>>> WRITE_ONCE(inode->i_state, state | I_WILL_FREE); >>>> spin_unlock(&inode->i_lock); >>> What's this supposed to do? We'll only get here with drop set if the >>> filesystem is mounting or unmounting. >> The variable drop will also be set to True if I_DONTCACHE is set on >> inode->i_state. >> Although mounting/unmounting will set the drop variable, it won't set >> I_DONTCACHE if I understand correctly. As a result, >> drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > So what does this have to do with changing the way the dcache > treats DCACHE_DONTCACHE? After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with DCACHE_DONTCACHE set will be killed unconditionally even if DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). This inode has I_DONTCACHE flag, so op->drop_inode() will return true, and the inode will be evicted _directly_ even though it has dirty pages. I think the kernel will run into error state because it doesn't writeback dirty pages of this inode before evicting. This is why I write back this inode here. According to my test, if I revert the second hunk of this patch, kernel will hang after running the following command: echo 123 > test.txt && xfs_io -c "chattr +x" test.txt The backtrace is: xfs_fs_destroy_inode+0x204/0x24d destroy_inode+0x3b/0x65 evict+0x150/0x1aa iput+0x117/0x19a dentry_unlink_inode+0x12b/0x12f __dentry_kill+0xee/0x211 dentry_kill+0x112/0x22f dput+0x79/0x86 __fput+0x200/0x23f ____fput+0x25/0x28 task_work_run+0x144/0x177 do_exit+0x4fb/0xb94 This backtrace is printed with an ASSERT(0) statement in xfs_fs_destroy_inode(). > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > is unhashed, should we be writing it back? i.e. it might have been > dropped for a different reason to I_DONTCACHE.... This is a problem I didn't realize. You are right. If the inode has been unlinked/unhashed and the I_DONTCACHE is also set, the appended condition will lead to unnecessary writeback. I think I need to handle the inode writeback more carefully. Maybe I can revert the second hunk and remove I_DONTCACHE from generic_drop_inode() and then change if (!drop && (sb->s_flags & SB_ACTIVE)) to if (!drop && !(inode->i_state & I_DONTCACHE) && (sb->s_flags & SB_ACTIVE)) This approach would be more suitable. > > IOWs, if there is a problem with how I_DONTCACHE is being handled, > then the problem must already exists regardless of the > DCACHE_DONTCACHE behaviour, right? So shouldn't this be a separate > bug fix with it's own explanation of the problem and the fix? I do think the way we treat I_DONTCACHE in current kernel is not suitable. generic_drop_inode() is used to determine if the inode should be evicted without writeback, so I_DONTCACHE shouldn't be handled in this function. The suitable approach is illustrated above. I can submit a patch to fix this problem if this new approach is acceptable. Thanks, Hao Li > Cheers, > > Dave.