On Wed, Apr 10, 2024 at 08:49:39PM -0700, Christoph Hellwig wrote: > On Wed, Apr 10, 2024 at 06:15:02PM -0700, Darrick J. Wong wrote: > > > This looks a little weird to me. Can't we simply use XFS_IGET_DONTCACHE > > > at iget time and then clear I_DONTCACHE here if we want to keep the > > > inode around? > > > > Not anymore, because other threads can mess around with the dontcache > > state (yay fsdax access path changes!!) while we are scrubbing the > > inode. > > You mean xfs_ioctl_setattr_prepare_dax? Oh lovely, a completely > undocumented d_mark_dontcache in a completely non-obvious place. > > It sems to have appeared in > commit e4f9ba20d3b8c2b86ec71f326882e1a3c4e47953 > Author: Ira Weiny <ira.weiny@xxxxxxxxx> > Date: Thu Apr 30 07:41:38 2020 -0700 > > fs/xfs: Update xfs_ioctl_setattr_dax_invalidate() > > without any explanation either. And I can't see any reason why > we'd prevent inodes and dentries to be cached after DAX mode > switches to start with. I can only guess, maybe the commit thinks > d_mark_dontcache is about data caching? It's the horrible way that fsdax "supports" switching the address ops and i_mapping contents at runtime -- set the ondisk iflag, mark the inode/dentry for immediate explusion, wait for reclaim to eat the inode, then reload it and *presto* new incore iflag and state! (It's gross but I don't know of a better way to drain i_mapping and change address ops and at this point I'm hoping I just plain forget all that pmem stuff. :P) > > > > > Given that we only set the uncached flag from > > > XFS_IGET_DONTCACHE on a cache miss, we won't have set > > > DCACHE_DONTCACHE anywhere (and don't really care about the dentries to > > > start with). > > > > > > But why do we care about keeping the inodes with errors in memory > > > here, but not elsewhere? > > > > We actually, do, but it's not obvious... > > > > > Maybe this can be explained in an expanded comment. > > > > ...because this bit here is basically the same as xchk_irele, but we > > don't have a xfs_scrub object to pass in, so it's opencoded. I could > > pull this logic out into: > > Eww, I hadn't seen xchk_irele before. To me it looks like > I_DONTCACHE/d_mark_dontcache is really the wrong vehicle here. > > I'd instead have a XFS_IGET_SCRUB, which will set an XFS_ISCRUB or > whatever flag on a cache miss. Any cache hit without XFS_IGET_SCRUB > will clear it. > > ->drop_inode then always returns true for XFS_ISCRUB inodes unless > in a transaction. How does it determine that we're in a transaction? We just stopped storing transactions in current->journal_info due to problems with nested transactions and ext4 assuming that it can blind deref that. > Talking about the in transaction part - why do > we drop inodes in the transaction in scrub, but not elsewhere? One example is: Alloc transaction -> lock rmap btree for repairs -> iscan filesystem to find rmap records -> iget/irele. --D