Re: KASAN: use-after-free Read in path_lookupat

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Mon, 25 Mar 2019 23:02:11 +0000



On Tue, Mar 26, 2019 at 09:48:23AM +1100, Dave Chinner wrote:

> And when it comes to VFS inode reclaim, XFS does not implement
> ->evict_inode because there is nothing at the VFS level to do.
> And ->destroy_inode ends up doing cleanup work (e.g. freeing on-disk
> inodes) which is non-trivial, blocking work, but then still requires
> the struct xfs_inode to be written back to disk before it can be
> freed. So it just gets marked "reclaimable" and background reclaim
> then takes care of it from there so we avoid synchronous IO in inode
> reclaim...
> 
> This works because we don't track dirty inode metadata in the VFS
> writeback code (it's tracked with much more precision in the XFS log
> infrastructure) and we don't write back inodes from the VFS
> infrastructure, either. It's all done based on internal state
> outside the VFS.
> 
> And, because of this, the VFS cannot assume that it can free
> the struct inode after calling ->destroy_inode or even use
> call_rcu() to run a filesystem destructor because the filesystem
> may need to do work that needs to block and that's not allowed in an
> RCU callback...

In Linus' patch that's what you get with non-NULL ->destroy_inode
+ NULL ->destroy_inode_rcu, so XFS won't be screwed by that.
That said, yes, XFS adds another fun twist there (AFAICS, it's
the only in-tree filesystem that pulls that off).
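
To spell out that combination, here is a rough userspace model of the
dispatch as I read it. It is a sketch, not the patch itself: the model_*
names, the flags in the inode and the immediate-call stand-in for
call_rcu() are all made up for illustration, and what happens when
->destroy_inode is NULL is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's own types. */
struct model_inode {
	bool fs_cleanup_ran;   /* set by the ->destroy_inode stand-in */
	bool freed_after_rcu;  /* set by the deferred free step */
};

struct model_super_ops {
	void (*destroy_inode)(struct model_inode *);      /* may block */
	void (*destroy_inode_rcu)(struct model_inode *);  /* must not block */
};

/* Assumed default when the filesystem supplies no RCU-side destructor. */
static void model_default_free(struct model_inode *inode)
{
	inode->freed_after_rcu = true;
}

/*
 * Stand-in for call_rcu(): the real thing runs the callback after a
 * grace period; here it is simply called on the spot.
 */
static void model_call_rcu(void (*cb)(struct model_inode *),
			   struct model_inode *inode)
{
	cb(inode);
}

/*
 * Returns true if the VFS side scheduled the final free, false if the
 * filesystem kept ownership of the inode.
 */
bool model_destroy_inode(const struct model_super_ops *ops,
			 struct model_inode *inode)
{
	assert(ops != NULL && inode != NULL);

	if (ops->destroy_inode) {
		ops->destroy_inode(inode);
		/* non-NULL ->destroy_inode + NULL ->destroy_inode_rcu:
		 * the VFS does not touch the inode again */
		if (!ops->destroy_inode_rcu)
			return false;
	}
	model_call_rcu(ops->destroy_inode_rcu ? ops->destroy_inode_rcu
					      : model_default_free, inode);
	return true;
}

/* XFS-like filesystem: blocking cleanup, reclaim handled internally. */
static void xfs_like_destroy(struct model_inode *inode)
{
	inode->fs_cleanup_ran = true;
}

bool model_xfs_like_case(void)
{
	struct model_super_ops ops = {
		.destroy_inode = xfs_like_destroy,
		.destroy_inode_rcu = NULL,
	};
	struct model_inode inode = { false, false };
	bool vfs_freed = model_destroy_inode(&ops, &inode);

	return !vfs_freed && inode.fs_cleanup_ran && !inode.freed_after_rcu;
}
```

In the XFS-like case the model never reaches the RCU step, which is
the property Dave is asking for.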

I would really like some comments from f2fs and ocfs2 folks, as well
as Jan - he's had much more recent contact with writeback code than
I have...  Could somebody explain what's going on in f2fs and ocfs2
->drop_inode()?  It _should_ be just a predicate; looks like both
are playing very odd games to work around writeback problems and
I wonder if there's a cleaner solution for that.  I can try and dig
through the mailing list archives, but I would really appreciate it
if somebody could give a braindump on the issues dealt with in there...
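
For reference, by "just a predicate" I mean something of the following
shape. This is a simplified userspace model, assuming the default check
is "no links left or already unhashed" and that the caller does nothing
with the answer except decide between evicting and keeping the inode
cached; the pred_* names are hypothetical and the real last-reference
path has more conditions than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pieces of struct inode that matter here. */
struct pred_inode {
	unsigned int i_nlink;  /* link count */
	bool hashed;           /* still in the inode hash? */
};

struct pred_super_ops {
	int (*drop_inode)(struct pred_inode *);
};

/* Default predicate: drop if there are no links or the inode is unhashed. */
int pred_generic_drop_inode(struct pred_inode *inode)
{
	return inode->i_nlink == 0 || !inode->hashed;
}

/*
 * Decision taken when the last reference goes away: ask the filesystem
 * if it has an opinion, otherwise use the default.  The predicate only
 * answers the question; it does not write anything back or change the
 * inode's state.
 */
bool pred_should_evict(const struct pred_super_ops *ops,
		       struct pred_inode *inode)
{
	int drop;

	if (ops && ops->drop_inode)
		drop = ops->drop_inode(inode);
	else
		drop = pred_generic_drop_inode(inode);
	return drop != 0;
}
```

Anything a ->drop_inode() instance does beyond returning that answer is
the part I'd like to understand in the f2fs and ocfs2 cases.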