Re: About the conflict between XFS inode recycle and VFS rcu-walk

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 8 Dec 2023 11:14:32 +1100

On Tue, Dec 05, 2023 at 07:38:33PM +0800, alexjlzheng@xxxxxxxxx wrote:
> Hi, all
> 
> I would like to ask if the conflict between xfs inode recycle and vfs rcu-walk
> which can lead to null pointer references has been resolved?
> 
> I browsed through emails about the following patches and their discussions:
> - https://lore.kernel.org/linux-xfs/20220217172518.3842951-2-bfoster@xxxxxxxxxx/
> - https://lore.kernel.org/linux-xfs/20220121142454.1994916-1-bfoster@xxxxxxxxxx/
> - https://lore.kernel.org/linux-xfs/164180589176.86426.501271559065590169.stgit@xxxxxxxxxxxxxxxxx/
> 
> And then came to the conclusion that this problem has not been solved, am I
> right? Did I miss some patch that could solve this problem?

We fixed the known problems this caused by turning off the VFS
functionality that the rcu pathwalks kept tripping over. See commit
7b7820b83f23 ("xfs: don't expose internal symlink metadata buffers to
the vfs").

Apart from that issue, I'm not aware of any other issues that the
XFS inode recycling directly exposes.

> According to my understanding, the essence of this problem is that XFS reuses
> the inode evicted by VFS, but VFS rcu-walk assumes that this will not happen.

It assumes that the inode will not change identity during the RCU
grace period after the inode has been evicted from cache. We can
safely reinstantiate an evicted inode without waiting for an RCU
grace period as long as it is the same inode with the same content
and same state.

Problems *may* arise when we unlink the inode, then evict it, then a
new file is created and the old slab cache memory address is used
for the new inode. I describe the issue here:

https://lore.kernel.org/linux-xfs/20220118232547.GD59729@xxxxxxxxxxxxxxxxxxx/

That said, we have exactly zero evidence that this is actually a
problem in production systems. We did get systems tripping over the
symlink issue, but there's no evidence that the
unlink->close->open(O_CREAT) issues are manifesting in the wild and
hence there hasn't been any particular urgency to address it.

> Are there any recommended workarounds until an elegant and efficient solution
> can be proposed? After all, causing a crash is extremely unacceptable in a
> production environment.

What crashes are you seeing in your production environment?

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx