Re: [PATCH 0/3] xfs: sparse inodes overlap end of filesystem

Eric Sandeen <sandeen@xxxxxxxxxxx> · Thu, 31 Oct 2024 15:45:29 -0500

On 10/31/24 6:44 AM, Carlos Maiolino wrote:
> On Tue, Oct 29, 2024 at 11:14:18AM -0500, Eric Sandeen wrote:
>> On 10/23/24 9:51 PM, Dave Chinner wrote:
>>> There is one question that needs to be resolved in this patchset: if
>>> we take patch 2 to allow sparse inodes at the end of the AG, why
>>> would we need the change in patch 1? Indeed, at this point I have to
>>> ask why we even need the min/max agbno guidelines to the inode chunk
>>> allocation as we end up allowing any aligned location in the AG to
>>> be used by sparse inodes. i.e. if we take patch 2, then patch 1 is
>>> unnecessary and now we can remove a bunch of code (min/max_agbno
>>> constraints) from the allocator paths...
>>>
>>> I'd prefer that we take the latter path: ignore the first patch.
>>> This results in more flexible behaviour, allows existing filesystems
>>> with this issue to work without needing xfs_repair to fix them, and
>>> we get to remove complexity from the code.
>>>
>>> Thoughts?
>>
>> For some reason I'm struggling to grasp some of the details here, so
>> maybe I can just throw out a "what I think should happen" type response.
>>
>> A concern is that older xfs_repair binaries will continue to see
>> inodes in this region as corrupt, and throw them away, IIUC - even
>> if the kernel is updated to handle them properly.
>>
>> Older xfs_repair could be encountered on rescue CDs/images, maybe
>> even in initramfs environments, by virt hosts managing guest filesystems,
>> etc.
>>
>> So it seems to me that it would be worth it to prevent any new inode
>> allocations in this region going forward, even if we *can* make it work,
>> so that we won't continue to generate what looks like corruption to older
>> userspace.
>>
>> That might not be the most "pure" upstream approach, but as a practical
>> matter I think it might be a better outcome for users and support
>> orgs... even if distros update kernels & userspace together, that does
>> not necessarily prevent older userspace from encountering a filesystem
>> with inodes in this range and trashing them.
>>
> 
> I'm inclined to agree with Eric here as preventing the sparse inodes to be
> allocated at the edge of the runt AG sounds the most reasonable approach to me.
> 
> It just seems to me yet another corner case to deal with for very little benefit,
> i.e to enable a few extra inodes, on a FS that seems to be in life support
> regarding space for new inodes, whether it's a distro kernel or upstream kernel.
> 
> It kind of seem risky to me, to allow users to run a new kernel, allocate inodes
> there, fill those inodes with data, just to run a not yet ready xfs_repair, and
> discard everything there. Just seems like a possible data loss vector.
> 
> Unless - and I'm not sure how reasonable it is -, we first release a new
> xfsprogs, preventing xfs_repair to rip off those inodes, and later update the
> kernel. But this will end up on users hitting a -EFSCORRUPTED every attempt to
> allocate inodes from the FS edge.
> 
> How feasible would be to first prevent inodes to be allocated at the runt AG's
> edge, let it sink for a while, and once we have a fixed xfs_repair for some
> time, we then enable inode allocation on the edge, giving enough time for users
> to have a newer xfs_repair?
> 
> Again, I'm not sure it it does make sense at all, hopefully it does.

I think Dave agrees with all this too, and I may have simply misunderstood
the proposal.

paraphrasing a side convo w/ Dave, it seems to make sense to have 3 steps for
the fix:

1) Stop allowing inode allocations in these blocks (kernel)
2) Treat already-allocated inodes in these blocks as valid (kernel+userspace)
3) Re-enable inode allocations to these blocks (kernel)

Distros can pick up 1) and 2), and skip 3) if desired to avoid problems with
older userspace if that seems prudent.

I guess I still worry a little about upstream/non-distro use after applying 3)
but it's the technically correct path forward.

-Eric