On 10/31/24 6:44 AM, Carlos Maiolino wrote: > On Tue, Oct 29, 2024 at 11:14:18AM -0500, Eric Sandeen wrote: >> On 10/23/24 9:51 PM, Dave Chinner wrote: >>> There is one question that needs to be resolved in this patchset: if >>> we take patch 2 to allow sparse inodes at the end of the AG, why >>> would we need the change in patch 1? Indeed, at this point I have to >>> ask why we even need the min/max agbno guidelines to the inode chunk >>> allocation as we end up allowing any aligned location in the AG to >>> be used by sparse inodes. i.e. if we take patch 2, then patch 1 is >>> unnecessary and now we can remove a bunch of code (min/max_agbno >>> constraints) from the allocator paths... >>> >>> I'd prefer that we take the latter path: ignore the first patch. >>> This results in more flexible behaviour, allows existing filesystems >>> with this issue to work without needing xfs_repair to fix them, and >>> we get to remove complexity from the code. >>> >>> Thoughts? >> >> For some reason I'm struggling to grasp some of the details here, so >> maybe I can just throw out a "what I think should happen" type response. >> >> A concern is that older xfs_repair binaries will continue to see >> inodes in this region as corrupt, and throw them away, IIUC - even >> if the kernel is updated to handle them properly. >> >> Older xfs_repair could be encountered on rescue CDs/images, maybe >> even in initramfs environments, by virt hosts managing guest filesystems, >> etc. >> >> So it seems to me that it would be worth it to prevent any new inode >> allocations in this region going forward, even if we *can* make it work, >> so that we won't continue to generate what looks like corruption to older >> userspace. >> >> That might not be the most "pure" upstream approach, but as a practical >> matter I think it might be a better outcome for users and support >> orgs... even if distros update kernels & userspace together, that does >> not necessarily prevent older userspace from encountering a filesystem >> with inodes in this range and trashing them. >> > > I'm inclined to agree with Eric here as preventing the sparse inodes to be > allocated at the edge of the runt AG sounds the most reasonable approach to me. > > It just seems to me yet another corner case to deal with for very little benefit, > i.e to enable a few extra inodes, on a FS that seems to be in life support > regarding space for new inodes, whether it's a distro kernel or upstream kernel. > > It kind of seem risky to me, to allow users to run a new kernel, allocate inodes > there, fill those inodes with data, just to run a not yet ready xfs_repair, and > discard everything there. Just seems like a possible data loss vector. > > Unless - and I'm not sure how reasonable it is -, we first release a new > xfsprogs, preventing xfs_repair to rip off those inodes, and later update the > kernel. But this will end up on users hitting a -EFSCORRUPTED every attempt to > allocate inodes from the FS edge. > > How feasible would be to first prevent inodes to be allocated at the runt AG's > edge, let it sink for a while, and once we have a fixed xfs_repair for some > time, we then enable inode allocation on the edge, giving enough time for users > to have a newer xfs_repair? > > Again, I'm not sure it it does make sense at all, hopefully it does. I think Dave agrees with all this too, and I may have simply misunderstood the proposal. paraphrasing a side convo w/ Dave, it seems to make sense to have 3 steps for the fix: 1) Stop allowing inode allocations in these blocks (kernel) 2) Treat already-allocated inodes in these blocks as valid (kernel+userspace) 3) Re-enable inode allocations to these blocks (kernel) Distros can pick up 1) and 2), and skip 3) if desired to avoid problems with older userspace if that seems prudent. I guess I still worry a little about upstream/non-distro use after applying 3) but it's the technically correct path forward. -Eric