Re: [PATCH 6/7] xfs: don't update file system geometry through transaction deltas

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 11, 2024 at 09:57:09AM +0200, Christoph Hellwig wrote:
> On Thu, Oct 10, 2024 at 10:05:53AM -0400, Brian Foster wrote:
> > Ok, so we don't want geometry changes transactions in the same CIL
> > checkpoint as alloc related transactions that might depend on the
> > geometry changes. That seems reasonable and on a first pass I have an
> > idea of what this is doing, but the description is kind of vague.
> > Obviously this fixes an issue on the recovery side (since I've tested
> > it), but it's not quite clear to me from the description and/or logic
> > changes how that issue manifests.
> > 
> > Could you elaborate please? For example, is this some kind of race
> > situation between an allocation request and a growfs transaction, where
> > the former perhaps sees a newly added AG between the time the growfs
> > transaction commits (applying the sb deltas) and it actually commits to
> > the log due to being a sync transaction, thus allowing an alloc on a new
> > AG into the same checkpoint that adds the AG?
> 
> This is based on the feedback by Dave on the previous version:
> 
> https://lore.kernel.org/linux-xfs/Zut51Ftv%2F46Oj386@xxxxxxxxxxxxxxxxxxx/
> 

Ah, Ok. That all seems reasonably sane to me on a first pass, but I'm
not sure I'd go straight to this change given the situation...

> Just doing the perag/in-core sb updates earlier fixes all the issues
> with my test case, so I'm not actually sure how to get more updates
> into the check checkpoint.  I'll try your exercisers if it could hit
> that.
> 

Ok, that explains things a bit. My observation is that the first 5
patches or so address the mount failure problem, but from there I'm not
reproducing much difference with or without the final patch. Either way,
I see aborts and splats all over the place, which implies at minimum
this isn't the only issue here.

So given that 1. growfs recovery seems pretty much broken, 2. this
particular patch has no straightforward way to test that it fixes
something and at the same time doesn't break anything else, and 3. we do
have at least one fairly straightforward growfs/recovery test in the
works that reliably explodes, personally I'd suggest to split this work
off into separate series.

It seems reasonable enough to me to get patches 1-5 in asap once they're
fully cleaned up, and then leave the next two as part of a followon
series pending further investigation into these other issues. As part of
that I'd like to know whether the recovery test reproduces (or can be
made to reproduce) the issue this patch presumably fixes, but I'd also
settle for "the grow recovery test now passes reliably and this doesn't
regress it." But once again, just my .02.

Brian





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux