Re: [PATCH 09/14] xfs: move log shut down handling out of xlog_state_iodone_process_iclog

On Thu, Mar 19, 2020 at 07:36:03AM -0400, Brian Foster wrote:
> > True.  I think we just need to clear cycled_icloglock in the
> > shutdown branch.  I prefer that flow over falling through to the
> > main loop body as that clearly separates out the shutdown case.
> > 
> 
> Sure, but a shutdown can still happen at any point so this is just a
> duplicate branch to maintain.

I don't understand.  We are in the inner loop and under l_icloglock.
The next time a shutdown can come in is when
xlog_state_do_iclog_callbacks drops l_icloglock.  That is at the end
of the inner loop, which means we will always go back to the
force shutdown check quickly.  So how is the branch duplicate?  Yes,
it also calls xlog_state_do_iclog_callbacks and does the wakeup,
but in doing that early it avoids a whole lot of complicated logic
in the previous code base.
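
A minimal sketch of the flow described above, using hypothetical,
simplified stand-ins (sim_log, sim_iclog, sim_do_callbacks and
sim_process are invented for illustration and are not the actual XFS
code).  The lock is modelled as a plain flag and the run is
single-threaded.  The only window in which a shutdown can arrive is
while the callback helper has dropped the lock, and the next loop
iteration checks for it right away:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; the real iclog state machine has more. */
enum sim_state { SIM_ACTIVE, SIM_DONE_SYNC, SIM_CALLBACK, SIM_DIRTY };

struct sim_iclog {
	enum sim_state	state;
	bool		callbacks_ran;
	bool		aborted;	/* callbacks ran with the abort flag */
};

struct sim_log {
	bool		 locked;	/* models l_icloglock being held */
	bool		 shutdown;	/* models a forced shutdown */
	int		 shutdown_after;/* test hook: shut down in the Nth
					 * unlocked window, 0 = never */
	int		 cb_windows;	/* times the lock was dropped */
	size_t		 nr;
	struct sim_iclog *iclogs;
};

/* Runs callbacks with the lock dropped, then retakes it. */
static void sim_do_callbacks(struct sim_log *log, struct sim_iclog *ic,
			     bool aborted)
{
	log->locked = false;		/* a shutdown may arrive here */
	log->cb_windows++;
	if (log->shutdown_after && log->cb_windows == log->shutdown_after)
		log->shutdown = true;
	ic->callbacks_ran = true;
	ic->aborted = aborted;
	log->locked = true;
}

/* Returns the number of iclogs that completed normally. */
static int sim_process(struct sim_log *log)
{
	int normal = 0;

	log->locked = true;
	for (size_t i = 0; i < log->nr; i++) {
		struct sim_iclog *ic = &log->iclogs[i];

		/* Dedicated shutdown branch: no state transitions at all. */
		if (log->shutdown) {
			sim_do_callbacks(log, ic, true);
			continue;
		}

		if (ic->state != SIM_DONE_SYNC)
			continue;
		ic->state = SIM_CALLBACK;
		sim_do_callbacks(log, ic, false);
		ic->state = SIM_DIRTY;
		normal++;
	}
	log->locked = false;
	return normal;
}
```

With three iclogs in SIM_DONE_SYNC and a shutdown injected during the
first unlocked window, only the first iclog completes normally; the
other two get aborted callbacks and keep their state untouched.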

> I think you're misreading me. I'm not suggesting to fake state changes.
> I'd argue that's actually what the special case shutdown branch does.
> And to the contrary, this patch already implements what I'm suggesting,
> it's just not consistent behavior..

I'm rather confused now.

> First, we basically already go from whatever state we're in to "logical
> CALLBACK" during shutdown. This is just forcibly implemented via the
> IOERROR state. With IOERROR eventually removed, this highlights things
> like whether it's actually safe to make some of those arbitrary
> transitions. It's actually not, because going from WANT_SYNC -> CALLBACK
> is a potential use after free vector of the CIL ctx (as soon as the ctx
> is added to the callback list in the CIL push code). This is yet another
> functional problem that should be fixed before removing IOERROR, IMO
> (and is reproducible via kasan splat, btw). At this point I think some
> of these shutdown checks associated with CALLBACK are simply to ensure
> IOERROR remains persistent once it's set on an iclog. We don't need to
> carry that logic around if IOERROR is going away.

What shutdown check associated with CALLBACK?

> SYNCING -> CALLBACK is another hokey transition in the existing code,
> even if it doesn't currently manifest in a bug that I can see, because
> we should probably still expect (wait for) an I/O completion despite
> that the filesystem had shutdown in the meantime. Fixing that one might
> require tweaks to how the shutdown code actually works (i.e. waiting on
> an I/O vs. running callbacks while in-flight). It's not immediately
> clear to me what the best solution is for that, but I suspect it could
> tie in with fixing the problem noted above.

True, actually running callbacks on various kinds of "in-flight" iclogs
seems rather dangerous.  So should I interpret your above comments
as saying that we should fix that first before killing off the IOERROR
state?


