Re: [PATCH 28/30] xfs: rework xfs_iflush_cluster() dirty inode iteration

Dave Chinner <david@xxxxxxxxxxxxx> · Mon, 15 Jun 2020 11:01:52 +1000

On Thu, Jun 11, 2020 at 09:56:18AM -0400, Brian Foster wrote:
> On Thu, Jun 11, 2020 at 09:40:08AM +1000, Dave Chinner wrote:
> > On Wed, Jun 10, 2020 at 09:06:28AM -0400, Brian Foster wrote:
> > > On Wed, Jun 10, 2020 at 08:01:39AM +1000, Dave Chinner wrote:
> > > > On Tue, Jun 09, 2020 at 09:11:55AM -0400, Brian Foster wrote:
> > > > > On Thu, Jun 04, 2020 at 05:46:04PM +1000, Dave Chinner wrote:
> > > > > > -		 * check is not sufficient.
> > > > > > +		 * If we are shut down, unpin and abort the inode now as there
> > > > > > +		 * is no point in flushing it to the buffer just to get an IO
> > > > > > +		 * completion to abort the buffer and remove it from the AIL.
> > > > > >  		 */
> > > > > > -		if (!cip->i_ino) {
> > > > > > -			xfs_ifunlock(cip);
> > > > > > -			xfs_iunlock(cip, XFS_ILOCK_SHARED);
> > > > > > +		if (XFS_FORCED_SHUTDOWN(mp)) {
> > > > > > +			xfs_iunpin_wait(ip);
> > > > > 
> > > > > Note that we have an unlocked check above that skips pinned inodes.
> > > > 
> > > > Right, but we could be racing with a transaction commit that pinned
> > > > the inode and a shutdown. As the comment says: it's a quick and
> > > > dirty check to avoid trying to get locks when we know that it is
> > > > unlikely we can flush the inode without blocking. We still have to
> > > > recheck that state once we have the ILOCK....
> > > > 
> > > 
> > > Right, but that means we can just as easily skip the shutdown processing
> > > (which waits for unpin) if a particular inode is pinned. So which is
> > > supposed to happen in the shutdown case?
> > >
> > > ISTM that either could happen. As a result this kind of looks like
> > > random logic to me.
> > 
> > Yes, shutdown is racy, so it could be either. However, I'm not
> > changing the shutdown logic or handling here. If the shutdown race
> > could happen before this patchset (and it can), it can still happen
> > after the patchset, and this patchset does not change the way we
> > handle the shutdown race at all.
> > 
> > IOWs, while this shutdown logic may appear "random", that's not a
> > result of this patchset - it a result of design decisions made in
> > the last major shutdown rework/cleanup that required checks to be
> > added to places that could hang waiting for an event that would
> > never occur because shutdown state prevented it from occurring.
> 
> It's not so much the shutdown check that I find random as much as how it
> intends to handle pinned inodes.

I can change that, but that's the shutdown was handled by similar
code in the past, so it seemed appropriate here because this code
was hanging on shutdowns during development.

> > There's already enough complexity in this patchset that adding
> > shutdown logic changes is just too much to ask for.  If we want to
> > change how various shutdown logics work, lets do it as a separate
> > set of changes so all the subtle bugs that result from the changes
> > bisect to the isolated shutdown logic changes...
> > 
> 
> The fact that shutdown is racy is just background context. My point is
> that this patch appears to introduce special shutdown handling for a
> condition where it 1.) didn't previously exist and 2.) doesn't appear to
> be necessary.

The random shutdown failures I kept seeing in shutdown tests says
otherwise.

> The current push/flush code only incorporates a shutdown check
> indirectly via mapping the buffer, which simulates an I/O failure and
> causes us to abort the flush (and shutdown if the I/O failure happened
> for some other reason). If the shutdown happened sometime after we
> acquired the buffer, then there's no real impact on this code path. We
> flush the inode(s) and return success. The shutdown will be handled
> appropriately when xfsaild attempts to submit the buffer.

The long-standing rule of thumb is "shutdown == abort in-progress IO
immediately", which is what I followed here when it became apparent
there was some kind of subtle shutdown issue occurring. That made
the shutdown problem go away.

It may be that changes I've been making to other parts of this
writeback code make the shutdown check here unnecessary. My testing
said otherwise, but maybe that's all been cleared up. Hence if the
shutdown check is truly unnecessary, let's clean it up in a future
patchset where that assertion can be bisected down cleanly.  I
needed this to get fstests to pass, and for this patchset which is
entirely unrelated to shutdown architecture, that's all the
justification that should be necessary.

> The new code no longer maps the buffer because that is done much
> earlier, but for some reason incorporates a new check to abort the flush
> if the fs is already shutdown. The problem I have with this is that
> these checks tend to be brittle, untested and a maintenance burden.

Shutdown has *always* been brittle. You've seen it go from being
completely fucking awful to actually being somewhat reliable because
your experience with XFS matches to roughly when we first added
substantial shutdown validation to fstests.  We had a huge mountain
of technical debt around shutdown, but the importance of addressing
it has historically been pretty low because -shutdown is extremely
rare- in production systems.

> As
> such, I don't think we should ever add new shutdown checks for cases
> that aren't required for functional correctness.

I think the burden of proof is the wrong way around for the current
shutdown architecture. If you want to make a rule like this, you
need to fix define how the shutdown architecture is going to allow
this sort of rule to be applied without placing unreasonable demands
on the patch author.

> That way we hopefully
> move to a state where we have the minimum number of shutdown checks with
> broadest coverage to ensure everything unwinds correctly, but don't have
> to constantly battle with insufficiently tested logic in obscure
> contexts that silently break as surrounding code changes over time and
> leads to random fstests hangs and shutdown logic cleanouts every couple
> of years.

Yes, I think this is an admirable goal, but I think you've got how
to get there completely backward.  First work out the architecture
and logic that allows us to remove/avoid "randomly" placed checks,
then work towards cleaning up the code. We don't get there by saying
"no new checks!" and then ignoring the underlying problems those
checks are trying to avoid.

If you can come up with a shutdown mechanism that:

	a) prevents all IO the moment a shutdown is triggered, and
	b) avoids shutdown detection ordering hangs between
	   different objects and subsystems,

then you have grounds for saying "nobody should need to add new
shutdown checks". But right now, that's not even close to being the
reality. There needs to be lots more cleanup and rework to the
shutdown code to be done before we get anywhere near this...

> So my question for any newly added shutdown check is: what problem does
> this check solve?

Repeated fstests hangs on various shutdown tests while developing
this code.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx