Re: [PATCH] xfs: hold xfs_buf locked between shortform->leaf conversion and the addition of an attribute

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 10, 2017 at 03:09:09PM +0300, Alex Lyakas wrote:
> Hi Dave,
> 
> Thanks for the explanation. So it seems we cannot move forward with this
> fix.
> 

I don't think this completely invalidates the fix.. Dave is pointing out
a flaw that the deferred ops infrastructure doesn't properly handle the
technique we want to use here. IOW, it means there's a dependency that
needs to be implemented first.

FWIW, I also think this means that your approach on the older kernel to
join/hold the buffer to the finished transaction may be the right
approach (depending on whether I follow the perm transaction code
correctly or not, see below), but I think you'd need to relog the buffer
as well.

> Will somebody else in XFS community be working on fixing this issue? As you
> pointed out, it exists for over two decades. Our production systems hit this
> every couple of days, and shutting down the filesystem causes outage.
> 

I'm guessing the defer infrastructure needs to handle relogging a buffer
similar to how it currently handles joining/relogging inodes..?

...
> -----Original Message----- From: Dave Chinner
...
> The problem is that the locked buffer is not joined and logged in
> the rolling transactions run in xfs_defer_ops. Hence it can pin the
> tail of the AIL, and this can prevent the transaction roll from
> regranting the log space necessary to continue rolling the
> transaction for the required number of transactions to complete the
> deferred ops. If this happens, we end up with a log space deadlock.
> 
> Hence if we are holding an item that we logged in a transaction
> locked and we roll the transaction, we have to join, hold and log it
> in each subsequent transaction until we have finished with the item
> and can unlock and release it.
> 
> This is documented in xfs_trans_roll():
> 
>        /*
>         * Reserve space in the log for th next transaction.
>         * This also pushes items in the "AIL", the list of logged items,
>         * out to disk if they are taking up space at the tail of the log
>         * that we want to use.  This requires that either nothing be locked
>         * across this call, or that anything that is locked be logged in
>         * the prior and the next transactions.
>         */
> 

Good catch, though I'm wondering whether it's a real enough problem atm
to block this fix. A few thoughts/questions:

1.) The transaction in this case has a t_log_count of 3, presumably to
cover the commits by the historical bmap_finish, the trans roll and the
final commit. If I'm following the permanent transaction code correctly,
doesn't that mean that we have room for at least 2 rolls (and 3 commits)
before this task would actually block on log reservation? AFAICT it
looks like the commit would dec ticket->t_cnt and replenish the current
log reservation. The subsequent xfs_trans_reserve() would just return if
t_cnt > 0.

This of course doesn't accommodate the fact the xfs_defer_finish() can
now roll a transaction an indeterminate number of times, which probably
needs to be handled in general, but...

2.) It doesn't look like we actually defer any ops in this situation
unless rmapbt is enabled, assuming that we limit holding the buffer to
the empty leaf case, at least (see my comment on the previous version).
I also don't see where a deferred rmapbt update would itself ever roll
the transaction.

3.) The buffer in this case is a new allocation, which I think means the
risk of it pinning the tail and causing a log deadlock here means that
on top of somehow depleting the initial permanent reservation, we'd have
to exhaust the log in the time between the first commit and the last
reservation.

Given the above, it seems reasonably safe enough to me to merge this
change as is and fix up the deferred ops stuff after the fact
(considering we know we need to rework the xattr stuff as such anyways).
This is still a landmine that should be fixed up, but I wouldn't be
against an ASSERT() or something for the time being if we could somehow
verify that the transaction ticket didn't require any extra reservation.

OTOH, just adding deferred ops buffer relogging might not be too much
trouble either. ;) Anyways, thoughts?

Brian

> 
> Cheers,
> 
> Dave.
> 
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux