Re: Deadlock between block allocation and block truncation

Nikolay Borisov <n.borisov.lkml@xxxxxxxxx> · Wed, 12 Apr 2017 20:44:32 +0300

On 12.04.2017 19:10, Christoph Hellwig wrote:
> Hi Nikolay,
> 
> I guess the culprit is that truncate can free up to two extents in
> the same transaction and thus try to lock two different AGs without
> requiring them to be in increasing order.

On the other hand, Darrick suggested that the problem might be in the
allocation path: it holds a dirty buffer for AGF1 and then proceeds to
lock AGF0, which is a locking order violation. The bli for AGF1 in the
allocating task is:

crash> struct xfs_buf_log_item.bli_flags 0xffff8800a60b1570
  bli_flags = 2

That's XFS_BLI_DIRTY. According to Darrick, here is what *should*
happen:

"
djwong: either agf1 is clean and it needs to release that before going
for agf0, or agf1 is dirty and thus it cannot go for agf0
"

In this case agf1 is dirty and the allocation path continues to agf0,
which is a clear lock order violation?
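
To make the rule concrete, here is a toy user-space model of what
Darrick describes. It is not kernel code: check_agf_lock and its
arguments are invented for illustration, and it only models the AGFs
held by a single transaction as a map from AG number to a dirty flag.

```python
def check_agf_lock(held, target):
    """Toy model of the AGF lock ordering rule quoted above.

    held:   dict mapping AG number -> True if that AGF buffer is dirty
            in the current transaction (hypothetical representation).
    target: AG number whose AGF the transaction wants to lock next.

    Returns (allowed, must_release), where must_release lists clean,
    higher-numbered AGFs that have to be dropped before locking target.
    """
    higher = [agno for agno in sorted(held) if agno > target]
    # A dirty higher-numbered AGF cannot be released, so going for a
    # lower-numbered AGF would violate the ascending lock order.
    if any(held[agno] for agno in higher):
        return False, []
    # Clean higher-numbered AGFs must be released first.
    return True, higher


if __name__ == "__main__":
    print(check_agf_lock({1: True}, 0))   # allocation side: (False, [])
    print(check_agf_lock({1: False}, 0))  # clean agf1: (True, [1])
    print(check_agf_lock({0: True}, 1))   # truncation side: (True, [])
```

The first call corresponds to the allocating task (dirty agf1, going
for agf0), the last one to the truncating task (dirty agf0, going for
agf1).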

On the truncation side, the bli flags for agf0 are:

crash> struct -x xfs_buf_log_item.bli_flags 0xffff8801394ed2b8
  bli_flags = 0xa => XFS_BLI_DIRTY | XFS_BLI_LOGGED

It then proceeds to lock AGF1, which is the correct (ascending) order.
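
For reference, the two bli_flags values quoted in this mail can be
decoded with a small user-space helper. This is only a sketch: the bits
for XFS_BLI_DIRTY (0x2) and XFS_BLI_LOGGED (0x8) follow from the crash
output here, while XFS_BLI_HOLD and XFS_BLI_STALE are my assumption
from fs/xfs/xfs_buf_item.h and should be checked against the running
kernel. decode_bli_flags is a made-up name, not a crash command.

```python
# Sketch: decode a bli_flags value copied from crash output.
# Only DIRTY and LOGGED are confirmed by the values in this thread;
# HOLD and STALE are assumed from fs/xfs/xfs_buf_item.h.
BLI_FLAGS = {
    0x1: "XFS_BLI_HOLD",    # assumed
    0x2: "XFS_BLI_DIRTY",
    0x4: "XFS_BLI_STALE",   # assumed
    0x8: "XFS_BLI_LOGGED",
}


def decode_bli_flags(value):
    """Return a 'A | B' style string for the bits set in value."""
    names = [name for bit, name in sorted(BLI_FLAGS.items()) if value & bit]
    # Bits not covered by the table are reported as a raw hex value.
    unknown = value & ~sum(BLI_FLAGS)
    if unknown:
        names.append(hex(unknown))
    return " | ".join(names) if names else "0"


if __name__ == "__main__":
    print(decode_bli_flags(2))    # allocation side, AGF1
    print(decode_bli_flags(0xa))  # truncation side, AGF0
```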

In spite of this, your patch is likely to help in this situation, though
I'm not sure it modifies the right side of the violation.

Regards,
Nikolay