On 03/15/2017, 01:24 PM, Nikolay Borisov wrote: > From: Brian Foster <bfoster@xxxxxxxxxx> > > The total field from struct xfs_alloc_arg is a bit of an unknown > commodity. It is documented as the total block requirement for the > transaction and is used in this manner from most call sites by virtue of > passing the total block reservation of the transaction associated with > an allocation. Several xfs_bmapi_write() callers pass hardcoded values > of 0 or 1 for the total block requirement, which is a historical oddity > without any clear reasoning. > > The xfs_iomap_write_direct() caller, for example, passes 0 for the total > block requirement. This has been determined to cause problems in the > form of ABBA deadlocks of AGF buffers due to incorrect AG selection in > the block allocator. Specifically, the xfs_alloc_space_available() > function incorrectly selects an AG that doesn't actually have sufficient > space for the allocation. This occurs because the args.total field is 0 > and thus the remaining free space check on the AG doesn't actually > consider the size of the allocation request. This locks the AGF buffer, > the allocation attempt proceeds and ultimately fails (in > xfs_alloc_fix_minleft()), and xfs_alloc_vexent() moves on to the next > AG. In turn, this can lead to incorrect AG locking order (if the > allocator wraps around, attempting to lock AG 0 after acquiring AG N) > and thus deadlock if racing with another operation. This problem has > been reproduced via generic/299 on smallish (1GB) ramdisk test devices. > > To avoid this problem, replace the undocumented hardcoded total > parameters from the iomap and utility callers to pass the block > reservation used for the associated transaction. This is consistent with > other xfs_bmapi_write() callers throughout XFS. The assumption is that > the total field allows the selection of an AG that can handle the entire > operation rather than simply the allocation/range being requested (e.g., > resulting btree splits, etc.). This addresses the aforementioned > generic/299 hang by ensuring AG selection only occurs when the > allocation can be satisfied by the AG. > > Reported-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx> > Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx> > Signed-off-by: Nikolay Borisov <nborisov@xxxxxxxx> > --- > > Here is the backport for 3.12 Applied, thanks! -- js suse labs -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html