On Wed, Sep 21, 2016 at 09:06:13AM +1000, Dave Chinner wrote: > On Wed, Sep 21, 2016 at 06:14:53AM +1000, Dave Chinner wrote: > > On Tue, Sep 20, 2016 at 10:33:04AM -0600, Ross Zwisler wrote: > > > I'm consistently able to generate this kernel BUG with both v4.7 and v4.8-rc7. > > > This bug reproduces both with and without DAX. > > > Here is the BUG with v4.8-rc7, passed through kasan_symbolize.py: > > > > > > run fstests generic/026 at 2016-09-20 10:22:58 > > > XFS (pmem0p2): Unmounting Filesystem > > > XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: fs/xfs/xfs_trans.c, line: 309 > > > > It overran the block allocation reservation for the transaction. > > Can you try the patch I've attached below, Ross? it solves the > problem for me.... > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > > xfs: remote attribute blocks aren't really userdata > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > When adding a new remote attribute, we write the attribute to the > new extent before the allocation transaction is committed. This > means we cannot reuse busy extents as that violates crash > consistency semantics. Hence we currently treat remote attribute > extent allocation like userdata because it has the same overwrite > ordering constraints as userdata. > > Unfortunately, this also allows the allocator to incorrectly apply > extent size hints to the remote attribute extent allocation. This > results in interesting failures, such as transaction block > reservation overruns and in-memory inode attribute fork corruption. > > To fix this, we need to separate the busy extent reuse configuration > from the userdata configuration. This changes the definition of > XFS_BMAPI_METADATA slightly - it now means that allocation is > metadata and reuse of busy extents is acceptible due to the metadata > ordering semantics of the journal. If this flag is not set, it > means the allocation is that has unordered data writeback, and hence > busy extent reuse is not allowed. It no longer implies the > allocation is for user data, just that the data write will not be > strictly ordered. This matches the semantics for both user data > and remote attribute block allocation. > > As such, This patch changes the "userdata" field to a "datatype" > field, and adds a "no busy reuse" flag to the field. > When we detect an unordered data extent allocation, we immediately set > the no reuse flag. We then set the "user data" flags based on the > inode fork we are allocating the extent to. Hence we only set > userdata flags on data fork allocations now and consider attribute > fork remote extents to be an unordered metadata extent. > > The result is that remote attribute extents now have the expected > allocation semantics, and the data fork allocation behaviour is > completely unchanged. > > It should be noted that there may be other ways to fix this (e.g. > use ordered metadata buffers for the remote attribute extent data > write) but they are more invasive and difficult to validate both > from a design and implementation POV. Hence this patch takes the > simple, obvious route to fixing the problem... > > Reported-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> Yep, this solves it for me as well. Tested-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html