We've had reports of soft lockup warnings in the iomap ioend completion path due to very large bios and/or bio chains. This occurs because ioend completion touches every page associated with the ioend. It generally requires exceedingly large (i.e. multi-GB) bios or bio chains to reproduce a soft lockup warning, but even with smaller ioends there's really no good reason to incur the cost of potential cacheline misses in bio completion context. Divert ioends larger than 1MB to the workqueue so completion occurs in non-atomic context and can reschedule to avoid soft lockup warnings. Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> --- fs/xfs/xfs_aops.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 84cd6cf46b12..05b1bb146f17 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -30,6 +30,13 @@ XFS_WPC(struct iomap_writepage_ctx *ctx) return container_of(ctx, struct xfs_writepage_ctx, ctx); } +/* + * Completion touches every page associated with the ioend. Send anything + * larger than 1MB (based on 4k pages) or so to the completion workqueue to + * avoid this work in bio completion context. + */ +#define XFS_LARGE_IOEND (256ULL << PAGE_SHIFT) + /* * Fast and loose check if this write could update the on-disk inode size. */ @@ -409,9 +416,14 @@ xfs_prepare_ioend( memalloc_nofs_restore(nofs_flag); - /* send ioends that might require a transaction to the completion wq */ + /* + * Send ioends that might require a transaction or are large enough that + * we don't want to do page processing in bio completion context to the + * wq. + */ if (xfs_ioend_is_append(ioend) || ioend->io_type == IOMAP_UNWRITTEN || - (ioend->io_flags & IOMAP_F_SHARED)) + (ioend->io_flags & IOMAP_F_SHARED) || + ioend->io_size >= XFS_LARGE_IOEND) ioend->io_bio->bi_end_io = xfs_end_bio; return status; } -- 2.26.3