Re: [PATCH 37/45] drivers: use req op accessor

Ross Zwisler <zwisler@xxxxxxxxx> · Wed, 3 Aug 2016 16:33:15 -0600

On Sun, Jun 5, 2016 at 1:32 PM,  <mchristi@xxxxxxxxxx> wrote:
> From: Mike Christie <mchristi@xxxxxxxxxx>
>
> The req operation REQ_OP is separated from the rq_flag_bits
> definition. This converts the block layer drivers to
> use req_op to get the op from the request struct.
>
> Signed-off-by: Mike Christie <mchristi@xxxxxxxxxx>
> ---
>  drivers/block/loop.c              |  6 +++---
>  drivers/block/mtip32xx/mtip32xx.c |  2 +-
>  drivers/block/nbd.c               |  2 +-
>  drivers/block/rbd.c               |  4 ++--
>  drivers/block/xen-blkfront.c      |  8 +++++---
>  drivers/ide/ide-floppy.c          |  2 +-
>  drivers/md/dm.c                   |  2 +-
>  drivers/mmc/card/block.c          |  7 +++----
>  drivers/mmc/card/queue.c          |  6 ++----

Dave Chinner reported a deadlock with XFS + DAX, which I reproduced
and bisected to this commit:

commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34
Author: Mike Christie <mchristi@xxxxxxxxxx>
Date:   Sun Jun 5 14:32:17 2016 -0500
drivers: use req op accessor

Here are the steps to reproduce the deadlock with a BRD ramdisk:

mkfs.xfs -f /dev/ram0
mount -o dax /dev/ram0 /mnt/scratch
xfs_io -f -c "truncate 1g" /mnt/scratch/test.img
losetup -f --show /mnt/scratch/test.img
mkfs.xfs -f /dev/loop0

At this point the mkfs.xfs deadlocks.  Here is the stack trace
gathered via "echo w > /proc/sysrq-trigger" and passed through
kasan_symbolize.py:

brd: module loaded
XFS (ram0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
XFS (ram0): Mounting V5 Filesystem
XFS (ram0): Ending clean mount
sysrq: SysRq : Show Blocked State
  task                        PC stack   pid father
mkfs.xfs        D ffff88060ae47b38     0  1482   1287 0x00000000
 ffff88060ae47b38 00000000000079e8 ffff880610fd8d98 ffff880036011a40
 ffff8800aa6dcec0 ffff88060ae48000 ffff880610fd8d80 7fffffffffffffff
 ffff8800aa6dcec0 00000000024000c0 ffff88060ae47b50 ffffffff81aca775
Call Trace:
 [<ffffffff81aca775>] schedule+0x35/0x80 kernel/sched/core.c:3360
 [<ffffffff81acf431>] schedule_timeout+0x271/0x460 kernel/time/timer.c:1493
 [<ffffffff81ac9c34>] io_schedule_timeout+0xa4/0x110 kernel/sched/core.c:4969
 [<     inline     >] do_wait_for_common kernel/sched/completion.c:75
 [<     inline     >] __wait_for_common kernel/sched/completion.c:93
 [<     inline     >] wait_for_common_io kernel/sched/completion.c:107
 [<ffffffff81acb33f>] wait_for_completion_io+0xdf/0x120
kernel/sched/completion.c:155
 [<ffffffff81573206>] submit_bio_wait+0x66/0x90 block/bio.c:870
 [<ffffffff81588016>] blkdev_issue_discard+0x86/0xc0 block/blk-lib.c:115
 [<ffffffff8158ea23>] blk_ioctl_discard+0xa3/0xd0 block/ioctl.c:221
 [<ffffffff8158f5da>] blkdev_ioctl+0x60a/0x9e0 block/ioctl.c:510
 [<ffffffff812bddb3>] block_ioctl+0x43/0x50 fs/block_dev.c:1714
 [<     inline     >] vfs_ioctl fs/ioctl.c:43
 [<ffffffff8128ec72>] do_vfs_ioctl+0xa2/0x6a0 fs/ioctl.c:674
 [<     inline     >] SYSC_ioctl fs/ioctl.c:689
 [<ffffffff8128f2e9>] SyS_ioctl+0x79/0x90 fs/ioctl.c:680
 [<ffffffff81ad0abc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
arch/x86/entry/entry_64.S:207

The line numbers are for the commit above, not for linux/master.  This
occurs 100% as of this commit, and 0% with the previous commit.

This doesn't occur if you don't use DAX, but based on the content of
the commit I'm guessing that difference is due to variations in the
way the two paths use discard.

- Ross

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel