On Thu, Sep 01 2016 at 7:17pm -0400,
Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:

> On 09/01/2016 03:27 PM, Mike Snitzer wrote:
> >On Thu, Sep 01 2016 at 6:22pm -0400,
> >Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:
> >
> >>On 09/01/2016 03:18 PM, Mike Snitzer wrote:
> >>>FYI I get the same 'dmsetup suspend --nolockfs --noflush mp' hang,
> >>>running mptest's test_02_sdev_delete, when I try your unmodified
> >>>patchset, see:
> >>>
> >>>http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=devel.bart
> >>
> >>Hello Mike,
> >>
> >>Are you aware that the code on that branch is a *modified* version
> >>of my patch series? The following patch is not present on that
> >>branch: "dm path selector: Avoid that device removal triggers an
> >>infinite loop". There are also other (smaller) differences.
> >
> >No, you're obviously talking about the 'devel' branch and not the
> >'devel.bart' branch I pointed to. The 'devel.bart' branch is the
> >_exact_ patchset you sent. It has the same problem as the 'devel'
> >branch.
>
> Hello Mike,
>
> Sorry that I misread your previous e-mail. After I received your
> latest e-mail I rebased my tree on top of the devel.bart branch
> mentioned above. My tests still pass. The only two patches in my
> tree that are relevant and that are not in the devel.bart branch
> have been attached to this e-mail. Did your test involve the sd
> driver? If so, do the attached two patches help? If the sd driver
> was not involved, can you provide more information about the hang
> you ran into? The output and log messages generated by the following
> commands after the hang has been reproduced would be very welcome:
> * echo w > /proc/sysrq-trigger
> * (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list})

sd is used. I'll apply those patches and test, tomorrow, but I'm pretty
skeptical. Haven't had any problems with these tests for quite a while.
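For anyone reproducing the hang, the two diagnostic commands quoted above could be wrapped roughly as follows. This is only a sketch: the function name collect_hang_info and the SYSFS_BLOCK / SYSRQ override variables are made up here for illustration and are not part of mptest or of either patchset. The brace expansion from the quoted command is spelled out so that it also works in a plain POSIX shell.

```shell
# Hypothetical helper (not part of mptest or the patch series): gather the
# two pieces of hang diagnostics requested in the quoted mail.
# SYSFS_BLOCK and SYSRQ are assumptions added only so the function can be
# dry-run against a scratch directory instead of the live system.
collect_hang_info() {
    sysfs_block="${SYSFS_BLOCK:-/sys/block}"
    sysrq="${SYSRQ:-/proc/sysrq-trigger}"

    # Ask the kernel to dump blocked (uninterruptible) tasks to the kernel log.
    echo w > "$sysrq" || return 1

    # Print every blk-mq "pending" and per-cpu "rq_list" attribute of the dm
    # devices, each line prefixed with its file name. -a treats the files as
    # text; -s hides errors for glob patterns that match nothing.
    (cd "$sysfs_block" && grep -a -s '' dm*/mq/*/pending dm*/mq/*/cpu*/rq_list)

    # grep exits non-zero when nothing matched; that is not an error here.
    return 0
}
```

Run as root once the hang has been reproduced, e.g. `collect_hang_info > hang-info.txt`; the blocked-task dump triggered by the sysrq write ends up in the kernel log (dmesg) rather than on stdout.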

The tests I'm running are just those in the mptest testsuite, see:
https://github.com/snitm/mptest

Running them should be as simple as:
git clone git://github.com/snitm/mptest.git
cd mptest
./runtest

The default is to use dm-mq on scsi-mq on top of tcmloop.

multipath -ll shows:

mp () dm-4 LIO-ORG ,rd
size=1.0G features='4 queue_if_no_path retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=-1 status=active
| |- 7:0:1:0 sdj 8:144 active ready running
| `- 8:0:1:0 sdk 8:160 active ready running
`-+- policy='queue-length 0' prio=-1 status=enabled
  |- 9:0:1:0 sdl 8:176 active ready running
  `- 10:0:1:0 sdm 8:192 active ready running

[ 4839.452237] scsi host7: TCM_Loopback
[ 4839.472788] scsi host8: TCM_Loopback
[ 4839.492867] scsi host9: TCM_Loopback
[ 4839.512841] scsi host10: TCM_Loopback
[ 4839.549430] scsi 7:0:1:0: Direct-Access LIO-ORG rd 4.0 PQ: 0 ANSI: 5
[ 4839.570556] scsi 7:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.577562] scsi 7:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 1
[ 4839.587810] sd 7:0:1:0: [sdj] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.587830] sd 7:0:1:0: Attached scsi generic sg10 type 0
[ 4839.593569] sd 7:0:1:0: alua: transition timeout set to 60 seconds
[ 4839.593572] sd 7:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.608254] scsi 8:0:1:0: Direct-Access LIO-ORG rd 4.0 PQ: 0 ANSI: 5
[ 4839.626620] sd 7:0:1:0: [sdj] Write Protect is off
[ 4839.631974] sd 7:0:1:0: [sdj] Mode Sense: 43 00 00 08
[ 4839.631999] sd 7:0:1:0: [sdj] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.642209] loopback/naa.50014056fcae4fb4: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.652646] sd 7:0:1:0: [sdj] Attached SCSI disk
[ 4839.673568] scsi 8:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.680573] scsi 8:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 2
[ 4839.690814] sd 8:0:1:0: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.690888] sd 8:0:1:0: Attached scsi generic sg11 type 0
[ 4839.696543] sd 8:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.711419] scsi 9:0:1:0: Direct-Access LIO-ORG rd 4.0 PQ: 0 ANSI: 5
[ 4839.722730] sd 8:0:1:0: [sdk] Write Protect is off
[ 4839.728076] sd 8:0:1:0: [sdk] Mode Sense: 43 00 00 08
[ 4839.728094] sd 8:0:1:0: [sdk] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.738298] loopback/naa.500140553365fbe6: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.748700] sd 8:0:1:0: [sdk] Attached SCSI disk
[ 4839.771561] scsi 9:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.778567] scsi 9:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 3
[ 4839.788794] sd 9:0:1:0: [sdl] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.788823] sd 9:0:1:0: Attached scsi generic sg12 type 0
[ 4839.794546] sd 9:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.809308] scsi 10:0:1:0: Direct-Access LIO-ORG rd 4.0 PQ: 0 ANSI: 5
[ 4839.820806] sd 9:0:1:0: [sdl] Write Protect is off
[ 4839.826161] sd 9:0:1:0: [sdl] Mode Sense: 43 00 00 08
[ 4839.826181] sd 9:0:1:0: [sdl] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.836379] loopback/naa.5001405631dca816: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.846762] sd 9:0:1:0: [sdl] Attached SCSI disk
[ 4839.856572] scsi 10:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.863673] scsi 10:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 4
[ 4839.874002] sd 10:0:1:0: [sdm] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.874033] sd 10:0:1:0: Attached scsi generic sg13 type 0
[ 4839.879549] sd 10:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.897162] sd 10:0:1:0: [sdm] Write Protect is off
[ 4839.902613] sd 10:0:1:0: [sdm] Mode Sense: 43 00 00 08
[ 4839.902632] sd 10:0:1:0: [sdm] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.912935] loopback/naa.5001405afca06b48: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.923291] sd 10:0:1:0: [sdm] Attached SCSI disk
[ 4841.065972] device-mapper: multipath queue-length: version 0.2.0 loaded

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel