On 10/9/19 7:04 PM, Mike Snitzer wrote:
> On Wed, Oct 09 2019 at 11:44am -0400,
> Nikos Tsironis <ntsironis@xxxxxxxxxxx> wrote:
>
>> On 10/9/19 5:13 PM, Mike Snitzer wrote:
>>> On Tue, Oct 01 2019 at 8:43am -0400,
>>> Nikos Tsironis <ntsironis@xxxxxxxxxxx> wrote:
>>>
>>>> On 10/1/19 3:27 PM, Guruswamy Basavaiah wrote:
>>>>> Hello Nikos,
>>>>> Yes, issue is consistently reproducible with us, in a particular
>>>>> set-up and test case.
>>>>> I will get the access to set-up next week, will try to test and let
>>>>> you know the results before end of next week.
>>>>>
>>>>
>>>> That sounds great!
>>>>
>>>> Thanks a lot,
>>>> Nikos
>>>
>>> Hi Guru,
>>>
>>> Any chance you could try this fix that I've staged to send to Linus?
>>> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.4&id=633b1613b2a49304743c18314bb6e6465c21fd8a
>>>
>>> Short of that, Nikos: do you happen to have a test scenario that teases
>>> out this deadlock?
>>>
>>
>> Hi Mike,
>>
>> Yes,
>>
>> I created a 50G LV and took a snapshot of the same size:
>>
>>   lvcreate -n data-lv -L50G testvg
>>   lvcreate -n snap-lv -L50G -s testvg/data-lv
>>
>> Then I ran the following fio job:
>>
>>   [global]
>>   randrepeat=1
>>   ioengine=libaio
>>   bs=1M
>>   size=6G
>>   offset_increment=6G
>>   numjobs=8
>>   direct=1
>>   iodepth=32
>>   group_reporting
>>   filename=/dev/testvg/data-lv
>>
>>   [test]
>>   rw=write
>>   timeout=180
>>
>> , concurrently with the following script:
>>
>>   lvcreate -n dummy-lv -L1G testvg
>>
>>   while true
>>   do
>>     lvcreate -n dummy-snap -L1M -s testvg/dummy-lv
>>     lvremove -f testvg/dummy-snap
>>   done
>>
>> This reproduced the deadlock for me. I also ran
>> 'echo 30 > /proc/sys/kernel/hung_task_timeout_secs', to reduce the
>> hung task timeout.
>>
>> Nikos.
>
> Very nice, well done. Curious if you've tested with the fix I've staged
> (see above)? If so, does it resolve the deadlock? If you've had
> success I'd be happy to update the tags in the commit header to include
> your Tested-by before sending it to Linus. Also, any review of the
> patch that you can do would be appreciated, and your formal
> Reviewed-by reply would be welcomed and folded in too.
>

Yes, I have tested the staged fix. I forgot to mention it in my previous
mail. I ran the test for the default 'snapshot_cow_threshold' value of
2048 and I also ran it for a value of 1, to stress it a little more. In
both cases everything went fine, the deadlock was gone.

Nikos

> Mike
>

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel