Hello Nikos,
Tested with your new patches. The issue is resolved. Thank you.

In the second patch, I changed "struct wait_queue_head" to "wait_queue_head_t"
for the variable in_progress_wait; otherwise compilation fails with the error
"error: field 'in_progress_wait' has incomplete type
 struct wait_queue_head in_progress_wait;"

Attached the changed patch.

Guru

On Sat, 12 Oct 2019 at 14:16, Guruswamy Basavaiah <guru2018@xxxxxxxxx> wrote:
>
> Hello Nikos,
> I am having some issues in our set-up, I will try to get the results ASAP.
> Guru
>
>
> On Fri, 11 Oct 2019 at 17:47, Nikos Tsironis <ntsironis@xxxxxxxxxxx> wrote:
> >
> > On 10/11/19 2:39 PM, Nikos Tsironis wrote:
> > > On 10/11/19 1:17 PM, Guruswamy Basavaiah wrote:
> > >> Hello Nikos,
> > >> Applied these patches and tested.
> > >> We still see hung_task_timeout back traces and the drbd Resync is blocked.
> > >> Attached the back trace, please let me know if you need any other information.
> > >>
> > >
> > > Hi Guru,
> > >
> > > Can you provide more information about your setup? The output of
> > > 'dmsetup table', 'dmsetup ls --tree' and the DRBD configuration would
> > > help to get a better picture of your I/O stack.
> > >
> > > Also, is it possible to describe the test case you are running and
> > > exactly what it does?
> > >
> > > Thanks,
> > > Nikos
> > >
> >
> > Hi Guru,
> >
> > I believe I found the mistake. The in_progress variable was never
> > initialized to zero.
> >
> > I attach a new version of the second patch correcting this.
> >
> > Can you please test again with this patch?
> >
> > Thanks,
> > Nikos
> >
> > >> In patch "0002-dm-snapshot-rework-COW-throttling-to-fix-deadlock.patch"
> > >> I changed "struct wait_queue_head" to "wait_queue_head_t" as I was
> > >> getting a compilation error with the former one.
> > >>
> > >> On Thu, 10 Oct 2019 at 17:33, Nikos Tsironis <ntsironis@xxxxxxxxxxx> wrote:
> > >>>
> > >>> On 10/10/19 9:34 AM, Guruswamy Basavaiah wrote:
> > >>>> Hello,
> > >>>> We use 4.4.184 in our builds and the patch fails to apply.
> > >>>> Is it possible to give a patch for 4.4.x branch ?
> > >>> Hi Guru,
> > >>>
> > >>> I attach the two patches fixing the deadlock rebased on the 4.4.x branch.
> > >>>
> > >>> Nikos
> > >>>
> > >>>>
> > >>>> patching Logs.
> > >>>> patching file drivers/md/dm-snap.c
> > >>>> Hunk #1 succeeded at 19 (offset 1 line).
> > >>>> Hunk #2 succeeded at 105 (offset -1 lines).
> > >>>> Hunk #3 succeeded at 157 (offset -4 lines).
> > >>>> Hunk #4 succeeded at 1206 (offset -120 lines).
> > >>>> Hunk #5 FAILED at 1508.
> > >>>> Hunk #6 succeeded at 1412 (offset -124 lines).
> > >>>> Hunk #7 succeeded at 1425 (offset -124 lines).
> > >>>> Hunk #8 FAILED at 1925.
> > >>>> Hunk #9 succeeded at 1866 with fuzz 2 (offset -255 lines).
> > >>>> Hunk #10 succeeded at 2202 (offset -294 lines).
> > >>>> Hunk #11 succeeded at 2332 (offset -294 lines).
> > >>>> 2 out of 11 hunks FAILED -- saving rejects to file drivers/md/dm-snap.c.rej
> > >>>>
> > >>>> Guru
> > >>>>
> > >>>> On Thu, 10 Oct 2019 at 01:33, Guruswamy Basavaiah <guru2018@xxxxxxxxx> wrote:
> > >>>>>
> > >>>>> Hello Mike,
> > >>>>> I will get the testing result before end of Thursday.
> > >>>>> Guru
> > >>>>>
> > >>>>> On Wed, 9 Oct 2019 at 21:34, Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> > >>>>>>
> > >>>>>> On Wed, Oct 09 2019 at 11:44am -0400,
> > >>>>>> Nikos Tsironis <ntsironis@xxxxxxxxxxx> wrote:
> > >>>>>>
> > >>>>>>> On 10/9/19 5:13 PM, Mike Snitzer wrote:
> > >>>>>>>> On Tue, Oct 01 2019 at 8:43am -0400,
> > >>>>>>>> Nikos Tsironis <ntsironis@xxxxxxxxxxx> wrote:
> > >>>>>>>>
> > >>>>>>>>> On 10/1/19 3:27 PM, Guruswamy Basavaiah wrote:
> > >>>>>>>>>> Hello Nikos,
> > >>>>>>>>>> Yes, issue is consistently reproducible with us, in a particular
> > >>>>>>>>>> set-up and test case.
> > >>>>>>>>>> I will get the access to set-up next week, will try to test and let
> > >>>>>>>>>> you know the results before end of next week.
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> That sounds great!
> > >>>>>>>>>
> > >>>>>>>>> Thanks a lot,
> > >>>>>>>>> Nikos
> > >>>>>>>>
> > >>>>>>>> Hi Guru,
> > >>>>>>>>
> > >>>>>>>> Any chance you could try this fix that I've staged to send to Linus?
> > >>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.4&id=633b1613b2a49304743c18314bb6e6465c21fd8a
> > >>>>>>>>
> > >>>>>>>> Short of that, Nikos: do you happen to have a test scenario that teases
> > >>>>>>>> out this deadlock?
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> Hi Mike,
> > >>>>>>>
> > >>>>>>> Yes,
> > >>>>>>>
> > >>>>>>> I created a 50G LV and took a snapshot of the same size:
> > >>>>>>>
> > >>>>>>> lvcreate -n data-lv -L50G testvg
> > >>>>>>> lvcreate -n snap-lv -L50G -s testvg/data-lv
> > >>>>>>>
> > >>>>>>> Then I ran the following fio job:
> > >>>>>>>
> > >>>>>>> [global]
> > >>>>>>> randrepeat=1
> > >>>>>>> ioengine=libaio
> > >>>>>>> bs=1M
> > >>>>>>> size=6G
> > >>>>>>> offset_increment=6G
> > >>>>>>> numjobs=8
> > >>>>>>> direct=1
> > >>>>>>> iodepth=32
> > >>>>>>> group_reporting
> > >>>>>>> filename=/dev/testvg/data-lv
> > >>>>>>>
> > >>>>>>> [test]
> > >>>>>>> rw=write
> > >>>>>>> timeout=180
> > >>>>>>>
> > >>>>>>> , concurrently with the following script:
> > >>>>>>>
> > >>>>>>> lvcreate -n dummy-lv -L1G testvg
> > >>>>>>>
> > >>>>>>> while true
> > >>>>>>> do
> > >>>>>>>     lvcreate -n dummy-snap -L1M -s testvg/dummy-lv
> > >>>>>>>     lvremove -f testvg/dummy-snap
> > >>>>>>> done
> > >>>>>>>
> > >>>>>>> This reproduced the deadlock for me. I also ran 'echo 30 >
> > >>>>>>> /proc/sys/kernel/hung_task_timeout_secs', to reduce the hung task
> > >>>>>>> timeout.
> > >>>>>>>
> > >>>>>>> Nikos.
> > >>>>>>
> > >>>>>> Very nice, well done. Curious if you've tested with the fix I've staged
> > >>>>>> (see above)? If so, does it resolve the deadlock? If you've had
> > >>>>>> success I'd be happy to update the tags in the commit header to include
> > >>>>>> your Tested-by before sending it to Linus. Also, any review of the
> > >>>>>> patch that you can do would be appreciated and with your formal
> > >>>>>> Reviewed-by reply would be welcomed and folded in too.
> > >>>>>>
> > >>>>>> Mike
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Guruswamy Basavaiah
> > >>>>
> > >>>>
> > >>>>
> > >>
> > >>
> > >>
> >
>
> --
> Guruswamy Basavaiah

--
Guruswamy Basavaiah

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel