This was enough.

root 156091 0.2 0.2 619356 100816 ? S+ 22:39 0:00 rbd-nbd --device /dev/nbd0 map deadlock
root 156103 0.1 0.2 1463720 89992 ? Dl+ 22:39 0:00 rbd-nbd --device /dev/nbd0 map deadlock
root 156121 0.3 0.2 116528 97020 ? D+ 22:39 0:00 rbd-nbd --device /dev/nbd0 unmap deadlock

#!/bin/bash -x
rbd create deadlock --size 32
rbd-nbd --device /dev/nbd0 map deadlock &
unmap()
{
rbd-nbd --device /dev/nbd0 unmap deadlock
rbd rm deadlock
}
trap unmap EXIT
while true; do
lsblk --noempty /dev/nbd0
r=$?
[ $r -eq 32 ] && continue
[ $r -eq 0 ] && break
done
dd if=/dev/random of=/dev/nbd0 bs=4096 count=1024 oflag=sync &
sleep 0.1
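
If you want to check whether anything is left hanging around after a run, something like this ought to do (untested; the D state matches the ps output above, and the knbd-recv kworker name is taken from Sam's first mail below, so it may look different on other kernels):

# rbd-nbd processes stuck in uninterruptible sleep (D), like the ones in the ps output above
ps aux | awk '$8 ~ /^D/ && /rbd-nbd/'
# any leftover nbd recv kworker after unmap, named as in Sam's report
ps aux | grep '[k]nbd[0-9]*-recv'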

On Wed, Dec 21, 2022 at 10:33 PM Josef Johansson <josef86@xxxxxxxxx> wrote:
>
> That should obviously be
> unmap()
> {
> rbd-nbd unmap
> }
> trap unmap EXIT
>
> On Wed, Dec 21, 2022 at 10:32 PM Josef Johansson <josef86@xxxxxxxxx> wrote:
> >
> > Right, I actually ended up deadlocking rbd-nbd, that's why I switched
> > over to rbd-replay.
> > The flow was
> >
> > rbd-nbd map &
> > unmap()
> > {
> > rbd-nbd unmap
> > }
> > while true; do
> > lsblk --noempty /dev/nbd0
> > r=$?
> > [ $r -eq 32 ] && continue
> > [ $r -eq 0 ] && break
> > done
> > dd if=/dev/random of=/dev/nbd0 bs=4096 count=1 oflag=sync
> >
> > What I did was to ctrl+c the process directly as I started. Maybe
> > adding the following just before dd would be enough.
> > Sadly I have to reboot the whole VM afterwards :)
> >
> > deadlock()
> > {
> > sleep 0.1
> > exit 1
> > }
> > deadlock &
> >
> > On Wed, Dec 21, 2022 at 10:22 PM Sam Perman <sam@xxxxxxxx> wrote:
> > >
> > > Thanks, I'll take a look at that. For reference, the deadlock we are seeing looks similar to the one described at the bottom of this issue: https://tracker.ceph.com/issues/52088
> > >
> > > thanks
> > > sam
> > >
> > > On Wed, Dec 21, 2022 at 4:04 PM Josef Johansson <josef86@xxxxxxxxx> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I made some progress with my testing on a similar issue. Maybe the test will be easy to adapt to your case.
> > >>
> > >> https://tracker.ceph.com/issues/57396
> > >>
> > >> What I can say though is that I don't see the deadlock problem in my testing.
> > >>
> > >> Cheers
> > >> -Josef
> > >>
> > >> On Wed, 21 Dec 2022 at 22:00, Sam Perman <sam@xxxxxxxx> wrote:
> > >>>
> > >>> Hello!
> > >>>
> > >>> I'm trying to chase down a deadlock we occasionally see on the client side
> > >>> when using rbd-nbd and have a question about a lingering process we are
> > >>> seeing.
> > >>>
> > >>> I have a simple test script that will execute the following in order:
> > >>>
> > >>> * use rbd to create a new image
> > >>> * use rbd-nbd to map the image locally
> > >>> * mkfs a file system
> > >>> * mount the image locally
> > >>> * use dd to write some dummy data
> > >>> * unmount the device
> > >>> * use rbd-nbd to unmap the image
> > >>> * use rbd to remove the image
> > >>>
> > >>> After this is all done, there is a lingering process that I'm curious
> > >>> about.
> > >>>
> > >>> The process is called "[kworker/u9:0-knbd0-recv]" (in state "I") and is a
> > >>> child of "[kthreadd]" (in state "S").
> > >>>
> > >>> Is this normal? I don't see any specific problems with it but I'm
> > >>> eventually going to ramp up this test to use a lot of concurrency to see if
> > >>> I can reproduce the deadlock we are seeing, and want to make sure I'm
> > >>> starting clean.
> > >>>
> > >>> Thanks for any insight you have!
> > >>> sam
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx