That should obviously be

unmap()
{
    rbd-nbd unmap
}
trap unmap EXIT

On Wed, Dec 21, 2022 at 10:32 PM Josef Johansson <josef86@xxxxxxxxx> wrote:
>
> Right, I actually ended up deadlocking rbd-nbd, that's why I switched
> over to rbd-replay.
> The flow was
>
> rbd-nbd map &
> unmap()
> {
>     rbd-nbd unmap
> }
> while true; do
>     lsblk --noempty /dev/nbd0
>     r=$?
>     [ $r -eq 32 ] && continue
>     [ $r -eq 0 ] && break
> done
> dd if=/dev/random of=/dev/nbd0 bs=4096 count=1 oflag=sync
>
> What I did was to ctrl+c the process directly after I started it. Maybe
> adding the following just before the dd would be enough.
> Sadly I have to reboot the whole VM afterwards :)
>
> deadlock()
> {
>     sleep 0.1
>     exit 1
> }
> deadlock &
>
> On Wed, Dec 21, 2022 at 10:22 PM Sam Perman <sam@xxxxxxxx> wrote:
> >
> > Thanks, I'll take a look at that. For reference, the deadlock we are seeing
> > looks similar to the one described at the bottom of this issue:
> > https://tracker.ceph.com/issues/52088
> >
> > thanks
> > sam
> >
> > On Wed, Dec 21, 2022 at 4:04 PM Josef Johansson <josef86@xxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> I made some progress with my testing on a similar issue. Maybe the test
> >> will be easy to adapt to your case.
> >>
> >> https://tracker.ceph.com/issues/57396
> >>
> >> What I can say, though, is that I don't see the deadlock problem in my testing.
> >>
> >> Cheers
> >> -Josef
> >>
> >> On Wed, 21 Dec 2022 at 22:00, Sam Perman <sam@xxxxxxxx> wrote:
> >>>
> >>> Hello!
> >>>
> >>> I'm trying to chase down a deadlock we occasionally see on the client side
> >>> when using rbd-nbd and have a question about a lingering process we are
> >>> seeing.
> >>>
> >>> I have a simple test script that will execute the following in order:
> >>>
> >>> * use rbd to create a new image
> >>> * use rbd-nbd to map the image locally
> >>> * mkfs a file system
> >>> * mount the image locally
> >>> * use dd to write some dummy data
> >>> * unmount the device
> >>> * use rbd-nbd to unmap the image
> >>> * use rbd to remove the image
> >>>
> >>> After this is all done, there is a lingering process that I'm curious
> >>> about.
> >>>
> >>> The process is called "[kworker/u9:0-knbd0-recv]" (in state "I") and is a
> >>> child of "[kthreadd]" (in state "S").
> >>>
> >>> Is this normal? I don't see any specific problems with it, but I'm
> >>> eventually going to ramp up this test to use a lot of concurrency to see if
> >>> I can reproduce the deadlock we are seeing, and want to make sure I'm
> >>> starting clean.
> >>>
> >>> Thanks for any insight you have!
> >>> sam
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
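
[Editor's note: below is a minimal sketch of the repro flow Josef describes, with the
trap-based unmap correction applied. The image spec "testpool/testimg" and the device
path /dev/nbd0 are assumptions for illustration, not taken from the thread; the original
snippets omit the map arguments.]

#!/usr/bin/env bash
# Sketch: map an image with rbd-nbd, wait for the nbd device to become usable,
# then issue a single synchronous write. Cleanup happens via the EXIT trap,
# so an early ctrl+c still attempts an unmap.
set -u

unmap()
{
    # Device path is an assumption; adjust to whatever rbd-nbd mapped.
    rbd-nbd unmap /dev/nbd0
}
trap unmap EXIT

# Image spec is illustrative only.
rbd-nbd map testpool/testimg &

# Wait until the device shows up with a backing connection.
# In Josef's test, lsblk --noempty exits 32 while the device is still empty
# and 0 once it is ready.
while true; do
    lsblk --noempty /dev/nbd0
    r=$?
    [ $r -eq 32 ] && continue
    [ $r -eq 0 ] && break
done

# Optionally schedule an early exit just before the write to provoke the
# shutdown-while-mapping case described above (commented out by default).
deadlock()
{
    sleep 0.1
    exit 1
}
# deadlock &

# Single synchronous 4 KiB write against the mapped device.
dd if=/dev/random of=/dev/nbd0 bs=4096 count=1 oflag=sync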
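
[Editor's note: for reference, a sketch of the create/map/mkfs/mount/dd/umount/unmap/remove
sequence Sam outlines. The pool and image names, size, filesystem type, and mount point are
illustrative assumptions; the original post does not give them.]

#!/usr/bin/env bash
# Sketch of the eight-step test sequence from the original message.
set -eu

POOL=rbd            # assumption
IMAGE=nbdtest       # assumption
MNT=/mnt/nbdtest    # assumption

rbd create --size 1G "${POOL}/${IMAGE}"      # use rbd to create a new image
DEV=$(rbd-nbd map "${POOL}/${IMAGE}")        # map it; prints the device, e.g. /dev/nbd0
mkfs.ext4 "${DEV}"                           # mkfs a file system (ext4 chosen here)
mkdir -p "${MNT}"
mount "${DEV}" "${MNT}"                      # mount the image locally

# Write some dummy data synchronously.
dd if=/dev/zero of="${MNT}/dummy" bs=1M count=16 oflag=sync

umount "${MNT}"                              # unmount the device
rbd-nbd unmap "${DEV}"                       # unmap the image
rbd rm "${POOL}/${IMAGE}"                    # remove the image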