Right, I actually ended up deadlocking rbd-nbd; that's why I switched over
to rbd-replay. The flow was:

rbd-nbd map &

unmap() {
    rbd-nbd unmap
}

# wait for /dev/nbd0 to show up (lsblk exits 32 while the device is absent)
while true; do
    lsblk --noempty /dev/nbd0
    r=$?
    [ $r -eq 32 ] && continue
    [ $r -eq 0 ] && break
done

dd if=/dev/random of=/dev/nbd0 bs=4096 count=1 oflag=sync

What I did was to ctrl+c the process right after I started it. Maybe adding
the following just before the dd would be enough. Sadly I have to reboot the
whole VM afterwards :)

deadlock() {
    sleep 0.1
    exit 1
}
deadlock &

On Wed, Dec 21, 2022 at 10:22 PM Sam Perman <sam@xxxxxxxx> wrote:
>
> Thanks, I'll take a look at that. For reference, the deadlock we are
> seeing looks similar to the one described at the bottom of this issue:
> https://tracker.ceph.com/issues/52088
>
> thanks
> sam
>
> On Wed, Dec 21, 2022 at 4:04 PM Josef Johansson <josef86@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> I made some progress with my testing on a similar issue. Maybe the test
>> will be easy to adapt to your case.
>>
>> https://tracker.ceph.com/issues/57396
>>
>> What I can say, though, is that I don't see the deadlock problem in my
>> testing.
>>
>> Cheers
>> -Josef
>>
>> On Wed, 21 Dec 2022 at 22:00, Sam Perman <sam@xxxxxxxx> wrote:
>>>
>>> Hello!
>>>
>>> I'm trying to chase down a deadlock we occasionally see on the client
>>> side when using rbd-nbd and have a question about a lingering process
>>> we are seeing.
>>>
>>> I have a simple test script that will execute the following in order:
>>>
>>> * use rbd to create a new image
>>> * use rbd-nbd to map the image locally
>>> * mkfs a file system
>>> * mount the image locally
>>> * use dd to write some dummy data
>>> * unmount the device
>>> * use rbd-nbd to unmap the image
>>> * use rbd to remove the image
>>>
>>> After this is all done, there is a lingering process that I'm curious
>>> about.
>>>
>>> The process is called "[kworker/u9:0-knbd0-recv]" (in state "I") and is
>>> a child of "[kthreadd]" (in state "S").
>>>
>>> Is this normal? I don't see any specific problems with it, but I'm
>>> eventually going to ramp up this test to use a lot of concurrency to see
>>> if I can reproduce the deadlock we are seeing, and want to make sure I'm
>>> starting clean.
>>>
>>> Thanks for any insight you have!
>>> sam
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
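
For reference, a minimal sketch of the test sequence Sam describes above.
The pool and image names, mount point, filesystem and dd parameters are
assumptions for illustration only, not taken from the thread:

#!/bin/bash
# Minimal sketch only; run as root. Pool/image names, mount point,
# filesystem and dd parameters are assumed values.
set -e

pool=rbd
img=nbd-test
mnt=/mnt/nbd-test

rbd create "$pool/$img" --size 1024     # create a new 1 GiB image
dev=$(rbd-nbd map "$pool/$img")         # map it locally; prints e.g. /dev/nbd0
mkfs.ext4 "$dev"                        # put a filesystem on the device
mkdir -p "$mnt"
mount "$dev" "$mnt"                     # mount it locally
dd if=/dev/zero of="$mnt/dummy" bs=1M count=16 oflag=sync   # write dummy data
umount "$mnt"                           # unmount the device
rbd-nbd unmap "$dev"                    # unmap the image
rbd rm "$pool/$img"                     # remove the image

The idea, following Sam's description, would be to loop this or run several
copies concurrently to try to reproduce the deadlock.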