RBD Unmap busy while no "normal" process holds it.


 



Hello,

I've got a strange issue with Ceph (deployed with cephadm).

We use Incus (the LXD fork) and we regularly add and remove containers.

We moved to Ceph for better scalability, but sometimes we hit this bug:

https://discuss.linuxcontainers.org/t/howto-delete-container-with-ceph-rbd-volume-giving-device-or-resource-busy/5910

I also revived this old report on the Incus forum:
https://discuss.linuxcontainers.org/t/incus-0-x-and-ceph-rbd-map-is-sometimes-busy/19585/6
(thanks to Stephane Graber for his help!)

After working on it for several hours, I found the following.

I run a loop that executes this script (create an image / map it / format it /
mount it / write to it / unmount / unmap / delete the image):
# Create a test image, map it, format it, mount it, write to it, then tear everything down.
rbd create image1 --size 1024 --pool customers-clouds.ix-mrs2.fr.eho || exit $?
RBD_DEVICE=$(rbd map customers-clouds.ix-mrs2.fr.eho/image1) || exit $?
mkfs.ext4 ${RBD_DEVICE} || exit $?
mount ${RBD_DEVICE} /media/test || exit $?
dd if=/dev/zero of=/media/test/test.out   # fills the filesystem; the "no space left" error is expected
sleep 10
rm /media/test/test.out || exit $?
umount ${RBD_DEVICE} || exit $?
rbd unmap ${RBD_DEVICE} || exit $?
rbd rm customers-clouds.ix-mrs2.fr.eho/image1 || exit $?
sleep 1
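
The outer loop around it is nothing special. A minimal sketch of what I run
(with "rbd-cycle.sh" standing in for the script above; the file name is only
for illustration):

#!/bin/bash
# Repeat the create/map/mount/write/unmount/unmap/delete cycle until one step fails.
set -e
i=0
while true; do
    i=$((i + 1))
    echo "=== iteration ${i} ==="
    bash -x ./rbd-cycle.sh   # the script above, traced so each command is printed
done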

It runs for hours without any issue, BUT if I add an OSD while the loop is
running, I get this:

+ sleep 10
+ rm /media/test/test.out
+ umount /dev/rbd0
+ rbd unmap /dev/rbd0
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
+ exit 16

And of course the culprit is always the podman container of the latest OSD
I've added (it still holds a copy of the mount point):
root@ceph01:~# grep rbd0 /proc/*/mountinfo
/proc/1415299/mountinfo:1959 1837 252:0 / /rootfs/media/test rw,relatime - ext4 /dev/rbd0 rw,stripe=16
/proc/1415301/mountinfo:1959 1837 252:0 / /rootfs/media/test rw,relatime - ext4 /dev/rbd0 rw,stripe=16
root@ceph01:~# cat /proc/1415299/cmdline
/run/podman-init -- /usr/bin/ceph-osd -n osd.26 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
root@ceph01:~# cat /proc/1415301/cmdline
/usr/bin/ceph-osd -n osd.26 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
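
For the record, I can clean this up by hand. A sketch only, assuming the two
PIDs above are the ones still holding the stale mount, is to unmount inside
their mount namespace and retry the unmap:

# Drop the stale copy of the mount inside the OSD container's mount namespace
# (both PIDs above likely share the same namespace, so one should be enough),
# then retry the unmap on the host.
nsenter --target 1415301 --mount umount /rootfs/media/test
rbd unmap /dev/rbd0

That is obviously only a workaround, not a fix.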

Our setup is straightforward: the latest stable Ceph release on the latest
stable Debian release, deployed via cephadm.
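
My working theory (an assumption on my part, not something I have confirmed):
when cephadm starts the new OSD container, podman bind-mounts the host root
into the container as /rootfs without shared propagation, so the container
gets its own copy of whatever is mounted under /media/test at that moment,
and that copy survives the host-side umount. This can be checked from the
container's mount namespace, for example (assuming /rootfs is the bind mount
of the host root, as the mountinfo output suggests):

# Show how the bind mount of the host root is propagated inside the OSD container.
nsenter --target 1415301 --mount findmnt -o TARGET,PROPAGATION /rootfs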

Has anyone else run into this problem? Where is the best place to report this bug?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


