On 07/19/2019 02:42 AM, Marc Schöchlin wrote:
> We have ~500 heavy load rbd-nbd devices in our xen cluster (rbd-nbd 12.2.5, kernel 4.4.0+10, centos clone) and ~20 high load krbd devices (kernel 4.15.0-45, ubuntu 16.04) - we never experienced problems like this.

For this setup, do you have 257 or more rbd-nbd devices running on a single system?

If so, then you are hitting another bug where newer kernels only support 256 devices. It looks like a regression was introduced when mq and netlink support was added upstream. You can create more than 256 devices, but some devices will not be able to execute any IO. Commands sent to an affected rbd-nbd device will always time out, and you will see the errors in your log.

I am testing some patches for that right now.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
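
To check whether a host is past the 256-device mark mentioned in the reply, a rough sketch along these lines might help. It is not part of rbd-nbd or Ceph. It assumes the kernel nbd driver exposes a `pid` attribute under `/sys/block/nbdN/` only while a device is connected, and it counts every connected nbd device, not only those mapped by rbd-nbd. The function names and the `LIMIT` constant are made up for illustration.

```python
import os
import re

# Matches whole-device names such as "nbd0" or "nbd300", not partitions like "nbd0p1".
NBD_NAME = re.compile(r"^nbd(\d+)$")

# Device count quoted in the reply; an assumption, not a value read from the kernel.
LIMIT = 256


def connected_nbd_devices(sys_block="/sys/block"):
    """Return the sorted indices of nbd devices that look connected."""
    indices = []
    for name in os.listdir(sys_block):
        match = NBD_NAME.match(name)
        # Assumption: the "pid" attribute is present only while a client is attached.
        if match and os.path.exists(os.path.join(sys_block, name, "pid")):
            indices.append(int(match.group(1)))
    return sorted(indices)


def over_limit(indices, limit=LIMIT):
    """True when more devices are connected than the assumed limit."""
    return len(indices) > limit


if __name__ == "__main__":
    devices = connected_nbd_devices()
    print(f"{len(devices)} connected nbd devices")
    if over_limit(devices):
        print(f"more than {LIMIT} devices: some may be unable to execute IO")
```

Run it on the hypervisor itself. If the count is above 256 on a newer kernel, the timeouts in the log are consistent with the bug described above. `rbd-nbd list-mapped` should give a comparable view restricted to rbd-nbd mappings.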