Hi all,

I'm not getting very far with this query internally (RH), so I'm hoping someone familiar with the code can spare me the C++ pain...

We've hit soft thread-count ulimits a couple of times now with different Ceph clusters. The clients (Qemu/KVM guests on both Ubuntu and RHEL hosts) hit the limit thanks to the many socket fds they hold open to the Ceph cluster, and then experience weird (at least the first time around) and hard-to-debug issues (nothing in the qemu or libvirt logs). The primary symptom is an apparent IO hang in the guest with no well-defined trigger: the Ceph volumes work fine initially, but at some point the process crosses the ulimit and no further guest IOs make progress (iostat shows the devices at 100% util but zero IOPS).

qemu.conf has a max_files setting for tuning the relevant system-default ulimit for guests, but we have no idea what it actually needs to be, so for now we've just set it very large.

So: how many threads does librbd need? It seems to scale with the size of the cluster (#OSDs and/or #PGs). In one case the issue only showed up for a user with 10 RBD volumes attached to an OpenStack instance after we added a handful of OSDs to expand the cluster, which pushed their qemu/kvm processes' steady-state fd usage from ~900 to ~1100, past the 1024 default.

--
Cheers,
~Blairo
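
P.S. For anyone who wants to check their own guests, a quick way to compare a qemu process's fd and thread usage against its soft limits is to read /proc directly. A minimal sketch (Python, stdlib only, Linux-specific; pass the qemu-kvm pid on the command line, and note you'll generally need root to read another process's /proc entries):

import sys
from pathlib import Path

def count_entries(path):
    # Number of entries under e.g. /proc/<pid>/fd or /proc/<pid>/task.
    return sum(1 for _ in Path(path).iterdir())

def soft_limit(pid, name):
    # Pull the soft limit from the matching row of /proc/<pid>/limits,
    # e.g. "Max open files" or "Max processes". Returned as a string
    # because it can be "unlimited".
    for line in Path(f"/proc/{pid}/limits").read_text().splitlines():
        if line.startswith(name):
            return line[len(name):].split()[0]
    return "unknown"

if __name__ == "__main__":
    pid = sys.argv[1]  # pid of the qemu-kvm process for the guest
    print("open fds :", count_entries(f"/proc/{pid}/fd"),
          "(soft limit:", soft_limit(pid, "Max open files") + ")")
    # Note: "Max processes" (RLIMIT_NPROC) is a per-user limit, so this
    # process's thread count is only part of what counts against it.
    print("threads  :", count_entries(f"/proc/{pid}/task"),
          "(soft limit:", soft_limit(pid, "Max processes") + ")")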
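
P.P.S. For completeness, the back-of-envelope model I'm implicitly using when I say it "seems to scale with the size of the cluster". Every constant in it is a guess (that's the question I'm asking), and the cluster sizes are hypothetical; I'm including it only so someone who knows the messenger code can correct the per-OSD and per-volume terms. Qemu's own fds (disk images, tap devices, vnc, eventfds, ...) come on top of whatever this yields.

# All constants are guesses/knobs, not documented librbd behaviour.
def estimate_client_fds(num_osds, num_mons, num_volumes,
                        sockets_per_osd=1,    # guess: socket(s) per OSD session
                        misc_per_volume=5):   # guess: admin socket, logs, etc.
    # Assumes each attached RBD volume gets its own librados instance and
    # therefore its own full set of mon/OSD sessions -- an assumption I
    # haven't confirmed in the code.
    per_volume = num_mons + num_osds * sockets_per_osd + misc_per_volume
    return num_volumes * per_volume

# Hypothetical example: 3 mons, 72 OSDs, 10 attached volumes.
for sockets in (1, 2):
    print(sockets, "socket(s)/OSD ->",
          estimate_client_fds(num_osds=72, num_mons=3,
                              num_volumes=10, sockets_per_osd=sockets), "fds")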