Trent Lloyd writes:

> Jens-Christian Fischer <jens-christian.fischer@...> writes:
>>
>> I think we (i.e. Christian) found the problem:
>> We created a test VM with 9 mounted RBD volumes (no NFS server). As soon
>> as he hit all disks, we started to experience these 120 second timeouts.
>> We realized that the QEMU process on the hypervisor is opening a TCP
>> connection to every OSD for every mounted volume - exceeding the 1024 FD
>> limit.
>>
>> So no deep scrubbing etc, but simply too many connections…
>
> Have seen mention of similar from CERN in their presentations, found this
> post on a quick Google.. might help?
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-December/026187.html

Yes, that's exactly the problem we had. We solved it by setting max_files
to 8191 in /etc/libvirt/qemu.conf on all compute hosts. Once that was
applied, we live-migrated the running instances so that they picked up the
increased limit.

--
Simon.
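
For reference, the fix looks roughly like the sketch below. The
max_files = 8191 value and the live-migration step come from the thread;
the restart and verification commands are assumptions (systemd-based
hosts) and may differ per distro. Since each mounted RBD image can hold a
socket to every OSD, a guest's FD usage grows roughly as volumes x OSDs,
which is why nine volumes blew past the default 1024 limit.

    # /etc/libvirt/qemu.conf on each compute host -- raise the per-guest
    # open-file limit for qemu processes (value from the thread):
    max_files = 8191

    # Restart libvirtd so newly started guests get the limit
    # (assumption: systemd-based host):
    systemctl restart libvirtd

    # Verify for a given qemu process (<qemu-pid> is a placeholder):
    grep 'open files' /proc/<qemu-pid>/limits   # effective limit
    ls /proc/<qemu-pid>/fd | wc -l              # current FD count

A running qemu process inherits its file-descriptor limit when it starts,
so existing guests keep the old 1024 cap until they are live-migrated:
migration spawns a fresh qemu process on the destination host, which
starts under the new limit.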