Re: ulimit max user processes (-u) and non-root ceph clients

On Mon, Dec 16, 2013 at 8:36 PM, Dan Van Der Ster
<daniel.vanderster@xxxxxxx> wrote:
>
> On Dec 16, 2013 8:26 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> On Mon, Dec 16, 2013 at 11:08 AM, Dan van der Ster
>> <daniel.vanderster@xxxxxxx> wrote:
>> > Hi,
>> >
>> > Sorry to revive this old thread, but I wanted to update you on the current
>> > pains we're going through related to clients' nproc (and now nofile)
>> > ulimits. When I started this thread we were using RBD for Glance images
>> > only, but now we're trying to enable RBD-backed Cinder volumes and are not
>> > really succeeding at the moment :(
>> >
>> > As we had guessed from our earlier experience, librbd and therefore qemu-kvm
>> > need increased nproc/nofile limits; otherwise VMs will freeze. In fact we
>> > just observed a lockup of a test VM due to the RBD device blocking
>> > completely (this appears as blocked flush processes in the VM); we're
>> > actually not sure which of the nproc/nofile limits caused the freeze, but it
>> > was surely one of those.
>> >
>> > And the main problem we face now is that it isn't trivial to increase the
>> > limits of qemu-kvm on a running OpenStack hypervisor -- the values are set
>> > by libvirtd and seem to require a restart of all guest VMs on a host to
>> > reload a qemu config file. I'll update this thread when we find the solution
>> > to that...
>>
>> Is there some reason you can't just set it ridiculously high to start with?
>>
>
> As I mentioned, we haven't yet found a way to change the limits without affecting (stopping) the existing running (important) VMs. We thought that /etc/security/limits.conf would do the trick, but alas limits there have no effect on qemu.
>

Well, today we solved it: edit /etc/libvirt/qemu.conf to set
max_files=32768 and max_processes=32768, then restart libvirtd. This
doesn't affect the running VMs, but it lets new VMs (and manually
rebooted ones) pick up the raised limits and thus work properly with
RBD.
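
For anyone hitting the same problem, the change boils down to the
following (a sketch of what we did; 32768 is simply the value we
picked, and the restart command will differ on systemd-based hosts):

    # /etc/libvirt/qemu.conf -- limits libvirtd applies to each
    # qemu process it spawns
    max_processes = 32768
    max_files = 32768

    # restart libvirtd so newly started guests get the new limits;
    # already-running guests are untouched until stopped/started
    service libvirtd restart        # or: systemctl restart libvirtd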
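
As for the earlier question of which limit actually bites, the
per-process numbers can be read straight out of /proc (illustrative
commands; substitute the PID of the qemu-kvm process backing the
stuck guest):

    # effective soft/hard limits of one qemu process
    grep -E 'Max (processes|open files)' /proc/<pid>/limits

    # current consumption to compare against those limits
    ls /proc/<pid>/task | wc -l   # threads in this process (RLIMIT_NPROC
                                  # counts all of the user's threads, so
                                  # this is only a lower bound)
    ls /proc/<pid>/fd | wc -l     # open fds (counts against nofile)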

Best Regards, Dan


> Cheers, Dan
>
>> > Moving forward, IMHO it would be much better if Ceph clients could
>> > gracefully work with large clusters without _requiring_ changes to the
>> > ulimits. I understand that such poorly configured clients would necessarily
>> > have decreased performance (since librados would need to use a thread pool
>> > and also lose some of the persistent client-OSD connections). But client
>> > lockups are IMHO worse than slightly lower performance.
>> >
>> > Have you guys discussed the client ulimit issues recently and is there a
>> > plan in the works?
>>
>> I'm afraid not. It's a plannable but non-trivial amount of work and
>> the Inktank dev team is pretty well booked for a while. Anybody
>> running into this as a serious bottleneck should
>> 1) try and start a community effort
>> 2) try and promote it as a priority with any Inktank business contacts
>> they have.
>> (You are only the second group to report it as an ongoing concern
>> rather than a one-off hiccup, and honestly it sounds like you're just
>> hitting the arbitrary limits, not running into real resource
>> exhaustion.)
>> :)
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



