On Dec 16, 2013 8:26 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>
> On Mon, Dec 16, 2013 at 11:08 AM, Dan van der Ster
> <daniel.vanderster@xxxxxxx> wrote:
> > Hi,
> >
> > Sorry to revive this old thread, but I wanted to update you on the current
> > pains we're going through related to clients' nproc (and now nofile)
> > ulimits. When I started this thread we were using RBD for Glance images
> > only, but now we're trying to enable RBD-backed Cinder volumes and are not
> > really succeeding at the moment :(
> >
> > As we had guessed from our earlier experience, librbd and therefore qemu-kvm
> > need increased nproc/nofile limits otherwise VMs will freeze. In fact we
> > just observed a lockup of a test VM due to the RBD device blocking
> > completely (this appears as blocked flush processes in the VM); we're
> > actually not sure which of the nproc/nofile limits caused the freeze, but it
> > was surely one of those.
> >
> > And the main problem we face now is that it isn't trivial to increase the
> > limits of qemu-kvm on a running OpenStack hypervisor -- the values are set
> > by libvirtd and seem to require a restart of all guest VMs on a host to
> > reload a qemu config file. I'll update this thread when we find the solution
> > to that...
>
> Is there some reason you can't just set it ridiculously high to start with?
>

As I mentioned, we haven't yet found a way to change the limits without
affecting (stopping) the existing running (important) VMs. We thought that
/etc/security/limits.conf would do the trick, but alas limits there have no
effect on qemu.

Cheers, Dan

> > Moving forward, IMHO it would be much better if Ceph clients could
> > gracefully work with large clusters without _requiring_ changes to the
> > ulimits. I understand that such poorly configured clients would necessarily
> > have decreased performance (since librados would need to use a thread pool
> > and also lose some of the persistent client-OSD connections). But client
> > lockups are IMHO worse than slightly lower performance.
> >
> > Have you guys discussed the client ulimit issues recently and is there a
> > plan in the works?
>
> I'm afraid not. It's a plannable but non-trivial amount of work and
> the Inktank dev team is pretty well booked for a while. Anybody
> running into this as a serious bottleneck should
> 1) try and start a community effort
> 2) try and promote it as a priority with any Inktank business contacts
> they have.
> (You are only the second group to report it as an ongoing concern
> rather than a one-off hiccup, and honestly it sounds like you're just
> having issues with hitting the arbitrary limits, not with real
> resource exhaustion issues.)
> :)
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
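
A note on checking which nproc/nofile limits a running qemu-kvm process actually has: on Linux the kernel exposes the effective per-process values in /proc/<pid>/limits, so they can be read without touching the guest. The Python sketch below only inspects limits and does not change them. The function names are made up for illustration, and the parsing assumes the usual column layout of that file (columns separated by two or more spaces).

```python
import re


def parse_proc_limits(text):
    """Parse the text of a Linux /proc/<pid>/limits file.

    Returns {limit name: (soft, hard)}. Values stay as strings because
    a limit may be the literal word "unlimited".
    """
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Limit names contain single spaces ("Max open files"), while
        # columns are padded apart, so split on runs of 2+ spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits


def read_limits(pid):
    """Read and parse the limits of a running process (hypothetical helper)."""
    with open("/proc/{}/limits".format(pid)) as f:
        return parse_proc_limits(f.read())
```

For example, read_limits(pid)["Max open files"] and read_limits(pid)["Max processes"] give the (soft, hard) pairs for nofile and nproc, where pid is the process ID of the qemu-kvm instance in question. Comparing those values before and after editing /etc/security/limits.conf shows whether a change reached the process.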