On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster <daniel.vanderster@xxxxxxx> wrote:
> Hi,
> We just finished debugging a problem with RBD-backed Glance image creation failures, and thought our workaround would be useful for others. Basically, we found that during an image upload, librbd on the Glance API server was consuming a great many processes, eventually hitting the 1024 nproc limit for non-root users in RHEL. The failure occurred when uploading to pools with 2048 PGs, but not when uploading to pools with 512 PGs (we're guessing that librbd opens one thread per accessed PG and doesn't close those threads until the whole process completes).
>
> If you hit this same problem (and you run RHEL like us), you'll need to modify at least /etc/security/limits.d/90-nproc.conf (adding your non-root user that should be allowed more than 1024 procs), and then also possibly run ulimit -u in the init script of your client process. Ubuntu should have some similar limits.

Did your pools with 2048 PGs have a significantly larger number of OSDs in them? Or are both pools on a cluster with a lot of OSDs relative to the PG counts?

The PG count shouldn't matter for this directly, but RBD (and other clients) will create a couple of messenger threads for each OSD it talks to, and while they'll eventually shut down on idle, it doesn't proactively close them. I'd expect this to be a problem around 500 OSDs.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
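
To make the "around 500 OSDs" estimate concrete, here is a rough back-of-the-envelope sketch in Python. It assumes "a couple" means exactly 2 messenger threads per OSD and that the client ends up talking to every OSD; both are assumptions for illustration, not measured librbd behaviour. The function names are hypothetical helpers, not part of any Ceph API.

```python
# Assumption: "a couple" of messenger threads per OSD is taken as 2.
THREADS_PER_OSD = 2


def estimated_messenger_threads(num_osds, threads_per_osd=THREADS_PER_OSD):
    """Rough upper bound on messenger threads if the client talks to every OSD."""
    return num_osds * threads_per_osd


def nproc_headroom(num_osds, soft_limit, baseline_threads=0):
    """Threads left under the nproc limit; a negative value means the limit is exceeded."""
    return soft_limit - baseline_threads - estimated_messenger_threads(num_osds)


def current_nproc_soft_limit():
    """Soft RLIMIT_NPROC for the current user (Unix only)."""
    import resource  # imported lazily because the module is not available on Windows

    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return soft  # equals resource.RLIM_INFINITY when unlimited


if __name__ == "__main__":
    # 1024 is the default non-root nproc limit mentioned in the original report.
    for osds in (100, 250, 500, 600):
        threads = estimated_messenger_threads(osds)
        print(osds, "OSDs:", threads, "threads, headroom", nproc_headroom(osds, 1024))
```

With these assumptions, 500 OSDs already account for about 1000 threads, which leaves almost nothing under a 1024 limit once the rest of the process and the user's other processes are counted. Run as the service user, current_nproc_soft_limit() shows which limit actually applies.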