On Wed, Dec 2, 2015 at 11:09 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> We've had increased user activity on our radosgw boxes the past two
> days and are finding that the radosgw is growing quickly in used
> memory. Most of our gateways are VMs with 4GB of memory and these are
> getting OOM-killed after ~30 mins of high user load. We added a few
> physical gateways with 64GB of ram and overnight those have grown from
> zero to more than 8GB, and are still growing.
>
> I'm not a valgrind expert, but I've been running one of the daemons like this:
>
> valgrind --leak-check=full /bin/radosgw -n client.radosgw.cephrgw -f
>
> but it's not reporting any leaks, even though the memory usage is
> climbing for that process.
>
> Anyone seen something similar? Any tips for tracking this down? My
> next (random) step will be to disable the rgw_cache and see if that
> helps.

Neither changing the LRU cache size nor disabling the rgw_cache
completely seems to make a difference. We're now checking whether the
keystone s3 integration feature could be to blame -- we just enabled
that a couple of days ago and it seems to correlate.

-- dan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
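
For anyone trying to reproduce the growth described above, here is a minimal sketch for logging the resident memory of a radosgw process over time, so that changes (toggling the rgw_cache, the keystone s3 integration) can be correlated with the growth rate. It assumes a Linux host with /proc available. The helper names (parse_vmrss_kb, read_rss_kb, growth_kb_per_min, watch) are hypothetical and not part of Ceph, and the sampling interval is an arbitrary choice.

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.

    Returns None if no VmRSS line is present (e.g. for kernel threads).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process, in kB (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())


def growth_kb_per_min(samples):
    """Average growth between the first and last sample.

    samples is a list of (timestamp_in_seconds, rss_kb) tuples.
    """
    if len(samples) < 2:
        return 0.0
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (r1 - r0) / (t1 - t0) * 60.0


def watch(pid, interval=60, count=30):
    """Sample the RSS of pid every `interval` seconds, `count` times."""
    samples = []
    for _ in range(count):
        rss = read_rss_kb(pid)
        if rss is None:
            break
        samples.append((time.time(), rss))
        print("%s rss=%d kB avg_growth=%.1f kB/min" % (
            time.strftime("%H:%M:%S"), rss, growth_kb_per_min(samples)))
        time.sleep(interval)
    return samples
```

Calling watch() with the PID of the running radosgw prints one line per sample. Comparing the average growth before and after a configuration change gives a rough indication of whether that change matters.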