Re: Jewel Multisite RGW Memory Issues

I have two distinct clusters configured in two different locations,
with a single zonegroup.

Cluster 1 currently has ~11TB of data on it: S3/Swift backups
written via the duplicity backup tool. Each file is 25MB, and
probably 20% are multipart uploads from S3 (so 4MB stripes), for a
total of 3217 kobjects. This cluster has been running for months
(without RGW replication) with no issues. Each site has one RGW
instance at the moment.

I recently set up the second cluster on identical hardware in a
secondary site. I configured a multi-site setup, with both of these
sites in an active-active configuration. The second cluster has no
active data set, so I would expect site 1 to start mirroring to site 2
- and it does.
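
For reference, the setup is essentially the standard Jewel multisite
flow, sketched below as a small script for clarity; the
realm/zonegroup/zone names, endpoints and system-user keys are
placeholders rather than my real values:

#!/usr/bin/env python3
"""Rough sketch of the multisite setup sequence (placeholder names and
keys, not the real config); each call just shells out to radosgw-admin."""
import subprocess

def admin(*args):
    # Thin wrapper around radosgw-admin; raises if a command fails.
    subprocess.run(["radosgw-admin"] + list(args), check=True)

# On the first (master) site, assuming a --system user already exists
# whose keys are SYSTEM_ACCESS_KEY / SYSTEM_SECRET_KEY:
admin("realm", "create", "--rgw-realm=backups", "--default")
admin("zonegroup", "create", "--rgw-zonegroup=eu",
      "--endpoints=http://rgw-site1:80", "--master", "--default")
admin("zone", "create", "--rgw-zonegroup=eu", "--rgw-zone=site1",
      "--endpoints=http://rgw-site1:80", "--master", "--default",
      "--access-key=SYSTEM_ACCESS_KEY", "--secret=SYSTEM_SECRET_KEY")
admin("period", "update", "--commit")

# On the second site: pull the realm and period, add its own zone,
# commit, then restart the radosgw instances on both sides.
admin("realm", "pull", "--url=http://rgw-site1:80",
      "--access-key=SYSTEM_ACCESS_KEY", "--secret=SYSTEM_SECRET_KEY")
admin("period", "pull", "--url=http://rgw-site1:80",
      "--access-key=SYSTEM_ACCESS_KEY", "--secret=SYSTEM_SECRET_KEY")
admin("zone", "create", "--rgw-zonegroup=eu", "--rgw-zone=site2",
      "--endpoints=http://rgw-site2:80",
      "--access-key=SYSTEM_ACCESS_KEY", "--secret=SYSTEM_SECRET_KEY")
admin("period", "update", "--commit")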

Unfortunately, as soon as RGW syncing starts to run, the resident
memory usage of the radosgw instances on both clusters balloons
massively until the process is OOM-killed. This isn't a slow leak:
in testing I've found that the radosgw processes on either side can
consume up to 300MB of extra RSS per *second*, completely OOMing a
machine with 96GB of RAM in approximately 20 minutes.
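
The growth rate is easy to see just by polling VmRSS from /proc; the
sort of thing I mean (generic Linux, nothing radosgw-specific beyond
matching on the process name) is:

#!/usr/bin/env python3
# Prints the combined RSS of all radosgw processes once a second,
# plus the change since the previous sample, by reading /proc.
import glob, re, time

def radosgw_rss_kb():
    total = 0
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as f:
                text = f.read()
        except IOError:
            continue  # the process exited between glob() and open()
        if re.search(r"^Name:\s+radosgw$", text, re.M):
            m = re.search(r"^VmRSS:\s+(\d+) kB", text, re.M)
            if m:
                total += int(m.group(1))
    return total

prev = radosgw_rss_kb()
while True:
    time.sleep(1)
    cur = radosgw_rss_kb()
    print("radosgw RSS: %6d MB  (%+d MB/s)" % (cur // 1024, (cur - prev) // 1024))
    prev = cur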

If I stop the radosgw processes on one cluster (i.e. breaking
replication) then the memory usage of the radosgw processes on the
other cluster stays at around 100-500MB and does not really increase
over time.

Obviously this makes multi-site replication completely unusable, so
I'm wondering if anyone has a fix or workaround. I noticed that some
pull requests for RGW memory leak fixes have been merged into the
master branch, so I switched to v10.2.0-2453-g94fac96 from the
autobuild packages; this seems to slow the memory increase slightly,
but not enough to make replication usable yet.

I've tried valgrinding the radosgw process, but it doesn't come up
with anything obviously leaking (I could be doing it wrong). An
example of the memory ballooning as captured by collectd is here:
http://i.imgur.com/jePYnwz.png - this memory usage is *all* radosgw
process RSS.
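
If anyone wants to check whether the growth shows up as anonymous
heap or as file-backed mappings, summing the per-mapping Rss lines
from /proc/<pid>/smaps is enough to show it; a rough sketch, with the
radosgw pid passed on the command line:

#!/usr/bin/env python3
# Sums RSS per mapping from /proc/<pid>/smaps and prints the largest
# contributors, to show whether growth is anonymous heap or mapped files.
# Usage: ./rss_breakdown.py <radosgw-pid>
import sys
from collections import defaultdict

pid = sys.argv[1]
totals = defaultdict(int)   # mapping name -> RSS in kB
name = "[anon]"

with open("/proc/%s/smaps" % pid) as f:
    for line in f:
        if line[0] in "0123456789abcdef":
            # Header line of a mapping: "addr-range perms offset dev inode [path]"
            fields = line.split()
            name = fields[5] if len(fields) > 5 else "[anon]"
        elif line.startswith("Rss:"):
            totals[name] += int(line.split()[1])

for name, kb in sorted(totals.items(), key=lambda kv: -kv[1])[:15]:
    print("%10d kB  %s" % (kb, name))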

Anyone else seen this?

Do you know if the memory usage is high only during load from clients and is steady otherwise?
What was the nature of the workload at the time of the sync operation?

Thanks,
Pritha