Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

Florian Haas <florian@xxxxxxxxxxx> · Thu, 17 Dec 2015 18:16:31 +0100

Hey everyone,

I recently got my hands on a cluster that has been underperforming in
terms of radosgw throughput, averaging about 60 PUTs/s with 70K
objects where a freshly-installed cluster with near-identical
configuration would do about 250 PUTs/s. (Neither of these values is
what I'd consider high throughput, but this is just to give you a feel
for the relative performance hit.)

Some digging turned up that of the less than 200 buckets in the
cluster, about 40 held in excess of a million objects (1-4M), with
one bucket being an outlier with 45M objects. All buckets were created
post-Hammer, and use 64 index shards. The total number of objects in
radosgw is approx. 160M.

Now this isn't a large cluster in terms of OSD distribution; there are
only 12 OSDs (after all, we're only talking double-digit terabytes
here). In almost all of these OSDs, the LevelDB omap directory has
grown to a size of 10-20 GB.
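
For anyone wanting to compare numbers on their own cluster, here is a
minimal sketch of how the per-OSD omap sizes could be collected. It
assumes FileStore OSDs mounted under /var/lib/ceph/osd/<name>/current/omap
(the usual default, but an assumption here; adjust osd_root if your
layout differs). The function names are made up for illustration, and
the result is the apparent file size, not the on-disk block usage.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # LevelDB may delete a file between listing and stat; skip it.
                pass
    return total


def omap_sizes(osd_root="/var/lib/ceph/osd"):
    """Map each OSD directory name to the size of its current/omap dir.

    The <osd_root>/<name>/current/omap layout is an assumption about
    the deployment, not something this script verifies.
    """
    sizes = {}
    if not os.path.isdir(osd_root):
        return sizes
    for entry in sorted(os.listdir(osd_root)):
        omap = os.path.join(osd_root, entry, "current", "omap")
        if os.path.isdir(omap):
            sizes[entry] = dir_size_bytes(omap)
    return sizes


if __name__ == "__main__":
    for osd, size in omap_sizes().items():
        print(f"{osd}: {size / 2**30:.1f} GiB")
```

Run on each OSD host, this prints one line per OSD, which makes it
easy to spot whether the growth is uniform or concentrated on a few
OSDs.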

So I have several questions on this:

- Is it correct to assume that such a large LevelDB would be quite
detrimental to radosgw performance overall?

- If so, would clearing that one large bucket and distributing the
data over several new buckets reduce the LevelDB size at all?

- Is there even something akin to "ceph mon compact" for OSDs?

- Are these large LevelDB databases a simple consequence of having a
combination of many radosgw objects and few OSDs, with the
distribution per-bucket being comparatively irrelevant?

I do understand that the 45M object bucket itself would have been a
problem pre-Hammer, with no index sharding available. But with what
others have shared here, a rule of thumb of one index shard per
million objects should be a good one to follow, so 64 shards for 45M
objects doesn't strike me as totally off the mark. That's why I think
LevelDB I/O is actually the issue here. But I might be totally wrong;
all insights appreciated. :)
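
To make the shard arithmetic above explicit, here is a small sketch
using the figures from this post. The one-million-objects-per-shard
target is the list's rule of thumb, not a hard limit, and the function
names are made up for illustration.

```python
import math


def objects_per_shard(num_objects, num_shards):
    """Average number of index entries per shard, assuming even hashing."""
    return num_objects / num_shards


def shards_for(num_objects, per_shard_target=1_000_000):
    """Shard count suggested by the rule of thumb (at least one shard)."""
    return max(1, math.ceil(num_objects / per_shard_target))


if __name__ == "__main__":
    # The 45M-object outlier bucket with its 64 index shards.
    print(objects_per_shard(45_000_000, 64))  # 703125.0
    # Shards the rule of thumb would ask for with 45M objects.
    print(shards_for(45_000_000))  # 45
```

So the outlier bucket averages roughly 700K entries per shard, and 64
shards is above the 45 that the rule of thumb would suggest.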

Cheers,
Florian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com