[RGW] Too many index objects and OMAP keys on them

Hello,

We are seeing large omap objects warnings on the RGW bucket index pool.
The OMAP keys on these index objects all belong to one identified big bucket.
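
For the record, here is how we spotted which index objects are concerned ; a quick sketch, the log path assumes a default installation :

ceph health detail | grep -i omap
# the cluster log records each offending object with its key count
grep 'Large omap object found' /var/log/ceph/ceph.log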

Context :
=========
We use S3 storage for an application, with ~1.5 M objects.

The production cluster is "replicated" to another, distant cluster with rclone cron jobs.
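
For reference, the replication is just an rclone sync driven by cron ; a minimal sketch, where the remote and bucket names are placeholders, not our real ones :

# crontab on the replication host : nightly sync of the bucket to the backup cluster
0 2 * * * rclone sync prod:big-bucket backup:big-bucket --checksum --log-file /var/log/rclone-sync.log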

For the moment we have only one big bucket (23 shards), but we are working on a multi-bucket solution.
That is not the problem here, though.

One other important piece of information : the bucket is versioned. We don't really have versions or delete markers, given the way the application works. Versioning is mainly there for recovery, as we don't have backups, due to the expected storage volume. Versioning + replication should cover most of the restoration use cases.


First, we don't have large omap objects in the production cluster, only on the replicated / backup one.

Differences between the two clusters :
- production is a 5-node cluster with SSDs for rocksdb+wal, 2 TB SCSI 10k drives in RAID0 + battery-backed cache.
- backup is a 13-node cluster without SSDs, only 8 TB HDDs attached directly to the HBA.

Both clusters use Erasure Coding for the RGW buckets data pool (3+2 on the production one, 8+2 on the backup one).
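
Roughly, to check the profiles ; the pool name below assumes the backup data pool is named like its index pool :

ceph osd pool get MRS4.rgw.buckets.data erasure_code_profile
ceph osd erasure-code-profile get <profile name returned above>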

First observed facts :
======================

Both clusters have the same number of S3 objects in the main bucket.
I've seen that there are about 10x more objects in the RGW buckets index pool of the backup cluster than in the production cluster.
On these index objects, there are about 4x more OMAP keys in the backup cluster.

Example :
With rados ls :
- 311 objects in defaults.rgw.buckets.index (prod cluster)
- 3157 objects in MRS4.rgw.buckets.index (backup cluster)

In the backup cluster, we have 22 objects with more than 200000 OMAP keys each, which is why we get the warning. Searching in the production cluster, I see at most around 60000 OMAP keys per object.
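
For reference, this is roughly how I counted them ; a quick sketch with the pool names above (the loop is slow on a big index pool) :

# number of index objects per pool
rados -p defaults.rgw.buckets.index ls | wc -l    # prod
rados -p MRS4.rgw.buckets.index ls | wc -l        # backup

# OMAP key count per index object on the backup cluster, biggest last
for obj in $(rados -p MRS4.rgw.buckets.index ls); do
    echo "$(rados -p MRS4.rgw.buckets.index listomapkeys "$obj" | wc -l) $obj"
done | sort -n | tail -n 25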

Root Cause ?
============

It seems we have too many OMAP keys and even too many objects in the index pool of our backup cluster. But why ? And how can we remove the orphans ?

I've already tried :
- radosgw-admin bucket check --fix --check-objects (still running)
- rgw-orphan-list (but it was interrupted last night after 5 hours)

As I understand it, the second tool does the reverse of what I need : it lists data objects that no index entry points to, right ? The radosgw-admin bucket check will perhaps rebuild the indexes, but will it also remove unused ones ?
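
For completeness, the exact invocations I used ; the bucket name is a placeholder and I assume the backup data pool is named like its index pool :

# rebuild / fix the bucket index on the backup cluster
radosgw-admin bucket check --bucket=<big-bucket-name> --check-objects --fix

# list RADOS objects in the data pool that no bucket index entry points to
rgw-orphan-list MRS4.rgw.buckets.data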

Workaround ?
============

How can I get rid of the unused index objects and OMAP keys ?
Of course, I can reshard to more shards, but I think it would be better to solve the root cause if I can.
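
If I end up resharding anyway, this is what I have in mind ; a sketch with placeholder values :

# show objects per shard and fill status for all buckets
radosgw-admin bucket limit check

# manual reshard of the big bucket to more index shards (e.g. 101)
radosgw-admin bucket reshard --bucket=<big-bucket-name> --num-shards=101
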
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


