Hello,
We see large omap objects warnings on the RGW bucket index pool.
The OMAP keys of these objects all belong to one identified big bucket.
Context:
=========
We use S3 storage for an application holding ~1.5 M objects.
The production cluster is "replicated" to another, distant cluster with
rclone cron jobs.
For the moment we have only one big bucket (23 shards); we are working
on a multi-bucket solution, but that is not the issue here.
One other important detail: the bucket is versioned. We don't really
have versions or delete markers, due to the way the application works.
Versioning is mainly a recovery mechanism, as we don't have backups
given the expected storage volume. Versioning + replication should
cover most of the restoration use cases.
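For context, here is a sketch of how the shard layout and fill levels can be inspected with stock radosgw-admin commands ("mybucket" below is a placeholder for the real bucket name):

```shell
# Per-shard object counts and fill status for every bucket;
# flags buckets that are resharding candidates.
radosgw-admin bucket limit check

# Per-bucket stats: object count, size, index shard count.
radosgw-admin bucket stats --bucket=mybucket
```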
First, we don't have large omap objects in the production cluster, only
on the replicated / backup one.
Differences between the two clusters:
- production is a 5-node cluster with SSDs for RocksDB+WAL, 2 TB SCSI
10k drives in RAID0 with battery-backed cache.
- backup is a 13-node cluster without SSDs, only 8 TB HDDs behind a
direct HBA.
Both clusters use erasure coding for the RGW bucket data pool (3+2 on
the production one, 8+2 on the backup one).
First observations:
===================
Both clusters have the same number of S3 objects in the main bucket.
However, there are ~10x more objects in the RGW bucket index pool of
the backup cluster than in the production cluster, and about 4x more
OMAP keys per index object in the backup cluster.
Example, with rados ls:
- 311 objects in defaults.rgw.buckets.index (prod cluster)
- 3157 objects in MRS4.rgw.buckets.index (backup cluster)
In the backup cluster, we have 22 objects with more than 200000 OMAP
keys, which is why we get the warning.
Searching in the production cluster, I see at most around 60000 OMAP
keys per object.
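For reference, a sketch of how these numbers can be reproduced, assuming a recent Ceph release (the pool name is taken from the backup cluster above):

```shell
# Which pool/PGs/objects triggered the LARGE_OMAP_OBJECTS warning,
# plus the threshold it fires at (200000 keys by default on
# recent releases; older ones expose it via the daemon socket).
ceph health detail
ceph config get osd osd_deep_scrub_large_omap_object_key_threshold

# Rank index objects by OMAP key count, largest first.
# Slow on big pools: one listomapkeys call per object.
pool=MRS4.rgw.buckets.index
for obj in $(rados -p "$pool" ls); do
  printf '%s %s\n' "$(rados -p "$pool" listomapkeys "$obj" | wc -l)" "$obj"
done | sort -rn | head -20
```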
Root cause?
============
It seems we have too many OMAP keys, and even too many objects, in the
index pool of our backup cluster. But why? And how do we remove the
orphans?
I've already tried:
- radosgw-admin bucket check --fix --check-objects (still running)
- rgw-orphan-list (but it was interrupted last night after 5 hours)
As I understand it, the latter tool does the reverse of what I need:
it lists data objects that no index entry points to?
The radosgw-admin bucket check will perhaps rebuild index entries, but
will it remove unused ones?
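One known source of leftover index objects is stale bucket instances left behind by resharding. If the backup cluster runs Nautilus or later, these can be listed and cleaned up; a sketch (note the rm subcommand must not be run on multisite deployments, which shouldn't apply to an rclone-based copy):

```shell
# List bucket instance entries no longer referenced by any bucket.
radosgw-admin reshard stale-instances list

# Remove them (do NOT run this on a multisite deployment).
radosgw-admin reshard stale-instances rm
```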
Workaround?
============
How can I get rid of the unused index objects and OMAP keys?
Of course, I can add more shards, but I think it would be better to
solve the root cause if I can.
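For completeness, manual resharding would look like the sketch below ("mybucket" and the shard count are placeholders; the usual guideline is to stay well under 100000 objects per shard, the default warning ratio):

```shell
# Reshard the bucket index to 53 shards (placeholder values;
# prime-ish counts spread keys more evenly across shards).
radosgw-admin bucket reshard --bucket=mybucket --num-shards=53
```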
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx