Fixing a broken bucket index in RGW

I'm looking for some help in fixing a bucket index on a Luminous (12.2.8) cluster running on FileStore.

First some background on how I believe the bucket index became broken. Last month we had a PG in our .rgw.buckets.index pool become inconsistent:

2018-12-11 09:12:17.743983 osd.1879 osd.1879 10.36.173.147:6820/60041 16 : cluster [ERR] 7.8e : soid 7:717333b6:::.dir.default.1110451812.43.2:head omap_digest 0x59e4f686 != omap_digest 0x37b99ba6 from shard 1879

We then attempted to repair the PG by using 'ceph pg repair 7.8e', but I have a feeling the primary copy must have been corrupt (later that day I learned about 'rados list-inconsistent-obj 7.8e -f json-pretty'). The repair resulted in an unfound object:

2018-12-11 09:32:02.651241 osd.1753 osd.1753 10.32.12.32:6820/3455358 13 : cluster [ERR] 7.8e push 7:717333b6:::.dir.default.1110451812.43.2:head v 767605'30158112 failed because local copy is 767605'30158924
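
(For what it's worth, I now understand that running the inconsistency report before kicking off the repair should have shown the per-shard errors and digests, which would have made it clear which replica was the bad one:

rados list-inconsistent-obj 7.8e -f json-pretty

I'm assuming here that the 'shards' section of that JSON output lists each copy's omap_digest, so the odd one out would stand out.)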

A couple of hours later we started getting reports of 503s from multiple customers. Believing that the unfound object was the cause of the problem, we used the 'mark_unfound_lost revert' option to roll back to the previous version:

ceph pg 7.8e mark_unfound_lost revert
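
(My understanding is that the unfound object, and which OSDs had been probed for it, could have been inspected first before choosing between 'revert' and 'delete', with something like:

ceph pg 7.8e list_missing
ceph pg 7.8e query

I'm going from the troubleshooting docs here, so apologies if the exact subcommand differs on 12.2.8.)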

This fixed the cluster, but broke the bucket.

Attempting to list the bucket contents results in:

[root@p3cephrgw007 ~]# radosgw-admin bucket list --bucket=backups.579
ERROR: store->list_objects(): (2) No such file or directory
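
The two shards that do still exist should still have their omap entries, so I'm assuming I can at least dump what's left of the index straight from rados with something like:

rados -p .rgw.buckets.index listomapkeys .dir.default.1110451812.43.0
rados -p .rgw.buckets.index listomapkeys .dir.default.1110451812.43.1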

This bucket appears to have been automatically resharded after we upgraded to Luminous, so we do have an old bucket instance available (but it's too old to be very helpful):

[root@p3cephrgw007 ~]# radosgw-admin metadata list bucket.instance |grep backups.579
    "backups.579:default.1110451812.43",
    "backups.579:default.28086735.566138",

Looking for all the shards based on the name only pulls up the first two shards:

[root@p3cephrgw007 ~]# rados -p .rgw.buckets.index ls | grep "default.1110451812.43"
...
.dir.default.1110451812.43.0
...
.dir.default.1110451812.43.1
...
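
To confirm the third shard object really is gone (and wasn't just missed by the listing), I'm assuming a stat on it should come back with ENOENT:

rados -p .rgw.buckets.index stat .dir.default.1110451812.43.2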

But the bucket metadata says there should be three:

[root@p3cephrgw007 ~]# radosgw-admin metadata get bucket.instance:backups.579:default.1110451812.43 | jq -r '.data.bucket_info.num_shards'
3

If we look at the log message above, it says .dir.default.1110451812.43.2 was the rados object that was slightly newer, so the revert command we ran must have removed it instead of rolling it back to the previous version.

This leaves me with some questions:

What would have been a better way to deal with this problem when the whole cluster stopped working?

Is there a way to recreate the bucket index? I see a couple of options in the docs for fixing the bucket index (--fix) and for rebuilding the bucket index (--check-objects), but I don't see any explanation of how it goes about doing that. Will it attempt to scan all the objects in the cluster to determine which ones belong in this bucket index? Will the missing shard simply be ignored, leaving the fixed bucket index missing a third of the objects?
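
For reference, the command I'm looking at running (based on those docs) is:

radosgw-admin bucket check --bucket=backups.579 --check-objects --fix

but I'd like to understand what it actually does before pointing it at this bucket.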

Thanks,

Bryan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
