As a matter of fact, yes! The data sync was falling far behind on one
shard, and I had to shut down the rgws in the second site to keep the
rgws in the master from allocating all the memory and then getting
killed by oom_killer. I'm not sure why the shard won't sync, but my
guess is that it's due to a delete operation in the oversized bucket:
a long-running rm command was aborted after running for over 7 days.

Is there a way to manually resync the failing site? It seems I cannot
reshard the bucket until the sites are properly connected again.

Thanks!
/andreas

On 29 May 2017 at 07:55, Василий Ангапов <angapov@xxxxxxxxx> wrote:
> I have almost the same problem except that "bucket reshard" gives me
> "(5) Input/output error" (Red Hat Ceph Storage 2.2 or Ceph 10.2.5).
> Had the discussion with Red Hat Support and they told me that it is
> related to malfunctioning RGW multisite replication. Do you have
> multisite configuration?
>
> Regards, Vasily
>
> 2017-05-26 23:28 GMT+03:00 Andreas Calminder <andreas.calminder@xxxxxxxxxx>:
>> Hi,
>> Posted this in ceph-users earlier, thought I'd try here as well. Running
>> Jewel (10.2.7). While trying to get rid of an oversized bucket (+14M
>> objects) I tried to reshard the bucket index to be able to remove it
>> without having the rgw run out of memory.
>>
>> As per the Red Hat documentation I ran
>> # radosgw-admin bucket reshard --bucket=oversized_bucket --num-shards=300
>> Noted the old instance id and waited for it to output a count of all
>> items, at the very end the command spits out "ERROR: bi_list(): (4)
>> Interrupted system call"
>>
>> Now I have the new bucket instance with a sharded index (300),
>> seemingly unused and the old instance id of the bucket with no shards,
>> which seems to be active
>>
>> # radosgw-admin --cluster drceph-tcs-prod metadata get
>> bucket:oversized_bucket returns the old instance id in bucket_id
>>
>> Two questions:
>>
>> * How do I remove the new bucket id from the failed reshard command?
>> Since it's not used it's confusing to have it floating around
>> * How do I actually reshard the oversized_bucket? - Actually, I really
>> don't care about the bucket, if there's a way to remove the bucket and
>> its objects without altering the index, causing the radosgw to
>> allocate all memory available and crash, I'd rather do that.
>>
>> Regards,
>> Andreas
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Andreas Calminder
System Administrator
IT Operations Core Services
Klarna AB (publ)
Sveavägen 46, 111 34 Stockholm
Tel: +46 8 120 120 00
Reg no: 556737-0431
klarna.com