Re: Revisit Large OMAP Objects

I have the same issue and have joined the club.
Almost every deleted bucket is still there due to multisite. I have also
removed the secondary zone and stopped sync, but these stale instances are
still there.
Before adding a new secondary zone I want to remove them. If you are going
to run anything, please let me know.
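
For reference, this is roughly what I was planning to check before touching
anything (just a sketch):

  # confirm no zone is still expected to sync from this one
  radosgw-admin sync status

  # list the stale instances and compare against the buckets that still exist
  radosgw-admin reshard stale-instances list --yes-i-really-mean-it
  radosgw-admin bucket list

My assumption is that an instance whose bucket no longer appears in
"bucket list" is a candidate for removal once nothing needs to sync from it,
but I have not run the rm step yet.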




On Wed, 14 Apr 2021 at 21:20, <DHilsbos@xxxxxxxxxxxxxx> wrote:

> Casey;
>
> That makes sense, and I appreciate the explanation.
>
> If I were to shut down all uses of RGW, and wait for replication to catch
> up, would this then address most known issues with running this command in
> a multi-site environment?  Can I offline RADOSGW daemons as an added
> precaution?
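>
> For context, this is how I was planning to confirm that replication has
> caught up before running it (just a sketch; the bucket name is only an
> example):
>
>   radosgw-admin sync status
>   radosgw-admin bucket sync status --bucket=nextcloud
>
> I would only proceed once both report that they are caught up with the
> source zone.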
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director – Information Technology
> Perform Air International Inc.
> DHilsbos@xxxxxxxxxxxxxx
> www.PerformAir.com
>
>
> -----Original Message-----
> From: Casey Bodley [mailto:cbodley@xxxxxxxxxx]
> Sent: Wednesday, April 14, 2021 9:03 AM
> To: Dominic Hilsbos
> Cc: k0ste@xxxxxxxx; ceph-users@xxxxxxx
> Subject: Re:  Re: Revisit Large OMAP Objects
>
> On Wed, Apr 14, 2021 at 11:44 AM <DHilsbos@xxxxxxxxxxxxxx> wrote:
> >
> > Konstantin;
> >
> > Dynamic resharding is disabled in multisite environments.
> >
> > I believe you mean radosgw-admin reshard stale-instances rm.
> >
> > Documentation suggests this shouldn't be run in a multisite
> > environment.  Does anyone know the reason for this?
>
> Say there's a bucket with 10 objects in it, and that's been fully
> replicated to a secondary zone. If you want to remove the bucket, you
> delete its objects, then delete the bucket.
>
> When the bucket is deleted, rgw can't delete its bucket instance yet,
> because the secondary zone may not be caught up with sync - it
> requires access to the bucket instance (and its index) to sync those
> last 10 object deletions.
>
> So the risk with 'stale-instances rm' in multisite is that you might
> delete instances before other zones catch up, which can lead to
> orphaned objects.
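>
> As a rough illustration of the kind of check that reduces that risk
> (taking one of the instances from your list as an example), you would
> want every zone to report that it is caught up before removing anything:
>
>   # run on each zone
>   radosgw-admin sync status
>
>   # inspect a particular stale instance before deciding
>   radosgw-admin metadata get bucket.instance:nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1
>
> If every zone is caught up and the bucket itself is gone, that instance
> is no longer needed for sync.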
>
> >
> > Is it, in fact, safe, even in a multisite environment?
> >
> > Thank you,
> >
> > Dominic L. Hilsbos, MBA
> > Director – Information Technology
> > Perform Air International Inc.
> > DHilsbos@xxxxxxxxxxxxxx
> > www.PerformAir.com
> >
> >
> > -----Original Message-----
> > From: Konstantin Shalygin [mailto:k0ste@xxxxxxxx]
> > Sent: Wednesday, April 14, 2021 12:15 AM
> > To: Dominic Hilsbos
> > Cc: ceph-users@xxxxxxx
> > Subject: Re:  Revisit Large OMAP Objects
> >
> > Run reshard instances rm, then reshard your bucket by hand or leave the
> > dynamic resharding process to do this work.
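> >
> > For the manual reshard, something along these lines (the shard count is
> > just an example; size it to your object count):
> >
> >   radosgw-admin bucket reshard --bucket=<bucket-name> --num-shards=101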
> >
> >
> > k
> >
> > Sent from my iPhone
> >
> > > On 13 Apr 2021, at 19:33, DHilsbos@xxxxxxxxxxxxxx wrote:
> > >
> > > All;
> > >
> > > We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 -->
> > > 14.2.16).
> > >
> > > Initially our bucket grew very quickly, as I was loading old data into
> > > it, and we quickly ran into Large OMAP Object warnings.
> > >
> > > I have since done a couple of manual reshards, which has fixed the
> > > warning on the primary cluster.  I have never been able to get rid of
> > > the issue on the cluster with the replica.
> > >
> > > A prior conversation on this list led me to this command:
> > > radosgw-admin reshard stale-instances list --yes-i-really-mean-it
> > >
> > > The results of which look like this:
> > > [
> > >    "nextcloud-ra:f91aeff8-a365-47b4-a1c8-928cd66134e8.185262.1",
> > >    "nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.6",
> > >    "nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.2",
> > >    "nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.5",
> > >    "nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.4",
> > >    "nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.3",
> > >    "nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1",
> > >    "3520ae821f974340afd018110c1065b8/OS
> Development:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.1",
> > >
> "10dfdfadb7374ea1ba37bee1435d87ad/volumebackups:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.2",
> > >    "WorkOrder:f91aeff8-a365-47b4-a1c8-928cd66134e8.44130.1"
> > > ]
> > >
> > > I find this particularly interesting, as the nextcloud-ra, <swift>/OS
> > > Development, <swift>/volumebackups, and WorkOrder buckets no longer
> > > exist.
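> > >
> > > (By "no longer exist" I mean they no longer show up in a bucket
> > > listing; for example:
> > >
> > > radosgw-admin bucket list | grep -i -e nextcloud-ra -e workorder
> > >
> > > comes back empty.)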
> > >
> > > When I run:
> > > for obj in $(rados -p 300.rgw.buckets.index ls | grep f91aeff8-a365-47b4-a1c8-928cd66134e8.3512190.1); do
> > >   printf "%-60s %7d\n" $obj $(rados -p 300.rgw.buckets.index listomapkeys $obj | wc -l)
> > > done
> > >
> > > I get the expected 64 entries, with counts around 20000 +/- 1000.
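> > >
> > > (64 shards at roughly 20,000 keys each works out to about 1.28 million
> > > index keys in total, and 20,000 keys per shard is well under the default
> > > large OMAP warning threshold of 200,000 keys per index object, if I
> > > remember right.)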
> > >
> > > Are the above listed stale instances ok to delete?  If so, how do I go
> > > about doing so?
> > >
> > > Thank you,
> > >
> > > Dominic L. Hilsbos, MBA
> > > Director - Information Technology
> > > Perform Air International Inc.
> > > DHilsbos@xxxxxxxxxxxxxx
> > > www.PerformAir.com
> > >
> > > _______________________________________________
> > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



