On Thu, Sep 7, 2017 at 10:04 PM, David Turner <drakonstein@xxxxxxxxx> wrote:

> One realm is called public, with a zonegroup called public-zg and a zone for each datacenter. The second realm is called internal, with a zonegroup called internal-zg and a zone for each datacenter. They each have their own RGWs and load balancers. The needs of our public-facing RGWs and load balancers were different enough from the internal-use ones that we split them up completely. We also have a local realm that does not use multisite, and a fourth realm called QA that mimics the public realm as closely as possible for staging configuration changes for the RGW daemons. All four realms have their own buckets, users, etc., and that is all working fine. For all of the radosgw-admin commands I am using the proper identifiers to make sure that each datacenter and realm are running commands on exactly what I expect them to (--rgw-realm=public --rgw-zonegroup=public-zg --rgw-zone=public-dc1 --source-zone=public-dc2).
>
> The data sync issue was in the internal realm, but running a data sync init and kickstarting the RGW daemons in each datacenter fixed the data discrepancies (I'm thinking it had something to do with a power failure a few months back that I just noticed recently). The metadata sync issue is in the public realm. I have no idea what is causing it to not sync properly, since running a `metadata sync init` catches it back up to the primary zone, but then it doesn't receive any new users created after that.

Sounds like an issue with the metadata log in the primary master zone. I'm not sure what could go wrong there, but maybe the master zone doesn't know that it is a master zone, or it's set to not log metadata. Or maybe there's a problem when the secondary is trying to fetch the metadata log. Maybe some kind of mismatch in the number of shards (though that's not likely). Try to see if the master logs any changes: you can use the 'radosgw-admin mdlog list' command.

Yehuda
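For example, Yehuda's suggestion might look roughly like this in practice, using the identifiers David gave above (public-dc1 here stands in for whichever zone is the metadata master; verify the exact flags, and the log_meta field name, against radosgw-admin --help on 10.2.x, since this is a sketch from memory):

    # does the master zone record metadata changes at all?
    # (some versions also want a --period=<id>)
    radosgw-admin mdlog list --rgw-realm=public --rgw-zone=public-dc1

    # is the master zone configured to log metadata? look for "log_meta": "true"
    # on the master zone's entry in the zonegroup map
    radosgw-admin zonegroup get --rgw-realm=public --rgw-zonegroup=public-zg

If mdlog list stays empty right after creating a test user on the master, that would point at the logging side rather than at the secondary's fetch.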
> On Thu, Sep 7, 2017 at 2:52 PM Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:
>>
>> On Thu, Sep 7, 2017 at 7:44 PM, David Turner <drakonstein@xxxxxxxxx> wrote:
>> > Ok, I've been testing, investigating, researching, etc. for the last week, and I don't have any problems with data syncing. The clients on one side are creating multipart objects while the multisite sync is creating them as whole objects, and one of the datacenters is slower at cleaning up the shadow files. That's the big discrepancy between object counts in the pools between datacenters. I created a tool that goes through each bucket in a realm, does a recursive listing of all objects in it for both datacenters, and compares the two lists for any differences. The data is definitely in sync between the two datacenters, down to the modified time and byte of each file in S3.
>> >
>> > The metadata is still not syncing for the other realm, though. If I run `metadata sync init` then the second datacenter will catch up with all of the new users, but until I do that, newly created users on the primary side don't exist on the secondary side. `metadata sync status`, `sync status`, `metadata sync run` (only left running for 30 minutes before I ctrl+c it), etc. don't show any problems... but the new users just don't exist on the secondary side until I run `metadata sync init`. I created a new bucket with the new user and the bucket shows up in the second datacenter, but no objects, because the objects don't have a valid owner.
>> >
>> > Thank you all for the help with the data sync issue. You pushed me in good directions. Does anyone have any insight as to what is preventing the metadata from syncing in the other realm? I have two realms being synced using multisite, and only one of them isn't getting the metadata across. As far as I can tell they are configured identically.
>>
>> What do you mean you have two realms? Zones and zonegroups need to exist in the same realm in order for meta and data sync to happen correctly. Maybe I'm misunderstanding.
>>
>> Yehuda
>>
>> > On Thu, Aug 31, 2017 at 12:46 PM David Turner <drakonstein@xxxxxxxxx> wrote:
>> >>
>> >> All of the messages from sync error list are listed below. The number on the left is how many times the error message is found.
>> >>
>> >>   1811  "message": "failed to sync bucket instance: (16) Device or resource busy"
>> >>      7  "message": "failed to sync bucket instance: (5) Input\/output error"
>> >>     65  "message": "failed to sync object"
>> >>
>> >> On Tue, Aug 29, 2017 at 10:00 AM Orit Wasserman <owasserm@xxxxxxxxxx> wrote:
>> >>>
>> >>> Hi David,
>> >>>
>> >>> On Mon, Aug 28, 2017 at 8:33 PM, David Turner <drakonstein@xxxxxxxxx> wrote:
>> >>>>
>> >>>> The vast majority of the sync error list is "failed to sync bucket instance: (16) Device or resource busy". I can't find anything on Google about this error message in relation to Ceph. Does anyone have any idea what it means and/or how to fix it?
>> >>>
>> >>> Those are intermediate errors resulting from several radosgw instances trying to acquire the same sync log shard lease. They don't affect the sync progress. Are there any other errors?
>> >>>
>> >>> Orit
>> >>>>
>> >>>> On Fri, Aug 25, 2017 at 2:48 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
>> >>>>>
>> >>>>> Hi David,
>> >>>>>
>> >>>>> The 'data sync init' command won't touch any actual object data, no. Resetting the data sync status will just cause a zone to restart a full sync of the --source-zone's data changes log. This log only lists which buckets/shards have changes in them, which causes radosgw to consider them for bucket sync. So while the command may silence the warnings about data shards being behind, it's unlikely to resolve the issue with missing objects in those buckets.
>> >>>>>
>> >>>>> When data sync is behind for an extended period of time, it's usually because it's stuck retrying previous bucket sync failures. The 'sync error list' may help narrow down where those failures are.
>> >>>>>
>> >>>>> There is also a 'bucket sync init' command to clear the bucket sync status. Following that with a 'bucket sync run' should restart a full sync on the bucket, pulling in any new objects that are present on the source zone. I'm afraid that those commands haven't seen a lot of polish or testing, however.
>> >>>>>
>> >>>>> Casey
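To make Casey's suggestion concrete, the workflow would look roughly like this, run on the zone that is behind. Bucket, realm, and zone names are placeholders, and the exact flags should be checked against radosgw-admin --help for your version; this is only a sketch:

    # see which buckets/objects are failing and why
    radosgw-admin sync error list --rgw-realm=<realm> --rgw-zone=<local-zone>

    # reset the sync status for one affected bucket, then re-run a full sync of it
    radosgw-admin bucket sync init --bucket=<bucket> --source-zone=<remote-zone> --rgw-realm=<realm> --rgw-zone=<local-zone>
    radosgw-admin bucket sync run --bucket=<bucket> --source-zone=<remote-zone> --rgw-realm=<realm> --rgw-zone=<local-zone>

As Casey notes, these commands haven't seen much polish, so trying them on a single small bucket first seems prudent.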
>> >>>>> On 08/24/2017 04:15 PM, David Turner wrote:
>> >>>>>
>> >>>>> Apparently the data shards that are behind go in both directions, but only one zone is aware of the problem. Each cluster has objects in its data pool that the other doesn't have. I'm thinking about initiating a `data sync init` on both sides (one at a time) to get them back on the same page. Does anyone know if that command will overwrite any local data that the zone has that the other doesn't if you run `data sync init` on it?
>> >>>>>
>> >>>>> On Thu, Aug 24, 2017 at 1:51 PM David Turner <drakonstein@xxxxxxxxx> wrote:
>> >>>>>>
>> >>>>>> After restarting the 2 RGW daemons on the second site again, everything caught up on the metadata sync. Is there something about having 2 RGW daemons on each side of the multisite that might be causing an issue with the sync getting stale? I have another realm set up the same way that is having a hard time with its data shards being behind. I haven't told them to resync, but yesterday I noticed 90 shards were behind. It's caught back up to only 17 shards behind, but the oldest change not applied is 2 months old and no order of restarting RGW daemons is helping to resolve this.
>> >>>>>>
>> >>>>>> On Thu, Aug 24, 2017 at 10:59 AM David Turner <drakonstein@xxxxxxxxx> wrote:
>> >>>>>>>
>> >>>>>>> I have an RGW multisite 10.2.7 setup for bi-directional syncing. It has been operational for 5 months and working fine. I recently created a new user on the master zone, used that user to create a bucket, and put a public-acl object in there. The bucket was created on the second site, but the user was not, and the object errors out complaining that the access_key doesn't exist.
>> >>>>>>>
>> >>>>>>> That led me to think that the metadata isn't syncing, while bucket and data both are. I've also confirmed that data is syncing for other buckets as well in both directions. The sync status from the second site was this:
>> >>>>>>>
>> >>>>>>>   metadata sync syncing
>> >>>>>>>     full sync: 0/64 shards
>> >>>>>>>     incremental sync: 64/64 shards
>> >>>>>>>     metadata is caught up with master
>> >>>>>>>   data sync source: f4c12327-4721-47c9-a365-86332d84c227 (public-atl01)
>> >>>>>>>     syncing
>> >>>>>>>     full sync: 0/128 shards
>> >>>>>>>     incremental sync: 128/128 shards
>> >>>>>>>     data is caught up with source
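(For reference, output like the above is what `radosgw-admin sync status` prints when run against the secondary zone, and the metadata-only detail comes from `radosgw-admin metadata sync status`; both commands are the ones David refers to elsewhere in the thread. Roughly, with realm/zone placeholders:)

    radosgw-admin sync status --rgw-realm=<realm> --rgw-zone=<secondary-zone>
    radosgw-admin metadata sync status --rgw-realm=<realm> --rgw-zone=<secondary-zone>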
>> >>>>>>>
>> >>>>>>> Sync status leads me to think that the second site believes it is up to date, even though it is missing a freshly created user. I restarted all of the RGW daemons for the zonegroup, but it didn't trigger anything to fix the missing user on the second site. I did some googling, found the sync init commands mentioned in a few ML posts, used metadata sync init, and now have this as the sync status:
>> >>>>>>>
>> >>>>>>>   metadata sync preparing for full sync
>> >>>>>>>     full sync: 64/64 shards
>> >>>>>>>     full sync: 0 entries to sync
>> >>>>>>>     incremental sync: 0/64 shards
>> >>>>>>>     metadata is behind on 70 shards
>> >>>>>>>     oldest incremental change not applied: 2017-03-01 21:13:43.0.126971s
>> >>>>>>>   data sync source: f4c12327-4721-47c9-a365-86332d84c227 (public-atl01)
>> >>>>>>>     syncing
>> >>>>>>>     full sync: 0/128 shards
>> >>>>>>>     incremental sync: 128/128 shards
>> >>>>>>>     data is caught up with source
>> >>>>>>>
>> >>>>>>> It definitely triggered a fresh sync and told it to forget about what it had previously applied, as the date of the oldest change not applied is the day we initially set up multisite for this zone. The problem is that that was over 12 hours ago and the sync status hasn't caught up on any shards yet.
>> >>>>>>>
>> >>>>>>> Does anyone have any suggestions other than blasting the second site and setting it back up with a fresh start (the only option I can think of at this point)?
>> >>>>>>>
>> >>>>>>> Thank you,
>> >>>>>>> David Turner
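One low-level check that can help in this situation is to confirm directly whether the new user's metadata entry ever reached the secondary zone, independently of what the sync status reports. A rough sketch, with the uid, realm, and zone as placeholders and the usual --rgw-realm/--rgw-zone identifiers added as needed:

    # on the master zone: confirm the user appears in the user metadata section
    radosgw-admin metadata list user --rgw-realm=<realm> --rgw-zone=<master-zone>

    # on the secondary zone: see whether the same entry exists there
    radosgw-admin metadata get user:<uid> --rgw-realm=<realm> --rgw-zone=<secondary-zone>

If the secondary never gets the entry even while `metadata sync status` claims to be caught up, that points back at the metadata log on the master, which is where Yehuda's 'mdlog list' suggestion at the top of the thread comes in.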