On Thu, Mar 18, 2021 at 8:55 AM WeiGuo Ren <rwg1335252904@xxxxxxxxx> wrote:
>
> radosgw-admin sync error list
> [
>     {
>         "shard_id": 0,
>         "entries": [
>             {
>                 "id": "1_1614333890.956965_8080774.1",
>                 "section": "data",
>                 "name": "user21-bucket23:multi_master-anna.1827103.323:54",
>                 "timestamp": "2021-02-26 10:04:50.956965Z",
>                 "info": {
>                     "source_zone": "multi_master-anna",
>                     "error_code": 125,
>                     "message": "failed to sync bucket instance: (125) Operation canceled"
>                 }
>             }
>         ]
>     }
> ]
>
> I think this command can be used to tune the parameter: keep
> increasing it, and as long as -ECANCELED (125) errors no longer
> appear, the value is appropriate.
>
> WeiGuo Ren <rwg1335252904@xxxxxxxxx> wrote on Thursday, March 18, 2021 at 7:37 PM:
> >
> > I have a Ceph cluster in which the rgw instances often fail to
> > renew the lease lock.
> >
> > WeiGuo Ren <rwg1335252904@xxxxxxxxx> wrote on Thursday, March 18, 2021 at 7:35 PM:
> > >
> > > In an rgw multi-site production environment, how many rgw instances
> > > will be started in a single zone?

it depends on the scale, but i'd guess anywhere from 2-8? if the zone
is serving clients (not just DR), it can make sense to dedicate some
of the rgws to clients (by setting rgw_run_sync_thread=0, and not
including their endpoints in the zone configuration), and others just
to sync. so i think it's easy enough to control how many gateways are
contending for these leases.

you can raise the lease period, but that means it will take longer
for sync to recover from a radosgw shutdown/restart. the shard locks
it held will take longer to expire, preventing other gateways from
resuming sync on those shards.

> > > According to my test, multiple rgw instances will compete for the
> > > datalog lease lock, and it is very likely that the lease will not
> > > be renewed. Is the default rgw_sync_lease_period=120s a bit small?
>
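
As a rough illustration of the "dedicated client vs. sync gateways" layout
described above (a sketch only -- the instance names, hosts and ports here
are made up, not taken from this thread): the client-facing gateways get
rgw_run_sync_thread disabled and are left out of the zone's endpoint list,
while the sync-dedicated gateways keep the default and are the only ones
advertised as zone endpoints.

    # ceph.conf on a client-facing gateway (hypothetical instance name)
    [client.rgw.client1]
    rgw_frontends = beast port=8080
    rgw_run_sync_thread = false    # this gateway never runs multisite sync

    # ceph.conf on a sync-dedicated gateway
    [client.rgw.sync1]
    rgw_frontends = beast port=8081
    # rgw_run_sync_thread defaults to true, so no override is needed

    # only the sync gateways are listed as endpoints of the local zone
    radosgw-admin zone modify --rgw-zone=<your-zone> \
        --endpoints=http://sync1.example.com:8081
    radosgw-admin period update --commit

With a layout like this, only the sync-dedicated gateways contend for the
datalog shard leases, so the number of competitors for the lease lock stays
under your control no matter how many client-facing gateways you run.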
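
And for the lease period itself, a minimal sketch of the tuning loop the
original poster describes (the 180s value is only an example, not a
recommendation from this thread): raise rgw_sync_lease_period on the
gateways that run sync, restart them, and re-check the error list.

    # ceph.conf on the sync gateways -- the default is 120 seconds
    [client.rgw.sync1]
    rgw_sync_lease_period = 180

    # after restarting the gateways, watch whether new
    # "(125) Operation canceled" entries keep appearing
    radosgw-admin sync error list

The trade-off is the one described above: with a longer lease, a crashed or
restarted gateway holds its shard locks longer, so the other gateways wait
longer before they can resume sync on those shards.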