On Thu, Mar 18, 2021 at 8:55 AM WeiGuo Ren <rwg1335252904@xxxxxxxxx> wrote:
>
> radosgw-admin sync error list
> [
>     {
>         "shard_id": 0,
>         "entries": [
>             {
>                 "id": "1_1614333890.956965_8080774.1",
>                 "section": "data",
>                 "name": "user21-bucket23:multi_master-anna.1827103.323:54",
>                 "timestamp": "2021-02-26 10:04:50.956965Z",
>                 "info": {
>                     "source_zone": "multi_master-anna",
>                     "error_code": 125,
>                     "message": "failed to sync bucket instance: (125) Operation canceled"
>                 }
>             }
>         ]
>     }
> ]
>
> I think this command can be used to tune the parameter: keep
> increasing it, and as long as -ECANCELED (125) errors no longer
> appear, the value is appropriate.
>
> WeiGuo Ren <rwg1335252904@xxxxxxxxx> wrote on Thursday, March 18, 2021 at 7:37 PM:
> >
> > I have a Ceph cluster in which the rgw instances often fail to
> > renew the lease lock.
> >
> > WeiGuo Ren <rwg1335252904@xxxxxxxxx> wrote on Thursday, March 18, 2021 at 7:35 PM:
> > >
> > > In an rgw multi-site production environment, how many rgw instances
> > > will be started in a single zone?

it depends on the scale, but i'd guess anywhere from 2-8? if the zone
is serving clients (not just DR), it can make sense to dedicate some
of the rgws to clients (by setting rgw_run_sync_thread=0, and not
including their endpoints in the zone configuration), and others just
to sync. so i think it's easy enough to control how many gateways are
contending for these leases.

you can raise the lease period, but that means it will take longer
for sync to recover from a radosgw shutdown/restart. the shard locks
it held will take longer to expire, preventing other gateways from
resuming sync on those shards.

> > > According to my test, multiple rgw instances will compete for the
> > > datalog lease lock, and it is very likely that the lease will not
> > > be renewed. Is the default rgw_sync_lease_period=120s a bit small?
>
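
As a rough illustration of the "dedicated client vs. sync gateways" layout
described above (a sketch only -- the instance names, hosts and ports here
are made up, not taken from this thread): the client-facing gateways get
rgw_run_sync_thread disabled and are left out of the zone's endpoint list,
while the sync-dedicated gateways keep the default and are the only ones
advertised as zone endpoints.

    # ceph.conf on a client-facing gateway (hypothetical instance name)
    [client.rgw.client1]
    rgw_frontends = beast port=8080
    rgw_run_sync_thread = false    # this gateway never runs multisite sync

    # ceph.conf on a sync-dedicated gateway
    [client.rgw.sync1]
    rgw_frontends = beast port=8081
    # rgw_run_sync_thread defaults to true, so no override is needed

    # only the sync gateways are listed as endpoints of the local zone
    radosgw-admin zone modify --rgw-zone=<your-zone> \
        --endpoints=http://sync1.example.com:8081
    radosgw-admin period update --commit

With a layout like this, only the sync-dedicated gateways contend for the
datalog shard leases, so the number of competitors for the lease lock stays
under your control no matter how many client-facing gateways you run.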
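
And for the lease period itself, a minimal sketch of the tuning loop the
original poster describes (the 180s value is only an example, not a
recommendation from this thread): raise rgw_sync_lease_period on the
gateways that run sync, restart them, and re-check the error list.

    # ceph.conf on the sync gateways -- the default is 120 seconds
    [client.rgw.sync1]
    rgw_sync_lease_period = 180

    # after restarting the gateways, watch whether new
    # "(125) Operation canceled" entries keep appearing
    radosgw-admin sync error list

The trade-off is the one described above: with a longer lease, a crashed or
restarted gateway holds its shard locks longer, so the other gateways wait
longer before they can resume sync on those shards.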