[RGW] multisite sync, stalled recovering shards

Hi there!
 
Suspecting a replication problem between two clusters, I ran a
radosgw-admin data sync init on the secondary zone.
Since then, after a lot of activity, I'm stuck with recovering
shards; nothing moves. Incremental sync still works.
Wondering whether I also had a bad state on the primary, I ran a
data sync init on the primary as well...
And now it's also stuck with recovering shards!
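 
For reference, this is roughly how I've been watching the state on
each zone ({source-zone} standing in for my real zone name):

    radosgw-admin sync status
    radosgw-admin data sync status --source-zone={source-zone}

Both report recovering shards, and the list never shrinks.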
 
In the sync error list, I can find some "failed to sync bucket
instance: (125) Operation canceled" errors.
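 
For completeness, I'm reading them with plain:

    radosgw-admin sync error list

I understand I could clear old entries with radosgw-admin sync error
trim, but I haven't, in case they're still useful for debugging.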
 
I also tried to rewrite some of the buckets shown in those errors, but
nothing changed. Strangely, in those errors the object names are not
real objects, for example:
"name": "replic_cfn_rec/cfb0047:aefd4003-1866-4b16-b1b3-2f308075cd1c.20298566.4:11[0]"

I wonder what this ending ":11[0]" means.
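 
My guess is that the name is encoded as bucket:instance-id:shard[generation],
which would make this shard 11, generation 0, of that bucket's index,
but confirmation would be welcome. I've been inspecting the instance
metadata with something like:

    radosgw-admin metadata get bucket.instance:replic_cfn_rec/cfb0047:aefd4003-1866-4b16-b1b3-2f308075cd1c.20298566.4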
 
I also tried to remove stale instances, but that changed nothing.
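Concretely (assuming these are the right commands for that):

    radosgw-admin reshard stale-instances list
    radosgw-admin reshard stale-instances rm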
I still haven't retried a data sync init on the secondary; perhaps I should, but the activity it generates is impactful.
 
Can we reduce the priority of that resync activity?
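 
If there is no priority knob as such, would lowering the sync spawn
windows be a reasonable throttle? I'm thinking of something like this,
assuming those options act as concurrency limits:

    ceph config set client.rgw rgw_data_sync_spawn_window 4
    ceph config set client.rgw rgw_bucket_sync_spawn_window 4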
 
Ah, my primary cluster is on Reef 18.2.4; the secondary is still on 18.2.2 (it needs an OS upgrade, Ubuntu 18.04).
 
--  
Gilles

 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



