(cc Yehuda and dev list)

On Fri, Apr 15, 2022 at 5:33 AM Edgelong Voodu <1070443499cs@xxxxxxxxx> wrote:
>
> hi, Casey:
> I want to implement ExistingObjectReplication, which is not yet implemented in rgw-multisite bucket-granularity sync. (see https://docs.aws.amazon.com/zh_cn/AmazonS3/latest/API/API_ExistingObjectReplication.html)

cool! the existing behavior corresponds to ExistingObjectReplication=Enabled, so this discussion is about adding the Disabled case

> There is a key question about this feature: how can I identify which objects were created before or after the PUT of the bucket replication configuration?
> here are some of my thoughts:
> 1) because of clock skew, I don't think it is a good idea to compare the m_time of the object with that of the replication configuration. If we compare m_time, we may miss some objects, or sync objects with the wrong m_time.
> 2) some old bilog entries may have been trimmed, so not every object has its own corresponding bilog entry. we can get the latest bilog marker when PutBucketReplication executes, but what about the remaining objects (which have no bilog marker)?
>
> would you provide some advice or ideas about this feature?
> thank you.
>

this sounds complicated for two main reasons:

* the consistency models for metadata and data are completely separate. so if bucket sync needs to look at a timestamp in its bucket metadata, it has no way to know whether that's the *latest* version of the bucket metadata. i think this is an issue with bucket replication policy in general
* the interaction between 'bucket full sync', 'bucket incremental sync', and bilog trimming. as you said in 2) above, we may trim bilogs that other zones haven't processed yet, because we assume those changes would be covered by a 'bucket full sync'

disabling ExistingObjectReplication sounds a lot like skipping the 'bucket full sync' step.
but for this to work correctly, a) 'bucket incremental sync' would need to know where in the bilogs to start so that it only sees the events that happened after the 'disable', and b) we'd need to prevent those bilog entries from being trimmed

for 'a)', the metadata master zone handling the PutBucketReplication op could record its own bilog markers, but it doesn't know the current markers on other zones - and active-active sync would require those markers too. for 'b)', it's probably not desirable to leave untrimmed entries around like this

in the end, it may be better to keep the existing structure of bucket full/incremental sync, but filter everything based on mtime as you suggest in '1)' above. that may not be perfect in the presence of time skew, but skew is already a factor in sync - all we can promise is that every zone would make the same decisions and end up with the same result

we'd also need to consider what happens when PutBucketReplication changes the value of ExistingObjectReplication after other zones have made it through full sync. if it changes from Disabled->Enabled, each zone would have to restart a 'bucket full sync' to catch anything it missed last time. there's some precedent for this (restarting a bucket full sync) in `radosgw-admin bucket sync enable`, but that's built into data sync itself. i don't think there's a good way for metadata sync to trigger that from the outside
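
to make the mtime filter idea a bit more concrete, here is a rough sketch in Python. this is not rgw code, and every name in it (`ReplicationRule`, `created_at`, `should_replicate`, `needs_full_sync_restart`) is hypothetical. it assumes the cutoff timestamp is stored in the replication policy itself rather than read from each zone's local clock, so that every zone holding the same version of the policy makes the same decision about a given object. it does not address the separate problem of a zone holding a stale version of the bucket metadata.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class ExistingObjectReplication(Enum):
    ENABLED = "Enabled"
    DISABLED = "Disabled"


@dataclass(frozen=True)
class ReplicationRule:
    # Hypothetical field: when the rule was written, recorded once in the
    # policy so that all zones compare against the same value.
    created_at: datetime
    existing_objects: ExistingObjectReplication


def should_replicate(rule: ReplicationRule, object_mtime: datetime) -> bool:
    """Filter applied identically during full sync and incremental sync."""
    if rule.existing_objects is ExistingObjectReplication.ENABLED:
        return True
    # Disabled: skip objects whose recorded mtime predates the rule.
    # Clock skew can misclassify objects near the cutoff, but the result
    # depends only on the two stored timestamps, not on which zone asks.
    return object_mtime >= rule.created_at


def needs_full_sync_restart(old: ReplicationRule, new: ReplicationRule) -> bool:
    """Disabled -> Enabled means objects skipped earlier must be revisited."""
    return (old.existing_objects is ExistingObjectReplication.DISABLED
            and new.existing_objects is ExistingObjectReplication.ENABLED)


if __name__ == "__main__":
    cutoff = datetime(2022, 4, 15, 5, 33, tzinfo=timezone.utc)
    rule = ReplicationRule(cutoff, ExistingObjectReplication.DISABLED)
    print(should_replicate(rule, cutoff - timedelta(hours=1)))
    print(should_replicate(rule, cutoff + timedelta(hours=1)))
```

the point of using the same predicate in both full and incremental sync is that bilog trimming no longer matters for correctness of the filter: an object that falls back to full sync is judged by the same rule as one seen through the bilog.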