Re: rgw-multisite: why do we need a extra CR for out-band data process in RGWDataSyncShardCR?

Casey Bodley <cbodley@xxxxxxxxxx> · Tue, 31 Jul 2018 17:54:15 -0400

On 07/30/2018 10:11 PM, Xinying Song wrote:
Hi, cephers:

   I'm really confused by the processing logic for out-of-band data in
rgw-multisite. Why do we spawn a new RGWDataSyncSingleEntryCR op
instead of relying on the normally spawned RGWDataSyncSingleEntryCR,
which is responsible for updating the sync marker? In my opinion, even
if we don't spawn that extra RGWDataSyncSingleEntryCR for each
out-of-band entry, the normally spawned CR will still complete the sync.
One drawback of this extra CR is that the sync marker will be updated
before the data is synchronized, because the extra CR gets the lock
first and the normally spawned CR just falls through to update the marker.

   So, what's the purpose of this extra CR? Why not remove it?

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

These out-of-band updates are intended to help in cases where data sync 
is far behind in processing the data-changes log. By prioritizing the 
buckets with recent changes, sync can appear more responsive even though 
many buckets are still behind.

Though as you've pointed out, when data sync -is- up-to-date they cause 
us to spawn multiple RGWDataSyncSingleEntryCRs for the same bucket - 
once for the out-of-band update, and again when we read it from the head 
of the data-changes log.

While it's not good that we update the marker before the first CR
completes, it isn't necessarily a correctness or crash-safety bug. In
this case, the second CR (the one spawned from the data-changes log)
will have written the EBUSY error to the error_repo before updating the
marker, so the bucket sync would still be retried in the case of a crash.
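
To make the ordering argument concrete, here is a minimal C++ sketch of
that sequence. ShardState, sync_log_entry and retried_after_restart are
hypothetical names invented for this illustration; they are not Ceph's
real types and only model "record the error, then advance the marker".

```cpp
#include <cerrno>
#include <set>
#include <string>

// Hypothetical model of the persisted per-shard sync state (not Ceph code).
struct ShardState {
  std::string marker;                // last data-changes log position marked done
  std::set<std::string> error_repo;  // buckets queued for a retry
};

// Models the second, log-driven sync attempt for one entry. `lock_held`
// means another sync for the same bucket already holds the lock.
int sync_log_entry(ShardState& s, const std::string& bucket,
                   const std::string& log_marker, bool lock_held) {
  int ret = lock_held ? -EBUSY : 0;
  if (ret < 0) {
    s.error_repo.insert(bucket);  // the failure is recorded first...
  }
  s.marker = log_marker;          // ...and only then is the marker advanced
  return ret;
}

// After a restart, a bucket is retried if it is still in the error repo,
// even though the marker has already moved past its log entry.
bool retried_after_restart(const ShardState& s, const std::string& bucket) {
  return s.error_repo.count(bucket) > 0;
}
```

If the process dies right after sync_log_entry() returns -EBUSY, the
marker is already past the entry, but the bucket is still in error_repo,
so the retry is not lost.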

But these duplicate CRs are still wasteful and we should fix that! I did 
some work in this direction a couple years ago with 
https://github.com/ceph/ceph/pull/10615, but it didn't get enough 
testing/validation to merge. The strategy there was to register these 
out-of-band updates and error_repo retries with the marker_tracker, and 
use that to detect when we're trying to register a duplicate. Then if we 
start bucket sync without a log marker (i.e. from out-of-band or 
error_repo) and later try again with a valid marker, the initial bucket 
sync CR will pass that marker to marker_tracker->finish() once it completes.
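
A rough C++ sketch of that strategy, to show the intended behaviour
rather than the code in the pull request. BucketSyncRegistry and its
methods are hypothetical names for this example and are not the real
marker_tracker interface; the returned marker stands in for the value
that would be handed to marker_tracker->finish().

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the tracker described above. It only models
// duplicate detection and deferred marker completion; it is not Ceph code.
class BucketSyncRegistry {
 public:
  enum class Start { Spawn, Duplicate };

  // Register a sync attempt for `key`. `marker` is empty for out-of-band
  // or error_repo retries and set for entries read from the data-changes log.
  Start try_start(const std::string& key,
                  const std::optional<std::string>& marker) {
    auto [it, inserted] = in_flight_.try_emplace(key, marker);
    if (inserted) {
      return Start::Spawn;  // nothing running yet, the caller spawns a sync
    }
    // A sync is already running. Remember the log marker so the running
    // sync can finish it, instead of spawning a second sync.
    if (marker) {
      it->second = marker;
    }
    return Start::Duplicate;
  }

  // Called when the running sync completes. Returns the log marker, if
  // one was registered, that the caller should now mark as finished.
  std::optional<std::string> complete(const std::string& key) {
    auto it = in_flight_.find(key);
    if (it == in_flight_.end()) {
      return std::nullopt;
    }
    std::optional<std::string> marker = std::move(it->second);
    in_flight_.erase(it);
    return marker;
  }

 private:
  std::map<std::string, std::optional<std::string>> in_flight_;
};
```

With this shape, an out-of-band try_start() returns Spawn, a later
try_start() for the same key with a log marker returns Duplicate, and
complete() then yields that marker, so it is only finished after the
data has actually been synced.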

Does that make sense? Would you be interested in reviving that project?

Thanks,
Casey