the issue of rgw sync concurrency

We observed some abnormal sync status when running multiple rgw
(15.2.14) instances for multisite sync, so I dug into the rgw sync code.
If I understand correctly, there is an issue in the rgw sync
concurrency implementation.
A critical process that is meant to run only once is implemented with
the following steps:
1. read shared status object
2. check status
3. lock status
4. critical process
5. store status
6. unlock

This is a problem under concurrency: the critical process can run
multiple times, because each instance acts on the stale status it read
before taking the lock, which defeats the purpose of the lock.
The steps should instead be:
1. read shared status object
2. check status
3. lock status
4. read and check status again
5. critical process
6. store status
7. unlock
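
To make the difference concrete, here is a minimal single-threaded sketch
that replays the problematic interleaving by hand: two gateways both read
the status before either takes the lock. It is not rgw code. SharedStatus,
Worker and critical_runs_for_interleaving are hypothetical names, and the
lock is modelled as a plain flag.

```cpp
#include <cassert>

// Simplified stand-in for the sync state kept in the shared status object.
enum class SyncState { BuildingFullSyncMaps, Sync };

// Hypothetical model of the shared status object plus its lock.
struct SharedStatus {
  SyncState state = SyncState::BuildingFullSyncMaps;
  bool locked = false;
  int critical_runs = 0;  // how many times the critical process ran
};

// Hypothetical model of one gateway taking part in the sync.
struct Worker {
  SyncState seen = SyncState::Sync;  // status as last read by this worker

  // Steps 1-2: read and check the shared status, without the lock.
  void read_status(const SharedStatus& s) { seen = s.state; }

  // Remaining steps: lock, optionally re-read, critical process, store, unlock.
  void locked_section(SharedStatus& s, bool recheck) {
    assert(!s.locked);               // the lock itself is exclusive
    s.locked = true;                 // lock status
    if (recheck) seen = s.state;     // read and check status again
    if (seen == SyncState::BuildingFullSyncMaps) {
      ++s.critical_runs;             // critical process
      s.state = SyncState::Sync;     // store status
    }
    s.locked = false;                // unlock
  }
};

// Replays the interleaving in which both workers read before either locks.
int critical_runs_for_interleaving(bool recheck) {
  SharedStatus s;
  Worker a, b;
  a.read_status(s);
  b.read_status(s);
  a.locked_section(s, recheck);
  b.locked_section(s, recheck);
  return s.critical_runs;
}
```

With recheck set to false (the first list of steps) the critical process
runs twice in this interleaving. With recheck set to true (the second
list) it runs once, because the second worker sees the stored status
after it acquires the lock.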

One example, from the metadata sync code:
do {
  r = run(new RGWReadSyncStatusCoroutine(&sync_env, &sync_status));
  if (r < 0 && r != -ENOENT) {
    tn->log(0, SSTR("ERROR: failed to fetch sync status r=" << r));
    return r;
  }

  switch ((rgw_meta_sync_info::SyncState)sync_status.sync_info.state) {
    case rgw_meta_sync_info::StateBuildingFullSyncMaps:
      tn->log(20, "building full sync maps");
      r = run(new RGWFetchAllMetaCR(&sync_env, num_shards,
                                    sync_status.sync_markers, tn));

In addition, the omap keys are not deleted after an entry finishes
syncing in the full_sync process, so full_sync can also run multiple
times under concurrency.
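
As a rough illustration of why the leftover keys matter, the sketch
below models the full sync index as an in-memory set. This is only an
assumption for illustration, not how rgw stores or walks its omap.
PendingIndex, full_sync_pass, entries_synced_by_second_pass and the key
names are all hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the omap index of entries awaiting full sync.
using PendingIndex = std::set<std::string>;

// Runs one full sync pass over the index and returns how many entries
// it processed. When erase_when_done is true, each key is removed once
// its entry has been handled.
int full_sync_pass(PendingIndex& index, bool erase_when_done) {
  int processed = 0;
  for (auto it = index.begin(); it != index.end();) {
    ++processed;  // placeholder for syncing one entry
    if (erase_when_done) {
      it = index.erase(it);  // erase returns the next valid iterator
    } else {
      ++it;
    }
  }
  return processed;
}

// Counts how much work a second pass (for example, a second gateway
// arriving later) finds after a first pass has completed.
int entries_synced_by_second_pass(bool erase_when_done) {
  PendingIndex index = {"bucket:a", "user:b", "bucket.instance:c"};
  full_sync_pass(index, erase_when_done);
  return full_sync_pass(index, erase_when_done);
}
```

In this model, a second pass repeats all three entries when keys are
kept and finds nothing to do when keys are erased as entries complete.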

This has no significant impact on data sync, because bucket sync is
idempotent, but metadata sync is not.



