Hi all,

We made a PR for RGW zone sync using multipart upload, referring to cloud sync.

Link: https://github.com/ceph/ceph/pull/21925

Here is the brief idea:

Why multipart?
1. Breakpoint resume: an interrupted sync can continue from the last completed part.
2. Concurrency, for better performance.

What changed?
With the option rgw_sync_multipart_threshold=0, RGW behaves exactly as before. If the option is set, for example, to 32MB, objects larger than that will be synced in a multipart way.

Implementation:
The entry point for this feature is RGWDefaultDataSyncModule::sync_object(), where a new coroutine called RGWDefaultHandleRemoteObjCR handles the sync logic, similar to RGWAWSHandleRemoteObjCR. This coroutine decides whether multipart sync or atomic sync will be used. Atomic sync calls RGWFetchRemoteObjCR, which works the same way as before. Multipart sync uses a new coroutine called RGWFetchRemoteObjMultipartCR, which executes in 5 steps:
1. Compare mtime/zone_id/pg_ver between the source and destination objects.
2. Init an upload_id, or load it from the breakpoint status info object.
3. Do the part uploads concurrently (fetch the remote object by range, then write to RGW).
4. Complete the multipart upload.
5. Remove the breakpoint status info object.

Some of the coroutines mentioned above are implemented in rgw/rgw_sync_module_default.h/cc, which are similar to rgw/rgw_sync_module_aws.h/cc. The code for steps 2 to 4 is implemented in rgw/rgw_cr_rados.h/cc and rgw/rgw_rados.h/cc. Each step has its own coroutine (abbreviated CR below); the CR sends an async op to RADOS' async thread pool, and the async op calls the newly added RGWRados::xxx methods to do the necessary work.
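To make the threshold and range logic concrete, here is a minimal sketch of the atomic-vs-multipart decision and the splitting of an object into per-part byte ranges for the fetch-by-range step. The names below (use_multipart, split_parts, ByteRange) are illustrative only, not the PR's actual symbols:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One byte range of the source object, fetched and written as one part.
struct ByteRange {
  uint64_t ofs;
  uint64_t len;
};

// rgw_sync_multipart_threshold semantics: 0 disables multipart sync
// entirely (old behavior); otherwise objects strictly larger than the
// threshold are synced in a multipart way.
static bool use_multipart(uint64_t obj_size, uint64_t threshold) {
  return threshold > 0 && obj_size > threshold;
}

// Split the object into part-sized ranges; the last part may be shorter.
static std::vector<ByteRange> split_parts(uint64_t obj_size,
                                          uint64_t part_size) {
  std::vector<ByteRange> parts;
  for (uint64_t ofs = 0; ofs < obj_size; ofs += part_size) {
    parts.push_back({ofs, std::min(part_size, obj_size - ofs)});
  }
  return parts;
}
```

With a 32MB threshold, a 64MB object would go through the multipart path, while threshold 0 always keeps the old atomic path.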
The call stacks look like this:

RGWInitMultipartCR --> RGWAsyncInitMultipart --> RGWRados::init_multipart()
RGWFetchRemoteObjMultipartPartCR --> RGWAsyncFetchRemoteObjMultipartPart --> RGWRados::fetch_remote_obj_multipart_part()
RGWCompleteMultipartCR --> RGWAsyncCompleteMultipart --> RGWRados::complete_multipart()

Unlike atomic sync, whose 'PUT' operation is executed in RGWHTTPManager's single-thread context, multipart sync's 'PUT' operation for each part is executed in an individual thread of the async thread pool. This way, multiple parts can be uploaded concurrently, which alleviates the workload of RGWHTTPManager. To achieve this, a new ReceiveCB callback class called RGWRadosPutObjMultipartPart is introduced. This new callback class copies the data received by RGWHTTPManager to another area, then writes that data to disk in a synchronized way. All of this work is done within a thread of the thread pool, so RGWHTTPManager only acts as a stream pipe, the same as in cloud sync.

PS:
1. The newly added put processor works the same way as RGWPutObjProcessor_Multipart, except that it doesn't require a req_state to initialize.
2. The function rgw_http_req_data::finish() is modified to add a client->signal() call. This is intended to wake up RGWRadosPutObjMultipartPart's wait, because there is no other wake-up mechanism due to the lack of a coroutine context.
3. RGWSimpleRadosRemoveCR is added, which removes the breakpoint status object from BOTH disk and cache. The existing RGWRadosRemoveCR does not clear the cache, and RGWSimpleRadosReadCR reads from the cache first, so this pair of CRs cannot cooperate well. Note that a few other places use this pair of CRs; maybe we should fix those too.

Any advice will be appreciated. Thanks.
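As an appendix, the wait/signal handoff described above (the thread-pool thread copying data received by RGWHTTPManager and writing it out synchronously, with finish() waking up the waiter) can be sketched with a plain condition variable. All names here (PartStream, handle_data, drain) are hypothetical, not the PR's actual symbols:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Sketch of the handoff: the HTTP manager thread only pipes buffers in,
// a pool thread copies them and "writes" them synchronously, and the
// finish() signal wakes the waiter when the stream ends.
class PartStream {
  std::mutex m;
  std::condition_variable cv;
  std::vector<std::string> chunks;
  bool done = false;

public:
  // Called from the HTTP manager thread for each received buffer:
  // copy the data into our own area and signal the pool thread.
  void handle_data(std::string buf) {
    std::lock_guard<std::mutex> l(m);
    chunks.push_back(std::move(buf));
    cv.notify_one();
  }

  // Analogous to the client->signal() added in rgw_http_req_data::finish():
  // wake up the waiting pool thread when the stream is complete.
  void finish() {
    std::lock_guard<std::mutex> l(m);
    done = true;
    cv.notify_one();
  }

  // Runs in a pool thread: block until data or completion, drain the
  // buffered chunks in order, and return once the stream is finished.
  std::string drain() {
    std::string out;
    std::unique_lock<std::mutex> l(m);
    for (;;) {
      cv.wait(l, [&] { return !chunks.empty() || done; });
      for (auto& c : chunks) out += c;  // synchronous "write"
      chunks.clear();
      if (done) return out;
    }
  }
};
```

In the real code the pool thread would write each drained buffer through the put processor instead of concatenating it, but the wait/notify shape is the same.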