rgw-multisite: add multipart sync for rgw zones

Hi all,

We made a PR for RGW zone sync using multipart upload, modeled on cloud
sync.
Link: https://github.com/ceph/ceph/pull/21925
Here is the brief idea:

Why multipart?
   1. breakpoint resume: an interrupted sync can continue where it stopped
instead of restarting from scratch.
   2. concurrency for better performance.

What changed?
   With the option rgw_sync_multipart_threshold=0, RGW behaves as before.
If this option is set, for example, to 32MB, objects larger than that are
synced in multipart fashion.
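   For example, in ceph.conf (option name as in this PR; the value is in
bytes, and 0 keeps the old single-PUT behavior):

   [global]
   rgw_sync_multipart_threshold = 33554432   # 32MB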

Implementation:
   The entry point for this feature is
RGWDefaultDataSyncModule::sync_object(), where a new coroutine called
RGWDefaultHandleRemoteObjCR handles the sync logic, similar to
RGWAWSHandleRemoteObjCR. This coroutine decides whether multipart sync or
atomic sync is used. Atomic sync calls RGWFetchRemoteObjCR, which works the
same way as before.
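   As a rough standalone sketch of that decision (the types below are
simplified stand-ins, not the PR's actual coroutine plumbing):

   #include <cstdint>

   // Hypothetical stand-in; the real coroutine reads the threshold from
   // the configured rgw_sync_multipart_threshold option.
   struct SyncEnv {
     uint64_t multipart_threshold;  // 0 = multipart sync disabled
   };

   enum class SyncMode { Atomic, Multipart };

   // Pick the sync path for one remote object based on its size.
   SyncMode choose_sync_mode(const SyncEnv& env, uint64_t obj_size) {
     if (env.multipart_threshold == 0 || obj_size <= env.multipart_threshold) {
       return SyncMode::Atomic;     // old path: RGWFetchRemoteObjCR
     }
     return SyncMode::Multipart;    // new path: RGWFetchRemoteObjMultipartCR
   }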

   For multipart sync, a coroutine called RGWFetchRemoteObjMultipartCR is
used. It executes in 5 steps (sketched after this list):
   1. compare mtime/zone_id/pg_ver between the source and destination
objects.
   2. init an upload_id, or load it from the breakpoint status info object.
   3. do the part uploads (fetch remote by range, then write to RGW)
concurrently.
   4. complete the multipart upload.
   5. remove the breakpoint status info object.
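   A minimal standalone sketch of steps 2 to 5 (the helper functions are
hypothetical stand-ins for the CRs listed further below, and std::async
replaces the real coroutine spawning):

   #include <algorithm>
   #include <cstdint>
   #include <future>
   #include <string>
   #include <vector>

   static std::string init_or_load_upload_id() { return "upload-id"; } // step 2
   static int fetch_and_write_part(std::string upload_id, int part_num,
                                   uint64_t ofs, uint64_t len) {       // step 3
     return 0;
   }
   static int complete_upload(const std::string& upload_id) { return 0; } // step 4
   static void remove_breakpoint_status_obj() {}                          // step 5

   // Assumes part_size > 0; step 1 (the mtime/zone_id/pg_ver check) elided.
   int multipart_sync(uint64_t obj_size, uint64_t part_size) {
     const std::string upload_id = init_or_load_upload_id();
     std::vector<std::future<int>> parts;
     int part_num = 1;
     for (uint64_t ofs = 0; ofs < obj_size; ofs += part_size, ++part_num) {
       const uint64_t len = std::min(part_size, obj_size - ofs);
       // Each part is fetched by HTTP range request and written locally;
       // several parts are in flight at once.
       parts.push_back(std::async(std::launch::async, fetch_and_write_part,
                                  upload_id, part_num, ofs, len));
     }
     for (auto& f : parts) {
       int r = f.get();
       if (r < 0) {
         return r;  // keep the breakpoint status obj so a retry can resume
       }
     }
     int r = complete_upload(upload_id);
     if (r < 0) {
       return r;
     }
     remove_breakpoint_status_obj();
     return 0;
   }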

   Some of the coroutines mentioned above are implemented in the
rgw/rgw_sync_module_default.h/.cc files, which mirror the
rgw/rgw_sync_module_aws.h/.cc files.

   The code for steps 2 to 4 lives in the rgw/rgw_cr_rados.h/.cc and
rgw/rgw_rados.h/.cc files. Each step has its own coroutine (abbreviated CR
below); the CR sends an async op to the RADOS async thread pool, and the
async op calls the newly added RGWRados::xxx methods to do the actual work.
The call stacks look like this:
   RGWInitMultipartCR --> RGWAsyncInitMultipart --> RGWRados::init_multipart()
   RGWFetchRemoteObjMultipartPartCR --> RGWAsyncFetchRemoteObjMultipartPart --> RGWRados::fetch_remote_obj_multipart_part()
   RGWCompleteMultipartCR --> RGWAsyncCompleteMultipart --> RGWRados::complete_multipart()
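   In standalone form, the pattern behind each of those stacks looks
roughly like this (RGWRadosStub and InitMultipartCR are simplified
stand-ins; std::async replaces the real async thread pool):

   #include <future>

   struct RGWRadosStub {
     // Placeholder for the newly added RGWRados::init_multipart().
     int init_multipart() { return 0; }
   };

   struct InitMultipartCR {
     RGWRadosStub* store = nullptr;
     std::future<int> pending;

     // send_request(): hand the blocking RADOS work to a worker thread so
     // the coroutine manager's thread never blocks on I/O (the real code
     // enqueues an RGWAsyncInitMultipart on the async thread pool).
     void send_request() {
       pending = std::async(std::launch::async,
                            [s = store] { return s->init_multipart(); });
     }

     // request_complete(): collect the async op's result once it is done.
     int request_complete() { return pending.get(); }
   };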

   Unlike atomic sync, whose 'PUT' operation executes in RGWHTTPManager's
single-threaded context, multipart sync's 'PUT' for each part executes in
an individual thread from the async thread pool. This way, multiple parts
can be uploaded concurrently, which also relieves the load on
RGWHTTPManager. To achieve this, a new receive-callback class called
RGWRadosPutObjMultipartPart is introduced. This callback class copies the
data received by RGWHTTPManager into another buffer and then writes that
data to disk synchronously; all of this happens within a thread-pool
thread, so RGWHTTPManager only acts as a stream pipe, the same as in cloud
sync.
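   Roughly, the callback shape looks like the following standalone sketch
(simplified stand-ins; this is not the PR's actual
RGWRadosPutObjMultipartPart, and write_part is a hypothetical helper):

   #include <condition_variable>
   #include <cstddef>
   #include <deque>
   #include <mutex>
   #include <string>

   class PutPartCB {
     std::mutex m;
     std::condition_variable cv;
     std::deque<std::string> chunks;  // data copied off the HTTP thread
     bool done = false;

    public:
     // Called from RGWHTTPManager's thread: copy and return immediately,
     // so the HTTP manager stays a pure stream pipe.
     void handle_data(const char* data, size_t len) {
       {
         std::lock_guard<std::mutex> l(m);
         chunks.emplace_back(data, len);
       }
       cv.notify_one();
     }

     // Called when the HTTP request completes; this plays the role of the
     // client->signal() call mentioned in the PS below.
     void finish() {
       {
         std::lock_guard<std::mutex> l(m);
         done = true;
       }
       cv.notify_one();
     }

     // Called from the async pool thread: block until data arrives, then
     // write it out synchronously on this thread.
     template <typename WriteFn>
     void drain(WriteFn write_part) {
       std::unique_lock<std::mutex> l(m);
       for (;;) {
         cv.wait(l, [&] { return !chunks.empty() || done; });
         while (!chunks.empty()) {
           std::string c = std::move(chunks.front());
           chunks.pop_front();
           l.unlock();
           write_part(c);  // the blocking disk write, off the HTTP thread
           l.lock();
         }
         if (done && chunks.empty()) {
           break;
         }
       }
     }
   };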

PS:
1. The newly added put processor works the same way as
RGWPutObjProcessor_Multipart, except that it does not require a req_state
to initialize.
2. The function rgw_http_req_data::finish() is modified to add a
client->signal() call. This wakes up RGWRadosPutObjMultipartPart's wait;
without a coroutine context there is no other wake-up mechanism (compare
the finish() in the sketch above).
3. RGWSimpleRadosRemoveCR is added; it removes the breakpoint status object
from BOTH disk and cache. The previously used RGWRadosRemoveCR does not
clear the cache, while RGWSimpleReadCR reads the cache first, so that pair
of CRs cannot cooperate correctly (see the sketch after this list). Note
that a few other places still use that pair of CRs; maybe we should fix
those as well.
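To illustrate point 3, here is a tiny standalone sketch of the stale-cache
problem (FakeStore and its methods are hypothetical stand-ins, not the real
CRs):

   #include <map>
   #include <set>
   #include <string>

   struct FakeStore {
     std::set<std::string> disk;
     std::map<std::string, std::string> cache;

     // RGWRadosRemoveCR-like behavior: removes from disk only, so a stale
     // copy can survive in the cache.
     void remove_disk_only(const std::string& oid) { disk.erase(oid); }

     // RGWSimpleRadosRemoveCR-like behavior: removes from BOTH disk and
     // cache.
     void remove_everywhere(const std::string& oid) {
       disk.erase(oid);
       cache.erase(oid);
     }

     // RGWSimpleReadCR-like behavior: checks the cache first, then disk.
     // After remove_disk_only(), this still "sees" the deleted object.
     bool exists(const std::string& oid) {
       if (cache.count(oid)) {
         return true;  // stale hit if the cache was never invalidated
       }
       return disk.count(oid) > 0;
     }
   };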

Any advice will be appreciated. Thanks.