Re: rgw object versioning review

Hi Yehuda,
It's great to see object versioning support in Ceph.

Thanks
Swami

On Tue, Dec 16, 2014 at 12:56 PM, Yehuda Sadeh <yehuda@xxxxxxxxxx> wrote:
> I squashed the commits that I originally marked for squashing, and
> everything is pushed into wip-rgw-versioning-2.
> Following is a rough breakdown of the different development phases
> and what I (think I) achieved in each one. Hopefully it will give you
> some hint of the direction when reviewing. Note that I'm still missing
> some multi-zone related changes, and there are a couple of known
> regressions, but that shouldn't stop you.
>
> See http://wiki.ceph.com/Development/RGW_Object_Versioning for design doc.
>
>
> 1. Initial work
>
> In this initial phase, the idea was to add a version (or instance)
> identifier to objects. This allows creating multiple objects with the
> same name, but different instance ids. Objects can be accessed
> directly by using their names + version. The bucket index holds
> entries by name + instance. Versioned objects here are just regular
> objects that are named differently.
>
> 5a95240 - rgw: add versioning_enabled field to bucket info
> d4aa2ae - rgw: get bucket versioning status op
> 4d7ffbd - rgw: restful op to set bucket versioning
> 38275cb - rgw: enable s3 get/set versioning ops
> 655ae55 - rgw, cls_rgw: add accounted_size for object metadata entry
> ae13500 - rgw: remove unused code
> 727b9e8 - rgw: decouple object name from index representation
> ef3982f - rgw: rename rgw_obj::key to rgw_obj::loc
> d603ddf - rgw, cls_rgw: various datastructures use new rgw_obj_key
> c44e574 - rgw: rename cls_rgw_obj::key to cls_rgw_obj::loc
> e3dc560 - cls_rgw: change data structures to keep single object key structure
> 3e9177e - rgw: adapt to new objclass interface
> ad9f460 - radosgw-admin: adapt to new interfaces
> e81ef5f - test: cls_rgw fixes
> 3bfa127 - rgw: clean up some locator use
> d242339 - radosgw-admin: some commands use object_version param
> c97a05a - rgw: generate random instance id
> 162c53b - rgw: interface adjustment following a rebase
> 8ecef67 - rgw: remove old unused code
>
> 2. OLH
>
> Started olh logic implementation. Adding new cls/rgw calls:
>  - link olh
>  - unlink instance
>  - read olh log
>  - trim olh log
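To make the four calls above concrete, here is a minimal toy model in Python. It is only a sketch: the class name, method names, and log entry fields are hypothetical and are not the cls_rgw interface. It assumes only what this message states, namely that the olh carries a log and a current epoch and that the epoch is incremented whenever the log is updated.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OlhLogEntry:
    epoch: int      # olh epoch at which the change was logged
    op: str         # "link" or "unlink" (hypothetical op names)
    instance: str   # instance id the change refers to


@dataclass
class ToyOlh:
    """Toy stand-in for an olh entry: a current epoch plus a change log."""
    epoch: int = 0
    instances: List[str] = field(default_factory=list)  # oldest first
    log: List[OlhLogEntry] = field(default_factory=list)

    @property
    def head(self) -> Optional[str]:
        # The most recently linked instance that is still present.
        return self.instances[-1] if self.instances else None

    def _record(self, op: str, instance: str) -> int:
        # Every log update bumps the epoch.
        self.epoch += 1
        self.log.append(OlhLogEntry(self.epoch, op, instance))
        return self.epoch

    def link_olh(self, instance: str) -> int:
        self.instances.append(instance)
        return self._record("link", instance)

    def unlink_instance(self, instance: str) -> int:
        self.instances.remove(instance)
        return self._record("unlink", instance)

    def read_olh_log(self, after_epoch: int = 0) -> List[OlhLogEntry]:
        # Return entries newer than the given epoch.
        return [e for e in self.log if e.epoch > after_epoch]

    def trim_olh_log(self, up_to_epoch: int) -> None:
        # Drop entries that have already been applied.
        self.log = [e for e in self.log if e.epoch > up_to_epoch]


olh = ToyOlh()
olh.link_olh("v1")
olh.link_olh("v2")
olh.unlink_instance("v2")  # the head falls back to "v1"
```

In this toy, a reader would apply the entries returned by read_olh_log() and then call trim_olh_log() with the last epoch it applied.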
>
> The new bucket index representation took a couple of iterations to settle.
> A major issue discovered mid-process was that versioned objects
> need to be sorted from newest to oldest when listing a bucket.
> We now have 3 different kinds of entries in the bucket index:
>  - plain
>  - instance
>  - olh
>
> Plain entries are entries that represent the objects' listing order.
> For versioned objects, these entries are named as follows (non-versioned
> objects are treated as before):
> <name> \0 <decreasing_str(olh_epoch)> \0 <instance id>
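The key layout above can be illustrated with a short Python sketch showing why a decreasing encoding of the epoch yields newest-to-oldest listing order under a plain lexicographic sort. The encoding used here (a zero-padded complement of the epoch) is an assumption chosen for illustration and is not necessarily what cls_rgw does; the function names and the sample object name are hypothetical as well.

```python
MAX_EPOCH = 2**64 - 1
WIDTH = len(str(MAX_EPOCH))  # fixed width so string order matches numeric order


def decreasing_str(epoch: int) -> str:
    """Hypothetical encoding: a larger epoch gives a lexicographically smaller string."""
    if not 0 <= epoch <= MAX_EPOCH:
        raise ValueError("epoch out of range")
    return str(MAX_EPOCH - epoch).zfill(WIDTH)


def plain_key(name: str, olh_epoch: int, instance_id: str) -> str:
    # <name> \0 <decreasing_str(olh_epoch)> \0 <instance id>
    # Assumes the name itself contains no NUL byte.
    return "\0".join((name, decreasing_str(olh_epoch), instance_id))


# Three versions of one object, inserted out of order.
versions = [(1, "aaa"), (3, "ccc"), (2, "bbb")]
keys = sorted(plain_key("photo.jpg", epoch, inst) for epoch, inst in versions)

# Sorting the keys lists the instance with the highest epoch first.
newest_first = [k.split("\0")[2] for k in keys]
```

Because all keys for one object share the same name prefix, the versions of an object stay adjacent in the index and only the epoch component decides their relative order.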
>
> The instance entries reside in a different namespace, and objects are
> indexed there by their name and their version id. Thus, in order to
> get to the listing entries, we need to first read the instance entry.
>
> The olh entry contains the olh log, and the current olh epoch.
>
>
> Extra complexity: handling null-versioned objects (objects that are
> created on buckets with suspended versioning). Main changes: added a
> new entry to the bucket index for each versioned object to mark it as
> versioned (indexed only by its name). Every regular object that is
> overwritten is converted to a versioned object.
> Also, needed to co-locate olh and data objects, since objects can be
> versioned, but also have a 'null' version that needs to match the
> non-versioned case. So, an olh can point at itself, and when removing
> an object, we make sure that we don't remove the olh.
>
>
>
> 27a9408 - rgw: initial olh implementation
> 105c4e0 - rgw: gen rand lower alphanumberic string
> f6a21cd - rgw: adjust return code when generating random strings
> e65a1fa - rgw: gen rand lowercase string (stl string version)
> cc55d9e - rgw: init olh tag
> d01b4c7 - rgw: some code cleanup
> 8ae3c3c - rgw: obj_stat() follows on olh
> 932e3b1 - cls_rgw: prepare groundwork for olh
> 4211f28 - cls_rgw: encode / decode obj and list index keys
> 3ed6627 - cls_rgw: bucket index link olh
> fc2f127 - cls_rgw: object instance olh linking
> 4ac79e5 - rgw: bucket index link olh interface
> 56185d6 - rgw: new api to retrieve olh log
> 9f8f158 - rgw: implement rgw_bucket_olh_log_entry::dump()
> c92a332 - rgw-admin: add olh readlog command
> 717e7ec - cls_rgw: olh init op
> 946611c - rgw: apply olh log functionality
> d6d2f58 - rgw, cls_rgw: trim olh log functionality
> 4f9afa7 - rgw: olh atomicity groundwork
> 701f65c - rgw: guard against racing writes
> c1d57e3 - rgw: more atomicity fixes, set_olh()
> 4ffa2dd - rgw: tie set_olh() to object completion
> 019c226 - cls_rgw: olh trim op is read/write
> fc35420 - rgw: follow olh if needed
> b78481b - rgw: update json encoding for rgw_obj
> 88c2f1f - rgw: object manifest should reflect instance
> a5fea1d - rgw: add 'versioning', and 'versions' to handled subresources
> eadb243 - rgw: add get_type() to rgw ops
> 28ee25c - cls_rgw: revise the data model
> f57cc4a - rgw: bucket listing gets extra param for versioning
> 7a010c5 - rgw, cls_rgw: list object versions is optional
> 1892aaf - cls_rgw: deletion marker needs to keep instance entry
> 677c6f9 - rgw: propagate dirent flags to rgw (from cls), other fixes
> d123836 - rgw: restful api now dumps versions
> 64a66b5 - rgw: cleanup, get rid of req_state::object
> 77cdb69 - rgw: request state and various op functionality use rgw_obj_key
> f9ae1e8 - rgw: fix rgw_obj initialization
> c7cc445 - cls_rgw: update the appropriate prev key entry
> b18d36e - rgw, cls_rgw: cls_bucket_list returns raw key in map
> 704425b - rgw: add support for version-id-marker
> d5d4347 - rgw: bucket versioning status is tri-state
> 0fd49fe - rgw: initial versioned object removal implementation
> 3fb2177 - rgw, cls_rgw: don't remove olh objects
> 5962d5e - cls_rgw: allow olh linking to null instance objects
> fed201f - rgw: set olh if object has been versioned
>
> 3. Cleanup!
>
> At this point it was obvious that the code was in dire need of a
> cleanup: limiting the number of different states, moving
> certain object operations to RGWRados subclasses, and separating data
> objects from system objects.
>
> 0998856 - rgw: move RGWRadosCtx into RGWRados
> f10469e - rgw: s/RGWRadosCtx/ObjectCtx
> b0bcedf - rgw: start reorganizing RGWRados
> 46774c2 - rgw: remove plain object processor
> 900c89a - rgw: pass around object context refrences, remove unused code
> bff2d83 - rgw: don't use put_system_obj() for data objects
> 2620bbe - rgw: get rid of put_obj_meta(), replace with put_system_obj()
> 172cceb - rgw: remove old index update calls
> 853c937 - rgw: remove unused code
> ed0076f - rgw: switch RGWRados::delete_obj() to new interface
> 32142fd - rgw: fix missing state initalization
> 1dd190b - rgw: remove more unused code
> 54a426b - rgw: rework prepare_get_obj(), get_obj()
> d485428 - rgw: change RGWRados::get_attr()
> e1041ed - rgw: clean up system obj interfaces
> 8114787 - rgw: s/RGWRados::ObjectCtx/RGWObjectCtx
> f0fa071 - rgw: adjust to new interfaces
> 0799f62 - rgw: purge intent log
> d291c08 - rgw: remove unused code
> ae33ad7 - rgw: switch get_obj_iterate() to new interface
> 8af2bcf - rgw: convert RGWRados::get_attr() to new interface
>
> and some cls_rgw cleanups:
>
> dd87374 - cls_rgw: reorganize rgw_bucket_link_olh()
> 956108d - cls_rgw: more cleanup
> 4302041 - cls_rgw: more cleanup
>
> 4. Back to versioning work
>
> More internal work, as described in (2).
>
> Also, we now have new radosgw-admin commands to list and set raw
> bucket index entries. This is really helpful in debugging issues
> related to bucket index versioning.
>
>
> 28e43ca - cls_rgw: update olh log when unlinking entry
> 7232a92 - cls_rgw: unlink object instance
>
> 518493b - rgw: unlink obj instance
> 136740c - rgw: follow olh where needed
> 56b0e6b - cls_rgw: keep null-versioned object as versioned object
> 6751908 - rgw-admin, cls_rgw: add bi_get objclass operation
> 061313d - common, rgw: json escaping gets input buf size
> 358fc98 - cls_rgw: add missing flags encoding to rgw_bucket_dir_entry::dump()
> 5d41b86 - cls_rgw, rgw-admin: move bi_get() entry encoding to cls
> 72fdef2 - cls_rgw, rgw-admin: create bi list operation
> efa541f - rgw, cls_rgw: add bi put
>
> 5. Fixes, adjustments, complete missing implementation
>
> Miscellaneous fixes and remaining missing implementation.
>
> 16d5e06 - osd: fix filter_prefix scoping in omap_get_vals
> 04eeb7c - formatter: no need for dynamic allocation
> 662d805 - rgw: send "null" version id if needed
> d03c562 - rgw, cls_rgw: multiple changes related to obj removal
> 3638bdb - rgw: propagate object owner and mtime for deletion marker
> 16f2bd3 - rgw: adjust versioning enable/suspend api
> 4d3b6e3 - rgw: fix access to object through the null instance
> 520b0c7 - cls_rgw: inc olh epoch when updating log
> 9c329cc - rgw, cls_rgw: fix update of olh to reflect non existing object
> b6c0c12 - cls_rgw: add missing cls_cxx_create()
> c127068 - rgw: add dump_string_header()
> 0eacb86 - rgw: send x-amz-version-id and x-amz-delete_marker header fields
> 2481439 - cls_rgw: remove instance entry when removing delete marker
> 21dd843 - rgw: encode timestamp in pending olh info
> acec1c8 - rgw, cls_rgw: improve olh atomicity
> 2dae922 - cls_rgw: guard certain operations using olh tag
> 5d423c8 - cls_rgw: implement dump() and generate test instances
> dff4cae - cls_rgw: clean up compilation warnings
> 82766fa - rgw: remove unused code
> 8acd45b - rgw: remove warning
> 2fed1f5 - rgw: read bucket owner when following olh if pending entries
> 3d0b506 - cls_rgw: revise null object instance handling, versioned epoch
> 6f4d924 - cls_rgw: don't write list entry when converting when deleting
> f319b93 - rgw: time out pending olh entries
> 9e0f7a1 - rgw: Object::Read::read() returns total bytes read
> 20c61a8 - rgw: Object::Read operations should use state->obj
> 6be07f4 - rgw: reduce use of Object::get_obj()
> 14e1ec6 - rgw: parse copy location version id
> 750f4d7 - cls_rgw, rgw: pending_log can hold multiple entries per epoch
> a3a45cb - cls_rgw: link, unlink olh ops can get epoch
> 1266c59 - rgw, cls_rgw: provide optional version id, versioned epoch to olh ops
> 31695db - rgw: cleaup RGWRados::copy_obj()
> 4e790b8 - rgw: propagate version id when putting obj
> bd3738a - rgw: copy obj does versioning too
> 12dc4e1 - rgw: move versioning handling to Object::Write::write_meta()
> d2e9d4e - rgw: fix a few regressions
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html