rgw object versioning review

I squashed the commits that I originally marked for squashing, and
everything is pushed into wip-rgw-versioning-2.
Following is a rough breakdown of the different development phases
and what I (think I) achieved in each one; hopefully it will give you
some hint of the direction when reviewing. Note that I'm still missing
some multi-zone related changes, and there are a couple of known
regressions, but that shouldn't stop you.

See http://wiki.ceph.com/Development/RGW_Object_Versioning for design doc.


1. Initial work

In this initial phase, the idea was to add a version (or instance)
identifier to objects. This allows creating multiple objects with the
same name, but different instance ids. Objects can be accessed
directly by using their names + version. The bucket index holds
entries by name + instance. Versioned objects here are just regular
objects that are named differently.
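To make the name + instance idea concrete, here is a minimal Python sketch (not the actual rgw code). ObjKey, gen_instance_id() and BucketIndex are hypothetical names made up for illustration, and the 32-character id length is an arbitrary assumption.

```python
import secrets
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjKey:
    """Hypothetical stand-in for an object key: a name plus an instance id."""
    name: str
    instance: str = ""  # empty string models a non-versioned object


def gen_instance_id(length: int = 32) -> str:
    """Random lowercase alphanumeric id; the length is an arbitrary choice."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


class BucketIndex:
    """Toy index holding one entry per (name, instance) pair."""

    def __init__(self):
        self.entries = {}  # ObjKey -> payload

    def put(self, name, data):
        # Every write gets a fresh instance id, so names may repeat.
        key = ObjKey(name, gen_instance_id())
        self.entries[key] = data
        return key

    def get(self, name, instance):
        # Direct access by name + version.
        return self.entries[ObjKey(name, instance)]
```

Two put() calls with the same name produce two independent entries, each reachable through its own instance id.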

5a95240 - rgw: add versioning_enabled field to bucket info
d4aa2ae - rgw: get bucket versioning status op
4d7ffbd - rgw: restful op to set bucket versioning
38275cb - rgw: enable s3 get/set versioning ops
655ae55 - rgw, cls_rgw: add accounted_size for object metadata entry
ae13500 - rgw: remove unused code
727b9e8 - rgw: decouple object name from index representation
ef3982f - rgw: rename rgw_obj::key to rgw_obj::loc
d603ddf - rgw, cls_rgw: various datastructures use new rgw_obj_key
c44e574 - rgw: rename cls_rgw_obj::key to cls_rgw_obj::loc
e3dc560 - cls_rgw: change data structures to keep single object key structure
3e9177e - rgw: adapt to new objclass interface
ad9f460 - radosgw-admin: adapt to new interfaces
e81ef5f - test: cls_rgw fixes
3bfa127 - rgw: clean up some locator use
d242339 - radosgw-admin: some commands use object_version param
c97a05a - rgw: generate random instance id
162c53b - rgw: interface adjustment following a rebase
8ecef67 - rgw: remove old unused code

2. OLH

Started olh logic implementation. Adding new cls/rgw calls:
 - link olh
 - unlink instance
 - read olh log
 - trim olh log

The new bucket index representation took a couple of iterations to settle.
A major issue discovered mid-process was that versioned objects
need to be sorted from newest to oldest when listing a bucket.
We now have 3 different kinds of entries in the bucket index:
 - plain
 - instance
 - olh

Plain entries are entries that represent the objects' listing order.
These entries are named as follows (for versioned objects; non-versioned
objects are treated as before):
<name> \0 <decreasing_str(olh_epoch)> \0 <instance id>
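A rough Python sketch of why this naming yields newest-to-oldest listing order. The decreasing_str() encoding below (a fixed-width, zero-padded inverted epoch with a 64-bit bound) is an assumption chosen for illustration; the real encoding in cls_rgw may differ.

```python
MAX_EPOCH = 2**64 - 1          # assumed upper bound on olh_epoch
WIDTH = len(str(MAX_EPOCH))    # 20 digits, so every encoding has equal length


def decreasing_str(epoch: int) -> str:
    """Encode epoch so that larger epochs sort lexicographically first."""
    if not 0 <= epoch <= MAX_EPOCH:
        raise ValueError("epoch out of range")
    return str(MAX_EPOCH - epoch).zfill(WIDTH)


def plain_key(name: str, olh_epoch: int, instance: str) -> str:
    """Build <name> \\0 <decreasing_str(olh_epoch)> \\0 <instance id>."""
    return "\0".join((name, decreasing_str(olh_epoch), instance))


# Keys written at epochs 1, 2, 3 come back newest first when sorted,
# which is what an ordered key/value listing would return.
keys = sorted(plain_key("photo.jpg", epoch, f"inst{epoch}") for epoch in (1, 2, 3))
listing = [key.split("\0")[2] for key in keys]
```

Since all encoded epochs have the same width, plain string comparison matches numeric comparison of the inverted values, so the listing needs no extra sort step.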

The instance entries reside in a different namespace, and objects are
indexed there by their name and their version id. Thus, in order to
get to the listing entries, we need to first read the instance entry.

The olh entry contains the olh log, and the current olh epoch.
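The interplay between the three entry kinds and the four olh calls can be modelled roughly as follows. This is a simplified in-memory sketch, not the cls_rgw implementation: ToyBucketIndex, its method names, the log tuple layout, and the listing key encoding are all assumptions.

```python
class ToyBucketIndex:
    def __init__(self):
        self.plain = {}     # listing key -> instance id
        self.instance = {}  # (name, instance id) -> epoch it was linked at
        self.olh = {}       # name -> {"epoch": int, "log": [(epoch, op, instance)]}

    @staticmethod
    def _listing_key(name, epoch, instance):
        # Assumed encoding: fixed-width inverted epoch, so newer sorts first.
        return "\0".join((name, str(2**64 - 1 - epoch).zfill(20), instance))

    def _bump(self, name, op, instance):
        # Every change advances the olh epoch and is recorded in the olh log.
        olh = self.olh.setdefault(name, {"epoch": 0, "log": []})
        olh["epoch"] += 1
        olh["log"].append((olh["epoch"], op, instance))
        return olh["epoch"]

    def link_olh(self, name, instance):
        epoch = self._bump(name, "link", instance)
        self.instance[(name, instance)] = epoch
        self.plain[self._listing_key(name, epoch, instance)] = instance
        return epoch

    def unlink_instance(self, name, instance):
        # The listing key embeds the epoch, so read the instance entry first.
        epoch = self.instance.pop((name, instance))
        del self.plain[self._listing_key(name, epoch, instance)]
        self._bump(name, "unlink", instance)

    def read_olh_log(self, name, after_epoch=0):
        return [e for e in self.olh[name]["log"] if e[0] > after_epoch]

    def trim_olh_log(self, name, up_to_epoch):
        olh = self.olh[name]
        olh["log"] = [e for e in olh["log"] if e[0] > up_to_epoch]

    def list_versions(self, name):
        prefix = name + "\0"
        return [v for k, v in sorted(self.plain.items()) if k.startswith(prefix)]
```

Note how unlink_instance() cannot build the listing key from the name and version id alone; it has to go through the instance entry to learn the epoch.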


Extra complexity: handling null-versioned objects (objects that are
created on buckets with suspended versioning). Main changes: added a
new entry to the bucket index for each versioned object to mark it as
versioned (indexed only by its name). Every regular object that is
overwritten is converted to a versioned object.
Also, needed to co-locate olh and data objects, since objects can be
versioned, but also have a 'null' version that needs to match the
non-versioned case. So, an olh can point at itself, and when removing
an object, we make sure that we don't remove the olh.
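A toy Python model of the null-version rules described above, assuming a bucket where versioning is enabled or suspended. ToyStore, the "null" label, and the rule for repointing the olh after a removal are illustrative assumptions, not the actual rgw logic.

```python
NULL_INSTANCE = "null"  # label assumed here for the null version


class ToyStore:
    def __init__(self):
        self.olh = {}           # name -> instance id the olh points at (or None)
        self.versions = {}      # name -> {instance id: payload}, oldest first
        self.versioned = set()  # name-only markers for objects that became versioned

    def put(self, name, payload, instance=NULL_INSTANCE):
        versions = self.versions.setdefault(name, {})
        if versions:
            # Overwriting an existing regular object converts it to a versioned one.
            self.versioned.add(name)
        versions.pop(instance, None)   # a rewritten null version replaces the old one
        versions[instance] = payload   # re-insert so it becomes the newest
        self.olh[name] = instance

    def get(self, name, instance=None):
        # Without an explicit version, follow the olh to the current instance.
        target = self.olh[name] if instance is None else instance
        return self.versions[name][target]

    def remove(self, name, instance):
        versions = self.versions[name]
        del versions[instance]
        # The olh entry is kept even when the data it pointed at is removed;
        # here it is repointed at the newest remaining version, or at nothing.
        if self.olh[name] == instance:
            self.olh[name] = next(reversed(versions), None)
```

In this model the first plain write leaves the olh pointing at the null instance (the "olh points at itself" case), and removing every version still leaves the olh entry behind.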



27a9408 - rgw: initial olh implementation
105c4e0 - rgw: gen rand lower alphanumberic string
f6a21cd - rgw: adjust return code when generating random strings
e65a1fa - rgw: gen rand lowercase string (stl string version)
cc55d9e - rgw: init olh tag
d01b4c7 - rgw: some code cleanup
8ae3c3c - rgw: obj_stat() follows on olh
932e3b1 - cls_rgw: prepare groundwork for olh
4211f28 - cls_rgw: encode / decode obj and list index keys
3ed6627 - cls_rgw: bucket index link olh
fc2f127 - cls_rgw: object instance olh linking
4ac79e5 - rgw: bucket index link olh interface
56185d6 - rgw: new api to retrieve olh log
9f8f158 - rgw: implement rgw_bucket_olh_log_entry::dump()
c92a332 - rgw-admin: add olh readlog command
717e7ec - cls_rgw: olh init op
946611c - rgw: apply olh log functionality
d6d2f58 - rgw, cls_rgw: trim olh log functionality
4f9afa7 - rgw: olh atomicity groundwork
701f65c - rgw: guard against racing writes
c1d57e3 - rgw: more atomicity fixes, set_olh()
4ffa2dd - rgw: tie set_olh() to object completion
019c226 - cls_rgw: olh trim op is read/write
fc35420 - rgw: follow olh if needed
b78481b - rgw: update json encoding for rgw_obj
88c2f1f - rgw: object manifest should reflect instance
a5fea1d - rgw: add 'versioning', and 'versions' to handled subresources
eadb243 - rgw: add get_type() to rgw ops
28ee25c - cls_rgw: revise the data model
f57cc4a - rgw: bucket listing gets extra param for versioning
7a010c5 - rgw, cls_rgw: list object versions is optional
1892aaf - cls_rgw: deletion marker needs to keep instance entry
677c6f9 - rgw: propagate dirent flags to rgw (from cls), other fixes
d123836 - rgw: restful api now dumps versions
64a66b5 - rgw: cleanup, get rid of req_state::object
77cdb69 - rgw: request state and various op functionality use rgw_obj_key
f9ae1e8 - rgw: fix rgw_obj initialization
c7cc445 - cls_rgw: update the appropriate prev key entry
b18d36e - rgw, cls_rgw: cls_bucket_list returns raw key in map
704425b - rgw: add support for version-id-marker
d5d4347 - rgw: bucket versioning status is tri-state
0fd49fe - rgw: initial versioned object removal implementation
3fb2177 - rgw, cls_rgw: don't remove olh objects
5962d5e - cls_rgw: allow olh linking to null instance objects
fed201f - rgw: set olh if object has been versioned

3. Cleanup!

At this point it was obvious that the code was in dire need of a
cleanup. Trying to limit the amount of different states. Moving
certain object operations to RGWRados subclasses. Separating data
objects and system objects.

0998856 - rgw: move RGWRadosCtx into RGWRados
f10469e - rgw: s/RGWRadosCtx/ObjectCtx
b0bcedf - rgw: start reorganizing RGWRados
46774c2 - rgw: remove plain object processor
900c89a - rgw: pass around object context refrences, remove unused code
bff2d83 - rgw: don't use put_system_obj() for data objects
2620bbe - rgw: get rid of put_obj_meta(), replace with put_system_obj()
172cceb - rgw: remove old index update calls
853c937 - rgw: remove unused code
ed0076f - rgw: switch RGWRados::delete_obj() to new interface
32142fd - rgw: fix missing state initalization
1dd190b - rgw: remove more unused code
54a426b - rgw: rework prepare_get_obj(), get_obj()
d485428 - rgw: change RGWRados::get_attr()
e1041ed - rgw: clean up system obj interfaces
8114787 - rgw: s/RGWRados::ObjectCtx/RGWObjectCtx
f0fa071 - rgw: adjust to new interfaces
0799f62 - rgw: purge intent log
d291c08 - rgw: remove unused code
ae33ad7 - rgw: switch get_obj_iterate() to new interface
8af2bcf - rgw: convert RGWRados::get_attr() to new interface

and some cls_rgw cleanups:

dd87374 - cls_rgw: reorganize rgw_bucket_link_olh()
956108d - cls_rgw: more cleanup
4302041 - cls_rgw: more cleanup

4. Back to versioning work

More internal work, as described in (2).

Also, we now have new radosgw-admin commands to list and set raw
bucket index entries. This is really helpful in debugging issues
related to bucket index versioning.


28e43ca - cls_rgw: update olh log when unlinking entry
7232a92 - cls_rgw: unlink object instance

518493b - rgw: unlink obj instance
136740c - rgw: follow olh where needed
56b0e6b - cls_rgw: keep null-versioned object as versioned object
6751908 - rgw-admin, cls_rgw: add bi_get objclass operation
061313d - common, rgw: json escaping gets input buf size
358fc98 - cls_rgw: add missing flags encoding to rgw_bucket_dir_entry::dump()
5d41b86 - cls_rgw, rgw-admin: move bi_get() entry encoding to cls
72fdef2 - cls_rgw, rgw-admin: create bi list operation
efa541f - rgw, cls_rgw: add bi put

5. Fixes, adjustments, complete missing implementation

Misc stuff: fixes and other missing implementation.

16d5e06 - osd: fix filter_prefix scoping in omap_get_vals
04eeb7c - formatter: no need for dynamic allocation
662d805 - rgw: send "null" version id if needed
d03c562 - rgw, cls_rgw: multiple changes related to obj removal
3638bdb - rgw: propagate object owner and mtime for deletion marker
16f2bd3 - rgw: adjust versioning enable/suspend api
4d3b6e3 - rgw: fix access to object through the null instance
520b0c7 - cls_rgw: inc olh epoch when updating log
9c329cc - rgw, cls_rgw: fix update of olh to reflect non existing object
b6c0c12 - cls_rgw: add missing cls_cxx_create()
c127068 - rgw: add dump_string_header()
0eacb86 - rgw: send x-amz-version-id and x-amz-delete_marker header fields
2481439 - cls_rgw: remove instance entry when removing delete marker
21dd843 - rgw: encode timestamp in pending olh info
acec1c8 - rgw, cls_rgw: improve olh atomicity
2dae922 - cls_rgw: guard certain operations using olh tag
5d423c8 - cls_rgw: implement dump() and generate test instances
dff4cae - cls_rgw: clean up compilation warnings
82766fa - rgw: remove unused code
8acd45b - rgw: remove warning
2fed1f5 - rgw: read bucket owner when following olh if pending entries
3d0b506 - cls_rgw: revise null object instance handling, versioned epoch
6f4d924 - cls_rgw: don't write list entry when converting when deleting
f319b93 - rgw: time out pending olh entries
9e0f7a1 - rgw: Object::Read::read() returns total bytes read
20c61a8 - rgw: Object::Read operations should use state->obj
6be07f4 - rgw: reduce use of Object::get_obj()
14e1ec6 - rgw: parse copy location version id
750f4d7 - cls_rgw, rgw: pending_log can hold multiple entries per epoch
a3a45cb - cls_rgw: link, unlink olh ops can get epoch
1266c59 - rgw, cls_rgw: provide optional version id, versioned epoch to olh ops
31695db - rgw: cleaup RGWRados::copy_obj()
4e790b8 - rgw: propagate version id when putting obj
bd3738a - rgw: copy obj does versioning too
12dc4e1 - rgw: move versioning handling to Object::Write::write_meta()
d2e9d4e - rgw: fix a few regressions