Project Zipper - Proposed RGW Layering API

The goal of Project Zipper is to create an API that can be used to separate
the RGW top end (represented by Swift and S3) from bottom end providers
(represented currently by RADOS, and eventually by Cloud, File, and Memory
backends).  This should allow the front ends to access any implemented back
end transparently.

The secondary objective is to allow multiple layers of these API providers
to stack up, allowing transformations of operations (such as redirection to
different backends), or caching layers that cut across multiple backends.

The proposal here is to add handles to RGW, each with an associated
interface.  Each layer will consume the handles of the one below it, and
provide handles to the one above it, transforming as necessary, and
encapsulating any data/code needed at this level in its handles.  Note that
the API proposed here is preliminary, and may need to grow as actual code
is written.
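
As a rough illustration of the stacking idea, a layer could hold the handle
of the layer below it and expose the same interface upward.  This is only a
sketch under assumptions: the names Handle, BaseHandle, TaggingHandle, and
make_stack are hypothetical and are not part of the proposal.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal interface shared by every layer's handle.
struct Handle {
  virtual ~Handle() = default;
  virtual std::string describe() const = 0;
};

// Bottom layer: stands in for a real provider such as RADOS.
struct BaseHandle : Handle {
  std::string describe() const override { return "base"; }
};

// A stacked layer: consumes the handle of the layer below and provides its
// own handle upward, keeping layer-local data inside the handle.
class TaggingHandle : public Handle {
 public:
  TaggingHandle(std::string tag, std::unique_ptr<Handle> lower)
      : tag_(std::move(tag)), lower_(std::move(lower)) {
    assert(lower_ != nullptr);  // a stacked layer always needs a lower layer
  }
  std::string describe() const override {
    return tag_ + "(" + lower_->describe() + ")";
  }

 private:
  std::string tag_;                // layer-local data
  std::unique_ptr<Handle> lower_;  // handle of the layer below
};

// Builds a three-level stack: redirect over cache over the base provider.
inline std::unique_ptr<Handle> make_stack() {
  std::unique_ptr<Handle> base = std::make_unique<BaseHandle>();
  std::unique_ptr<Handle> cache =
      std::make_unique<TaggingHandle>("cache", std::move(base));
  return std::make_unique<TaggingHandle>("redirect", std::move(cache));
}
```

Calling describe() on the top handle walks down the stack, which is the
same path a real operation would take through the layers.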

My analysis of the current state of the code indicates that we need four
primary handle types: a Store, a User, a Bucket, and an Object.

A Store is a handle that represents a layer as a whole.  There will be a
singleton handle for each Store, instantiated at startup (or at layer
creation time if dynamic layers are added).  A Store is primarily
responsible for creating other handles, and for providing access to
layer-local data and code to those handles.  The Store of the RADOS layer,
for example, will be based on the existing RGWRados object.  The primary
API components of a Store are:

    ListBuckets
        This method will list buckets in a store, with associated
        filtering.
    CreateBucket
        This method will create a new bucket in a store, returning
        its handle.
    GetBucket
        This method will get a handle to an existing bucket.
    GetUser
        This method will get a handle to an existing user.
    Release
        This method will release the handle when done.
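
To make the shape of the Store methods listed above more concrete, here is
a minimal sketch.  All signatures are assumptions made for illustration:
the prefix filter, the use of shared_ptr to model Release, the nullptr
return for a missing or duplicate bucket, and the MemoryStore class itself
are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, minimal handle types; a real layer would carry more state.
struct User { std::string id; };
struct Bucket { std::string name; std::string owner; };

// Sketch of the Store interface.  Release is modelled by dropping the last
// shared_ptr to a handle, which is an assumption of this sketch.
class Store {
 public:
  virtual ~Store() = default;
  virtual std::vector<std::string> list_buckets(
      const std::string& prefix) const = 0;
  virtual std::shared_ptr<Bucket> create_bucket(
      const User& owner, const std::string& name) = 0;
  virtual std::shared_ptr<Bucket> get_bucket(
      const std::string& name) const = 0;
  virtual std::shared_ptr<User> get_user(const std::string& id) const = 0;
};

// Toy in-memory provider used only to exercise the interface.
class MemoryStore : public Store {
 public:
  void add_user(const std::string& id) {
    users_[id] = std::make_shared<User>(User{id});
  }
  std::vector<std::string> list_buckets(
      const std::string& prefix) const override {
    std::vector<std::string> names;
    for (const auto& entry : buckets_) {
      // Keep only names that start with the requested prefix.
      if (entry.first.compare(0, prefix.size(), prefix) == 0)
        names.push_back(entry.first);
    }
    return names;
  }
  std::shared_ptr<Bucket> create_bucket(
      const User& owner, const std::string& name) override {
    assert(!name.empty());
    if (buckets_.count(name) > 0) return nullptr;  // already exists
    auto bucket = std::make_shared<Bucket>(Bucket{name, owner.id});
    buckets_[name] = bucket;
    return bucket;
  }
  std::shared_ptr<Bucket> get_bucket(
      const std::string& name) const override {
    auto it = buckets_.find(name);
    if (it == buckets_.end()) return nullptr;
    return it->second;
  }
  std::shared_ptr<User> get_user(const std::string& id) const override {
    auto it = users_.find(id);
    if (it == users_.end()) return nullptr;
    return it->second;
  }

 private:
  std::map<std::string, std::shared_ptr<Bucket>> buckets_;
  std::map<std::string, std::shared_ptr<User>> users_;
};
```

An explicit Release method could replace the shared_ptr if layers need to
do work at release time; the smart pointer is just the shortest way to show
handle lifetime here.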

A User is a handle that represents an authentication rendezvous point for
things like permissions and quota checking.  It is primarily passed into
other methods, and has few API methods of its own; most layers will use
local methods on the User handle that are specific to that layer.  It is
also intended to allow a layer to do user mapping; so, for example, an
Azure cloud layer would map the S3 user provided to RGW by the client into
an Azure user for the purposes of performing operations.  The only required
API functions currently are:

    ListBuckets
        This method lists the buckets owned by the user.
    Release
        This method will release the handle when done.
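
The user mapping described above might look something like the following
sketch.  The UserMappingLayer class, its add_mapping and map_user methods,
and the static mapping table are all hypothetical; a real cloud layer would
presumably look credentials up in its own configuration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical User handle: an identity plus the buckets it owns.
struct User {
  std::string id;
  std::vector<std::string> buckets;
  const std::vector<std::string>& list_buckets() const { return buckets; }
};

// Sketch of a layer that maps the client-facing (S3) user to the user
// that the backend understands.
class UserMappingLayer {
 public:
  void add_mapping(const std::string& s3_id, User backend_user) {
    assert(!s3_id.empty());
    mapping_[s3_id] = std::move(backend_user);
  }
  // Returns the backend user for an S3 user, or nullptr if none is mapped.
  const User* map_user(const User& s3_user) const {
    auto it = mapping_.find(s3_user.id);
    return it == mapping_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, User> mapping_;  // layer-local data
};
```

The layers above only ever see the S3 user; the mapped user stays private
to the layer that needs it.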

A Bucket is a handle that represents a container of objects.  It has
several attributes, such as a Name and an Owner.  Since all objects are
contained in a bucket, the Bucket handle will be used by almost every
operation.  The primary API components of a Bucket are:

    ListObjects
        This method lists the objects in the bucket.
    CreateObject
        This method creates an object in the bucket and returns
        its handle.
    GetObject
        This method gets a handle for an existing object.
    Get/SetAttrs
        These methods read and write attributes of the bucket.
    Get/SetACL
        These methods read and write ACLs on the bucket.  Note that
        these may not be necessary, as this work can be done with
        Get/SetAttrs, but ACLs are so commonly used that ACL-specific
        convenience functions are likely to be useful.
    Delete
        This method deletes the bucket.
    Release
        This method will release the handle when done.
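
A small sketch of the Bucket methods listed above, showing in particular
how Get/SetACL could be thin convenience wrappers over Get/SetAttrs.  The
method names, the string-valued attributes, and the "acl" attribute key are
assumptions made for this example only; Delete and Release are left out to
keep it short.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical attribute key under which the ACL is stored.
constexpr const char* kAclAttr = "acl";

class Bucket {
 public:
  using Attrs = std::map<std::string, std::string>;

  Bucket(std::string name, std::string owner)
      : name_(std::move(name)), owner_(std::move(owner)) {
    assert(!name_.empty());
  }
  const std::string& name() const { return name_; }
  const std::string& owner() const { return owner_; }

  // ListObjects: returns the object keys in sorted order.
  std::vector<std::string> list_objects() const {
    return std::vector<std::string>(keys_.begin(), keys_.end());
  }
  // CreateObject: returns false if the key already exists.
  bool create_object(const std::string& key) {
    return keys_.insert(key).second;
  }
  // Get/SetAttrs: set_attrs merges into the existing attributes.
  const Attrs& get_attrs() const { return attrs_; }
  void set_attrs(const Attrs& attrs) {
    for (const auto& kv : attrs) attrs_[kv.first] = kv.second;
  }
  // Get/SetACL: convenience wrappers over the attribute calls.
  std::string get_acl() const {
    auto it = attrs_.find(kAclAttr);
    return it == attrs_.end() ? std::string() : it->second;
  }
  void set_acl(const std::string& acl) {
    Attrs update;
    update[kAclAttr] = acl;
    set_attrs(update);
  }

 private:
  std::string name_;
  std::string owner_;
  std::set<std::string> keys_;
  Attrs attrs_;
};
```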

An Object handle represents a blob of data with its associated metadata.
This is the primary handle used to do actual work.  A handle may or may not
represent a materialized object in the physical data store, depending on
the requirements of the Store itself.  For example, a Store may not
materialize an object until it has been written.  API components are:

    Read
        This method reads the object.
    Write
        This method writes to the object.
    Get/SetAttrs
        These methods read and write attributes of the object.
    Get/SetACL
        These methods read and write ACLs on the object.  Like the
        Bucket methods, these may not end up being needed.
    Delete
        This method deletes the object.
    Release
        This method will release the handle when done.
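
The point about materialization can be sketched as follows: the handle
exists as soon as it is created, but nothing is stored until the first
write.  The Object class here, its offset-based read and write signatures,
and the in-memory string standing in for the data store are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class Object {
 public:
  explicit Object(std::string key) : key_(std::move(key)) {
    assert(!key_.empty());
  }
  // True once the object exists in the (simulated) data store.
  bool materialized() const { return materialized_; }

  // Write: materializes the object on first use, growing it as needed.
  void write(std::size_t offset, const std::string& data) {
    if (data_.size() < offset + data.size())
      data_.resize(offset + data.size(), '\0');
    data_.replace(offset, data.size(), data);
    materialized_ = true;
  }
  // Read: returns up to len bytes starting at offset; empty if the offset
  // is past the end or the object was never written.
  std::string read(std::size_t offset, std::size_t len) const {
    if (offset >= data_.size()) return std::string();
    return data_.substr(offset, len);
  }

 private:
  std::string key_;
  std::string data_;           // stands in for the physical data store
  bool materialized_ = false;  // nothing stored until the first write
};
```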

Each API method is, by design, atomic, and is the final arbiter of whether
or not it can succeed.  So, for example, a layer may do permission checking
based on ACLs, and decide that a write is allowed, and then call the Write
method of a lower layer.  That Write could still fail due to permissions
issues, if the ACL of the object was changed in the interim. The lower
layer should guarantee, however, that the permission check done in its
Write method is atomic with respect to the actual write.
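
One way a lower layer could provide that guarantee is to do the check and
the write under the same lock, as in the sketch below.  This is an
assumption about a possible implementation, not the proposed design; the
LowerObject class, the writer set used as an ACL, and upper_write are all
hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>
#include <utility>

class LowerObject {
 public:
  void set_acl(std::set<std::string> writers) {
    std::lock_guard<std::mutex> guard(mu_);
    writers_ = std::move(writers);
  }
  // Advisory check: the answer may be stale by the time the caller acts.
  bool may_write(const std::string& user) const {
    std::lock_guard<std::mutex> guard(mu_);
    return writers_.count(user) > 0;
  }
  // Final arbiter: the permission check and the write share one critical
  // section, so an ACL change cannot slip in between them.
  bool write(const std::string& user, const std::string& data) {
    assert(!user.empty());
    std::lock_guard<std::mutex> guard(mu_);
    if (writers_.count(user) == 0) return false;
    data_ = data;
    return true;
  }
  std::string read() const {
    std::lock_guard<std::mutex> guard(mu_);
    return data_;
  }

 private:
  mutable std::mutex mu_;
  std::set<std::string> writers_;
  std::string data_;
};

// Upper layer: checks early, then delegates.  The lower write may still
// fail if the ACL changed after the early check.
inline bool upper_write(LowerObject& obj, const std::string& user,
                        const std::string& data) {
  if (!obj.may_write(user)) return false;
  return obj.write(user, data);
}
```

The early check in the upper layer only saves work; correctness rests on
the check inside the lower layer's write.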

The goal of Project Zipper is to unzip across the code at the level of
RGWOp and its children.  Code that is generic and can use the API will end
up above the zipper, and code that is specific to RADOS will end up below
the zipper, forming the first bottom layer, the RADOS provider. Once this
is done, a second provider (probably memory, but maybe DB) will be written
to validate that the API is generic enough.


Daniel



