Policy-based object tiering in RGW

Hi,

Last week at Cephalocon, I had a brief discussion about this with Josh.
I wanted to get your inputs/comments on it.

At Flipkart, we are dealing with massive object growth per month.
Most of the objects are read a couple of times and then left in the
bucket for a long duration (several years) for compliance reasons.
Ceph can handle all that data well, but doesn't provide interfaces to
move data to a cold storage tier like Glacier. Right now we are using
offline tools to achieve this movement of data across pools/clusters.
This proposal introduces the idea of having multiple tiers within the
cluster and managing/moving data across them.

The idea is to use placement targets [1] for buckets. Ceph provides
support for creating custom data pools along with the default
.rgw.buckets pool. A custom pool is used for specifying a different
placement strategy for user data; e.g. a custom pool could be added
that uses SSD-based OSDs. The index pool and other ancillary pools
stay the same. Placement targets are defined via radosgw-admin under
'regions'.
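
To make this concrete, an added SSD placement target could look
roughly like the sketch below, expressed here as Python dicts
mirroring the region/zone JSON; the 'ssd-placement' name and the
'.rgw.buckets.ssd' pool are made up for illustration.

# Region side: declare the placement target (in addition to the default).
region_placement = {
    "placement_targets": [
        {"name": "default-placement", "tags": []},
        {"name": "ssd-placement", "tags": []},
    ],
    "default_placement": "default-placement",
}

# Zone side: map the target to its data pool; index and ancillary
# pools stay the same.
zone_placement = {
    "placement_pools": [
        {"key": "default-placement",
         "val": {"index_pool": ".rgw.buckets.index",
                 "data_pool": ".rgw.buckets",
                 "data_extra_pool": ".rgw.buckets.extra"}},
        {"key": "ssd-placement",
         "val": {"index_pool": ".rgw.buckets.index",
                 "data_pool": ".rgw.buckets.ssd",
                 "data_extra_pool": ".rgw.buckets.extra"}},
    ],
}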

RGW accounts can use the bucket location/region when creating buckets;
once a bucket is created, all of its objects are routed to the
corresponding pool without sending the location again.
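
For example, with an S3 client pointed at RGW, choosing a non-default
placement target at bucket creation could look like the sketch below.
The endpoint, credentials and the 'ssd-placement' target are invented,
and it assumes the '<region>:<placement-target>' form of
LocationConstraint for selecting a placement target.

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:8080',   # invented endpoint
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

# Assumption: placement target passed via LocationConstraint.
s3.create_bucket(
    Bucket='latency-sensitive-bucket',
    CreateBucketConfiguration={'LocationConstraint': 'default:ssd-placement'},
)

# From here on, every object written to the bucket lands in the
# SSD-backed data pool without the client repeating the location.
s3.put_object(Bucket='latency-sensitive-bucket', Key='obj1', Body=b'data')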

We can introduce 3 classes/pools in the cluster.

One, completely on SSDs, used by latency-sensitive customers. This
pool may not be big in size but can accommodate many small objects.
Both reads and writes are served at millisecond latency.

Second, medium-sized objects in larger numbers; this can have an
SSD-based cache tier backed by an HDD-based tier. Writes are served
faster, while reads can be more latent because of the tiering.

Third, big objects with no limit on their number, going directly to an
HDD-based pool. These are not latency sensitive. This pool can be
replication based or EC based.

We can assign policies based on latency and capacity to these pools.
Along with these three categories, we can enable tiering among the
pools, or we can have additional pools supporting archival. While
creating users/accounts we can place the user in a certain group and
assign the corresponding pool for object placement. Additionally, we
can enable multisite/cloud sync for users who want to move their data
to a different cluster based on policies.
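
As a strawman, the mapping between the three classes, their data pools
and the user groups that default to them could be something like the
following (all names are invented; the user's default placement would
be set by the admin when the account is created):

# Strawman tier definitions; pool and target names are invented.
tiers = {
    "ssd-placement":    {"data_pool": ".rgw.buckets.ssd",
                         "media": "ssd",
                         "use_case": "small objects, ms latency"},
    "cached-placement": {"data_pool": ".rgw.buckets.cached",
                         "media": "ssd cache tier over hdd",
                         "use_case": "medium objects"},
    "hdd-placement":    {"data_pool": ".rgw.buckets.hdd",
                         "media": "hdd, replicated or EC",
                         "use_case": "big objects, archival candidates"},
}

# Which placement target a user group gets by default.
group_default_placement = {
    "latency-sensitive": "ssd-placement",
    "general":           "cached-placement",
    "bulk":              "hdd-placement",
}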

Using bucket/object policies, we can identify the buckets/objects that
can be vaulted and move them to archival tiers. This simulates
Glacier-like functionality in AWS, but within the cluster. As an
example, a user can set a policy on a bucket to be vaulted after a
couple of months to a tier in the same cluster or to a different
cluster.

We already have agents flushing data between the cache and base tiers.
The Objecter knows which pool is a tier of which, so we would have to
extend the Objecter to support multiple levels of tiering for
reading/writing objects; but this has to be tied up with bucket
policies at the RGW level. Using the cloud sync feature or multisite
(a tweaked version that supports bucket policies) we can vault
specific objects to a different cluster. I haven't completely thought
through the design: whether we want to overload the Objecter, or
whether we might have to design a new global Objecter that is aware of
the multisite tier.
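
Until that design exists, the behaviour can be approximated from the
outside by a per-bucket vaulting pass, roughly like the sketch below
(endpoints, credentials and bucket names are invented):

import boto3
from datetime import datetime, timedelta, timezone

# Copy objects older than the policy cutoff from the primary cluster
# to an archival cluster/bucket. Everything here is illustrative.
src = boto3.client('s3', endpoint_url='http://rgw-primary.example.com:8080',
                   aws_access_key_id='KEY', aws_secret_access_key='SECRET')
dst = boto3.client('s3', endpoint_url='http://rgw-archive.example.com:8080',
                   aws_access_key_id='KEY', aws_secret_access_key='SECRET')

cutoff = datetime.now(timezone.utc) - timedelta(days=60)

paginator = src.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='compliance-bucket'):
    for obj in page.get('Contents', []):
        if obj['LastModified'] < cutoff:
            body = src.get_object(Bucket='compliance-bucket',
                                  Key=obj['Key'])['Body'].read()
            dst.put_object(Bucket='compliance-archive',
                           Key=obj['Key'], Body=body)
            # A real agent would verify the copy and then delete or
            # stub the source object; omitted here.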

This would enable us to grow across regions and support
temperature-based object tiering, covering part of object lifecycle
management.

Please let me know your thoughts on this.

[1] http://cephnotes.ksperis.com/blog/2014/11/28/placement-pools-on-rados-gw

Thanks,
Varada