On Fri, Mar 30, 2018 at 9:47 PM, Varada Kari (System Engineer) <varadaraja.kari@xxxxxxxxxxxx> wrote:
> Hi,
>
> Last week at Cephalocon I had a brief discussion about this with
> Josh. Wanted to get your inputs/comments on the same.
>
> At Flipkart, we are dealing with massive object growth per month.
> Most objects are read a couple of times and then left in the bucket
> for a long duration (several years) for compliance reasons. Ceph can
> handle all that data well, but doesn't provide interfaces to move
> data to cold storage like Glacier. Right now we are using offline
> tools to move data across pools/clusters. This proposal introduces
> the idea of having multiple tiers within the cluster and
> managing/moving data among them.
>
> The idea is to use placement targets[1] for buckets. Ceph provides
> support for creating custom data pools along with the default
> .rgw.buckets pool. A custom pool specifies a different placement
> strategy for user data; e.g., a custom pool could be added that uses
> SSD-based OSDs. The index pool and other ancillary pools stay the
> same. Placement targets are defined by radosgw-admin under 'regions'.
> (A config sketch follows at the end of this message.)
>
> RGW accounts can use the bucket location/region when creating
> buckets; once a bucket is created, all of its objects are routed
> accordingly without sending the location again. (A client-side
> sketch also follows below.)
>
> We can introduce 3 classes/pools in the cluster:
>
> 1. Completely on SSDs, used by latency-sensitive customers. This
> pool may not be big in size but can accommodate many small objects.
> Both reads and writes are served at millisecond latency.
>
> 2. Medium-sized objects, and more of them. This can have an
> SSD-based cache tier backed by an HDD-based tier. Writes are served
> faster; reads can be latent because of the tiering.
>
> 3. Big objects with no limit on count, going directly to an
> HDD-based pool. These are not latency sensitive. This pool can be
> replication-based or EC-based.
>
> We can assign latency and capacity policies to these pools. Along
> with these three categories, we can enable tiering among the pools,
> or add further pools supporting archival. While creating
> users/accounts we can place each user in a certain group and assign
> the corresponding pool for object placement. Additionally, we can
> enable multisite/cloud sync for users who want to move their data to
> a different cluster based on policies.
>
> Using bucket/object policies, we can identify which buckets/objects
> can be vaulted and move them to archival tiers accordingly. This
> simulates the Glacier kind of functionality in AWS, but within the
> cluster. As an example, a user could set a policy on a bucket so
> that it is vaulted after a couple of months to a tier in the same
> cluster or to a different cluster. (A lifecycle sketch follows at
> the end as well.)
>
> We already have agents flushing data between the cache and base
> tiers, and the Objecter knows which pool is a tier of what; we would
> have to extend the Objecter to support multiple levels of tiering
> for reading/writing objects, and tie that in with bucket policies at
> the RGW level. Using the cloud sync feature, or multisite (a tweaked
> version that supports bucket policies), we can vault specific
> objects to a different cluster. I haven't completely thought through
> the design: do we overload the Objecter, or do we have to design a
> new global Objecter that is aware of the multisite tier?
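>
> To make the placement-target idea concrete, here is a rough sketch
> following [1] (the "ssd-placement" name and the pool names are made
> up for illustration, and this is untested):
>
>   $ radosgw-admin region get > region.json
>   # add an entry to "placement_targets" in region.json:
>   #   { "name": "ssd-placement", "tags": [] }
>   $ radosgw-admin region set < region.json
>
>   $ radosgw-admin zone get > zone.json
>   # add an entry to "placement_pools" in zone.json:
>   #   { "key": "ssd-placement",
>   #     "val": { "index_pool": ".rgw.buckets.index",
>   #              "data_pool": ".rgw.buckets.ssd",
>   #              "data_extra_pool": ".rgw.buckets.extra" } }
>   $ radosgw-admin zone set < zone.json
>
>   # then update the region map and restart radosgw
>   $ radosgw-admin regionmap update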
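>
> On the client side, if I remember right, the placement target can be
> selected per bucket via the S3 LocationConstraint on bucket creation
> (or per user via default_placement in the user metadata). A sketch
> with boto; the endpoint and credentials are placeholders, and the
> "region:placement" form of the constraint is my assumption:
>
>   import boto
>   import boto.s3.connection
>
>   conn = boto.connect_s3(
>       aws_access_key_id='ACCESS_KEY',
>       aws_secret_access_key='SECRET_KEY',
>       host='rgw.example.com',
>       calling_format=boto.s3.connection.OrdinaryCallingFormat())
>
>   # The LocationConstraint picks the placement target at creation
>   # time; every object written to this bucket then lands in the
>   # data pool mapped to that target.
>   bucket = conn.create_bucket('latency-sensitive-data',
>                               location='default:ssd-placement')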
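>
> And for the vaulting policy itself, something AWS-lifecycle-shaped
> could express "move to an archival tier after two months". This is
> purely hypothetical on the RGW side (RGW's lifecycle support today
> only covers expiration, and the 'ARCHIVE' storage class below does
> not exist); sketched with boto3:
>
>   import boto3
>
>   s3 = boto3.client('s3',
>                     endpoint_url='http://rgw.example.com',
>                     aws_access_key_id='ACCESS_KEY',
>                     aws_secret_access_key='SECRET_KEY')
>
>   # Hypothetical rule: after 60 days, transition objects to an
>   # archival tier (another pool, or another cluster via cloud sync).
>   s3.put_bucket_lifecycle_configuration(
>       Bucket='compliance-data',
>       LifecycleConfiguration={'Rules': [{
>           'ID': 'vault-after-two-months',
>           'Filter': {'Prefix': ''},
>           'Status': 'Enabled',
>           'Transitions': [{'Days': 60, 'StorageClass': 'ARCHIVE'}],
>       }]})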
Hmm, it sounds like you're interested in extending the RADOS cache-tier functionality for this. That is definitely a mistake; we have been backing off support for that over the past several releases. Sage has a plan for some "tiering v2" infrastructure (which integrates with SK Telecom's dedupe work) that might fit with this, but I don't think it has any kind of timeline for completion.

Especially since you're discussing moving data across clusters, and RGW is already maintaining a number of indexes and things (e.g., head objects), I think it's probably best to have RGW maintain metadata about the "real" location of uploaded objects.
-Greg

> This enables us to grow across regions and supports temperature-based
> object tiering as part of the object lifecycle management.
>
> Please let me know your thoughts on this.
>
> [1] http://cephnotes.ksperis.com/blog/2014/11/28/placement-pools-on-rados-gw
>
> Thanks,
> Varada