On Mon, Apr 2, 2018 at 11:28 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Fri, Mar 30, 2018 at 9:47 PM, Varada Kari (System Engineer)
> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>> Hi,
>>
>> Last week at Cephalocon I had a brief discussion about this with Josh.
>> Wanted to get your inputs/comments on the same.
>>
>> At Flipkart, we are dealing with massive object growth per month.
>> Most objects are read a couple of times and then left in the bucket
>> for a long duration (several years) for compliance reasons. Ceph can
>> handle all that data well, but doesn't provide interfaces to move
>> data to a cold storage tier like Glacier. Right now we are using
>> offline tools to achieve this movement of data across
>> pools/clusters. This proposal is to introduce multiple tiers within
>> the cluster and manage/move data between them.
>>
>> The idea is to use placement targets[1] for buckets. Ceph provides
>> support for creating custom data pools alongside the default
>> .rgw.bucket pool. A custom pool is used to specify a different
>> placement strategy for user data; e.g. a custom pool could be added
>> that uses SSD-based OSDs. The index pool and other ancillary pools
>> stay the same. Placement targets are defined by radosgw-admin under
>> 'regions'.
>>
>> RGW accounts can use the bucket location/region when creating
>> buckets; once a bucket is created, all its objects are routed there
>> without sending the location again.
>>
>> We can introduce 3 classes/pools in the cluster.
>>
>> One, completely on SSDs, used by latency-sensitive customers. This
>> pool may not be big in size but can accommodate many small objects.
>> Both reads and writes are served at millisecond latency.
>>
>> Two, medium-sized objects and a larger object count. This can have
>> an SSD-based cache tier backed by an HDD-based tier. Writes are
>> served faster and reads can be more latent because of the tiering.
>>
>> Three, big objects with no limit on count, going directly to an
>> HDD-based pool. These are not latency sensitive. This pool can be
>> replication based or EC based.
>>
>> We can assign policies based on latency and capacity to these pools.
>> Along with these three categories, we can enable tiering among these
>> pools, or we can have additional pools supporting archival. While
>> creating users/accounts we can place the user in a certain group and
>> assign the corresponding pool for object placement. Additionally, we
>> can enable multisite/cloud sync for users who want to move their
>> data to a different cluster based on policies.
>>
>> Using bucket/object policies, we can identify the buckets/objects
>> that can be vaulted and move them to archival tiers based on those
>> policies. This simulates the Glacier kind of functionality in AWS,
>> but within the cluster. As an example, a user can set a policy on a
>> bucket to be vaulted after a couple of months to a tier in the same
>> cluster or to a different cluster.
>>
>> We already have the agents flushing data between the cache and base
>> tiers. The Objecter knows which pool is a tier of which, so we would
>> have to extend the Objecter to support multiple levels of tiering
>> for reading/writing objects. But this has to be tied to bucket
>> policies at the RGW level. Using the cloud sync feature or multisite
>> (a tweaked version that supports bucket policies) we can vault
>> specific objects to a different cluster. I haven't completely
>> thought through the design of how we want to overload the Objecter,
>> or whether we would have to design a new global Objecter that is
>> aware of the multisite tier.
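For concreteness, selecting such a placement target from the S3 client
side would look roughly like this; the endpoint, credentials and the
"ssd-placement" target id below are placeholders, assuming the target
has already been added to the zonegroup/zone with radosgw-admin:

    # Rough sketch (boto3); endpoint, credentials and the
    # "ssd-placement" target id are placeholders.
    import boto3

    s3 = boto3.client(
        's3',
        endpoint_url='http://rgw.example.com:8080',
        aws_access_key_id='ACCESS_KEY',
        aws_secret_access_key='SECRET_KEY',
    )

    # RGW treats a LocationConstraint of "<zonegroup>:<placement-id>"
    # (or ":<placement-id>" for the current zonegroup) as a placement
    # target, so this bucket's data lands in the SSD data pool while
    # the index and other ancillary pools stay the same.
    s3.create_bucket(
        Bucket='latency-sensitive-bucket',
        CreateBucketConfiguration={'LocationConstraint': ':ssd-placement'},
    )
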
> Hmm, it sounds like you're interested in extending the RADOS
> cache-tier functionality for this. That is definitely a mistake; we
> have been backing off support for that over the past several
> releases. Sage has a plan for some "tiering v2" infrastructure (that
> integrates with SK Telecom's dedupe work) which might fit with this
> but I don't think it has any kind of timeline for completion.
>

Yes, for internal data movement across pools. I am not too particular
about using the current implementation; if tiering v2 solves this
better, I would be interested in using it. The current problem is
translating object/bucket life cycle policies into RADOS operations
that move the data around. I am not sure if this needs a different
policy engine at the RGW layer to transcode these policies into
tiering ops that move the data to a different pool. We also have to
record that an object has been moved to a different pool, and either
bring it back or do a proxy read. I am thinking mostly about object
life cycle management from the RGW side.

> Especially since you're discussing moving data across clusters, and
> RGW is already maintaining a number of indexes and things (eg, head
> objects), I think it's probably best to have RGW maintain metadata
> about the "real" location of uploaded objects.
> -Greg
>

As one more policy on the object, we can have archival of the object
to a different cluster. Here I don't want to overload RADOS, but use
RGW cloud sync or multisite to sync this data to a different cluster.
Once we start integrating bucket/object policies with life cycle
management and tiering, it will be interesting to explore how long an
object should stay in the same pool versus a different pool or a
different cluster.

Varada

>>
>> This enables us to grow across regions and supports
>> temperature-based object tiering as part of object life cycle
>> management.
>>
>> Please let me know your thoughts on this.
>>
>> [1] http://cephnotes.ksperis.com/blog/2014/11/28/placement-pools-on-rados-gw
>>
>> Thanks,
>> Varada
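For reference, the kind of bucket-level vaulting policy discussed
above, expressed with the standard S3 lifecycle API via boto3. RGW's
lifecycle support today covers expiration rules; how a Transition rule
like this would be mapped onto a pool move or a multisite/cloud-sync
target is exactly the open design question. The bucket name, day count
and endpoint are placeholders, and credentials are assumed to come
from the environment:

    # Rough sketch (boto3) of an AWS-style lifecycle transition rule.
    import boto3

    s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8080')

    s3.put_bucket_lifecycle_configuration(
        Bucket='compliance-archive',
        LifecycleConfiguration={
            'Rules': [{
                'ID': 'vault-after-60-days',
                'Filter': {'Prefix': ''},
                'Status': 'Enabled',
                # In AWS this moves objects to a colder storage class;
                # in this proposal it would mark them for movement to
                # an HDD/EC pool or to a remote cluster.
                'Transitions': [{'Days': 60,
                                 'StorageClass': 'GLACIER'}],
            }],
        },
    )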