On Fri, Mar 30, 2018 at 9:47 PM, Varada Kari (System Engineer) <varadaraja.kari@xxxxxxxxxxxx> wrote:
> Hi,
>
> Last week at Cephalocon I had a brief discussion about this with
> Josh. Wanted to get your inputs/comments on the same.
>
> At Flipkart, we are dealing with massive object growth per month.
> Most objects are read a couple of times and then left in the bucket
> for a long duration (several years) for compliance reasons. Ceph can
> handle all that data well, but doesn't provide interfaces to move
> data to cold storage like Glacier. Right now we are using offline
> tools to move data across pools/clusters. This proposal introduces
> the idea of having multiple tiers within the cluster and
> managing/moving data among them.
>
> The idea is to use placement targets[1] for buckets. Ceph provides
> support for creating custom data pools along with the default
> .rgw.buckets pool. A custom pool specifies a different placement
> strategy for user data; e.g., a custom pool could be added that uses
> SSD-based OSDs. The index pool and other ancillary pools stay the
> same. Placement targets are defined by radosgw-admin under 'regions'.
> (A config sketch follows at the end of this message.)
>
> RGW accounts can use the bucket location/region when creating
> buckets; once a bucket is created, all of its objects are routed
> accordingly without sending the location again. (A client-side
> sketch also follows below.)
>
> We can introduce 3 classes/pools in the cluster:
>
> 1. Completely on SSDs, used by latency-sensitive customers. This
> pool may not be big in size but can accommodate many small objects.
> Both reads and writes are served at millisecond latency.
>
> 2. Medium-sized objects, and more of them. This can have an
> SSD-based cache tier backed by an HDD-based tier. Writes are served
> faster; reads can be latent because of the tiering.
>
> 3. Big objects with no limit on count, going directly to an
> HDD-based pool. These are not latency sensitive. This pool can be
> replication-based or EC-based.
>
> We can assign latency and capacity policies to these pools. Along
> with these three categories, we can enable tiering among the pools,
> or add further pools supporting archival. While creating
> users/accounts we can place each user in a certain group and assign
> the corresponding pool for object placement. Additionally, we can
> enable multisite/cloud sync for users who want to move their data to
> a different cluster based on policies.
>
> Using bucket/object policies, we can identify which buckets/objects
> can be vaulted and move them to archival tiers accordingly. This
> simulates the Glacier kind of functionality in AWS, but within the
> cluster. As an example, a user could set a policy on a bucket so
> that it is vaulted after a couple of months to a tier in the same
> cluster or to a different cluster. (A lifecycle sketch follows at
> the end as well.)
>
> We already have agents flushing data between the cache and base
> tiers, and the Objecter knows which pool is a tier of what; we would
> have to extend the Objecter to support multiple levels of tiering
> for reading/writing objects, and tie that in with bucket policies at
> the RGW level. Using the cloud sync feature, or multisite (a tweaked
> version that supports bucket policies), we can vault specific
> objects to a different cluster. I haven't completely thought through
> the design: do we overload the Objecter, or do we have to design a
> new global Objecter that is aware of the multisite tier?
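>
> To make the placement-target idea concrete, here is a rough sketch
> following [1] (the "ssd-placement" name and the pool names are made
> up for illustration, and this is untested):
>
>   $ radosgw-admin region get > region.json
>   # add an entry to "placement_targets" in region.json:
>   #   { "name": "ssd-placement", "tags": [] }
>   $ radosgw-admin region set < region.json
>
>   $ radosgw-admin zone get > zone.json
>   # add an entry to "placement_pools" in zone.json:
>   #   { "key": "ssd-placement",
>   #     "val": { "index_pool": ".rgw.buckets.index",
>   #              "data_pool": ".rgw.buckets.ssd",
>   #              "data_extra_pool": ".rgw.buckets.extra" } }
>   $ radosgw-admin zone set < zone.json
>
>   # then update the region map and restart radosgw
>   $ radosgw-admin regionmap update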
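>
> On the client side, if I remember right, the placement target can be
> selected per bucket via the S3 LocationConstraint on bucket creation
> (or per user via default_placement in the user metadata). A sketch
> with boto; the endpoint and credentials are placeholders, and the
> "region:placement" form of the constraint is my assumption:
>
>   import boto
>   import boto.s3.connection
>
>   conn = boto.connect_s3(
>       aws_access_key_id='ACCESS_KEY',
>       aws_secret_access_key='SECRET_KEY',
>       host='rgw.example.com',
>       calling_format=boto.s3.connection.OrdinaryCallingFormat())
>
>   # The LocationConstraint picks the placement target at creation
>   # time; every object written to this bucket then lands in the
>   # data pool mapped to that target.
>   bucket = conn.create_bucket('latency-sensitive-data',
>                               location='default:ssd-placement')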
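>
> And for the vaulting policy itself, something AWS-lifecycle-shaped
> could express "move to an archival tier after two months". This is
> purely hypothetical on the RGW side (RGW's lifecycle support today
> only covers expiration, and the 'ARCHIVE' storage class below does
> not exist); sketched with boto3:
>
>   import boto3
>
>   s3 = boto3.client('s3',
>                     endpoint_url='http://rgw.example.com',
>                     aws_access_key_id='ACCESS_KEY',
>                     aws_secret_access_key='SECRET_KEY')
>
>   # Hypothetical rule: after 60 days, transition objects to an
>   # archival tier (another pool, or another cluster via cloud sync).
>   s3.put_bucket_lifecycle_configuration(
>       Bucket='compliance-data',
>       LifecycleConfiguration={'Rules': [{
>           'ID': 'vault-after-two-months',
>           'Filter': {'Prefix': ''},
>           'Status': 'Enabled',
>           'Transitions': [{'Days': 60, 'StorageClass': 'ARCHIVE'}],
>       }]})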
Hmm, it sounds like you're interested in extending the RADOS cache-tier functionality for this. That is definitely a mistake; we have been backing off support for that over the past several releases. Sage has a plan for some "tiering v2" infrastructure (which integrates with SK Telecom's dedupe work) that might fit with this, but I don't think it has any kind of timeline for completion.

Especially since you're discussing moving data across clusters, and RGW is already maintaining a number of indexes and things (e.g., head objects), I think it's probably best to have RGW maintain metadata about the "real" location of uploaded objects.
-Greg

> This enables us to grow across regions and supports temperature-based
> object tiering as part of the object lifecycle management.
>
> Please let me know your thoughts on this.
>
> [1] http://cephnotes.ksperis.com/blog/2014/11/28/placement-pools-on-rados-gw
>
> Thanks,
> Varada