Re: Policy based object tiering in RGW

Casey Bodley <cbodley@xxxxxxxxxx> writes:

> On 04/04/2018 09:19 AM, Varada Kari (System Engineer) wrote:
>> On Wed, Apr 4, 2018 at 5:31 PM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>>> Hi Folks,
>>>
>>> inline
>>>
>>> On Tue, Apr 3, 2018 at 3:39 AM, Varada Kari (System Engineer)
>>> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>>> The granularity of a policy is the bucket (a group of objects). S3 supports user
>>>> policies and bucket policies; Ceph supports a subset of bucket policies
>>>> in Luminous. If we can have additional headers per object, maybe we
>>>> can handle per-object policies as well, but I am not sure how much work
>>>> that is.
>>> User policy is being worked on.  Expect PR(s) soon.  I don't think AWS
>>> defines this sort of object policy, but if you have a well-formed
>>> proposal, we seem to have a good foundation for implementing new
>>> grammar and actions.
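
For reference, bucket policies already work against RGW through the standard S3 API; below is a minimal boto3 sketch (the endpoint, credentials, bucket, and principal are placeholders, not anything from this thread):

import json
import boto3

# Hypothetical RGW endpoint and credentials; substitute real values.
s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:8000',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

# A simple bucket policy granting another user read access -- one of
# the policy actions the Luminous subset already understands.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/reader"]},
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::videos/*"],
    }],
}

s3.put_bucket_policy(Bucket='videos', Policy=json.dumps(policy))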
>>>
>> Yeah, AWS doesn't have object policies. My thinking is mostly that if we
>> have a big object (say a video), we could associate a policy with the object
>> so that it is deleted or moved out once the expiration policy kicks in. I am
>> also mostly thinking about a non-prefix-based implementation, working around
>> the prefix limitation by adding appropriate headers; we might have to do
>> additional work to read them and make a decision. But I don't have a
>> well-formed document for it right now; we are still working on it.
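
For illustration, one way to attach per-object intent without inventing new headers would be S3 object tagging, which a lifecycle filter can then match; a minimal boto3 sketch (endpoint, bucket, key, and tag names are hypothetical):

import boto3

s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8000',
                  aws_access_key_id='ACCESS_KEY',
                  aws_secret_access_key='SECRET_KEY')

# Tag a large video object so a lifecycle rule can later expire or
# transition everything carrying this tag, independent of key prefix.
s3.put_object_tagging(
    Bucket='videos',
    Key='raw/recording.mp4',
    Tagging={'TagSet': [{'Key': 'tier', 'Value': 'cold'}]},
)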
>
> Lifecycle filters based on object tags could be a good alternative to 
> prefixes, and should be pretty easy to implement.

Actually, current master/mimic already has support for lifecycle filters
based on object tags, though reading the tags is expensive and is only done
when a lifecycle rule actually has a tag filter configured.
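
A tag-based filter is expressed through the standard S3 lifecycle API; here is a minimal boto3 sketch against an RGW endpoint (names and values are placeholders):

import boto3

s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8000',
                  aws_access_key_id='ACCESS_KEY',
                  aws_secret_access_key='SECRET_KEY')

# Expire, after 30 days, only the objects tagged tier=cold,
# regardless of their key prefix.
s3.put_bucket_lifecycle_configuration(
    Bucket='videos',
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'expire-cold-objects',
            'Status': 'Enabled',
            'Filter': {'Tag': {'Key': 'tier', 'Value': 'cold'}},
            'Expiration': {'Days': 30},
        }],
    },
)
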
>>>> For the migration, there can be a window similar to scrub, which the user
>>>> can specify or modify. The criteria for moving the data would come from
>>>> policies set on the bucket. Policies would have to specify what we want to
>>>> do, such as moving to a different tier or a different cluster, along with
>>>> some associated values. This might require passing some additional
>>>> headers (specific to this) to the policy engine, with decisions taken
>>>> based on them.
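
For the window, RGW already has the rgw_lifecycle_work_time option, which confines lifecycle processing to a configurable time range. For expressing the tier move itself, the S3 lifecycle grammar has Transition rules; below is a forward-looking boto3 sketch, assuming a backend that understands a hypothetical 'COLD' storage class (which Luminous-era RGW does not yet provide):

import boto3

s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8000',
                  aws_access_key_id='ACCESS_KEY',
                  aws_secret_access_key='SECRET_KEY')

# Hypothetical rule: after 60 days, transition tagged objects to a
# 'COLD' storage class backed by a cheaper pool; after 365, expire them.
s3.put_bucket_lifecycle_configuration(
    Bucket='videos',
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'tier-then-expire',
            'Status': 'Enabled',
            'Filter': {'Tag': {'Key': 'tier', 'Value': 'cold'}},
            'Transitions': [{'Days': 60, 'StorageClass': 'COLD'}],
            'Expiration': {'Days': 365},
        }],
    },
)
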
>>> I think it goes without saying, we'd be delighted to have your team
>>> participate in ongoing RGW qos development.  I think it would be a big
>>> win to attend some upstream standup meetings--there are as many as 4
>>> per week that you could join.  On the one hand, what you're aspiring
>>> to sounds like a reasonable fit with the (d)mclock work in progress,
>>> but if you see a need for alternate strategies, this is a great time
>>> to help define how that could fit together in the code.
>>>
>> Sure. Could you please add me (varadaraja.kari@xxxxxxxxxxxx),
>> Vishal <vishal.kanaujia@xxxxxxxxxxxx>,
>> and Abhishek <abhishek.varshney@xxxxxxxxxxxx> to the standups?
>>
>> Thanks,
>> Varada
>>> Matt
>>>
>>>> I haven't thought about integrating priorities into these tasks. That
>>>> would turn into a separate discussion on QoS, and it can be interpreted
>>>> differently depending on the use case.
>>>> From the RGW end, my thinking is to add QoS/throttles per user. A user
>>>> could be guaranteed x% of bandwidth/resources per unit of time in the
>>>> cluster, whether for GET, PUT, DELETE, CREATE BUCKET, or background
>>>> ops like tiering or migrating data. This is mostly about multi-tenancy
>>>> in RGW and guaranteeing something for each user.
>>>> But when we bring the discussion to the OSD, it might mean guaranteeing
>>>> that a certain number of ops (including recovery and user IO) always
>>>> complete within a given time, even in a degraded state. Again, this is
>>>> just my thinking. There are already some implementations in progress
>>>> based on dmclock, but I haven't tested them yet.
>>>>
>>>> Varada
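
For context on the dmclock piece: each client is described by a (reservation, weight, limit) triple. Below is a toy Python sketch of that idea only, to illustrate how tenant traffic and background tiering work could share capacity; it is not Ceph's implementation and the numbers are made up:

from dataclasses import dataclass

@dataclass
class QoSSpec:
    reservation: float  # minimum IOPS guaranteed to the client
    weight: float       # share of spare capacity beyond reservations
    limit: float        # hard cap on IOPS for the client

# Illustrative per-client specs: tenant traffic vs. background
# tiering/migration work competing for the same cluster capacity.
specs = {
    'tenant-a': QoSSpec(reservation=100, weight=3, limit=500),
    'tiering':  QoSSpec(reservation=20,  weight=1, limit=100),
}

def share(capacity_iops: float) -> dict:
    """Toy split of capacity: reservations first, then the remainder
    divided by weight, clamped to each client's limit."""
    out = {name: s.reservation for name, s in specs.items()}
    spare = capacity_iops - sum(out.values())
    total_w = sum(s.weight for s in specs.values())
    for name, s in specs.items():
        out[name] = min(s.limit, out[name] + spare * s.weight / total_w)
    return out

print(share(1000))  # -> {'tenant-a': 500, 'tiering': 100} after clamping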
>>>>
>>>> On Tue, Apr 3, 2018 at 11:51 AM, nagarrajan raghunathan
>>>> <nagu.raghu99@xxxxxxxxx> wrote:
>>>>> Hi,
>>>>>      For example, if I have a cluster with video files, and the cluster has
>>>>> continuous reads and writes: when we apply the policy, will it
>>>>> apply to each object or to a group of objects? Also, when would the
>>>>> migration happen, i.e. during a user-defined maintenance window or at
>>>>> frequent intervals? And would it be necessary to associate a priority with
>>>>> tiering based on object hits?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> On Tue, Apr 3, 2018 at 10:26 AM, Varada Kari (System Engineer)
>>>>> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>>>>> Sure. I was wondering whether this could be simplified using the existing
>>>>>> functionality in rados. But I agree: if we can write a better policy
>>>>>> engine and use the rados constructs to achieve the tiering, that would be
>>>>>> the ideal approach.
>>>>>>
>>>>>> Varada
>>>>>>
>>>>>> On Tue, Apr 3, 2018 at 9:38 AM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>>>>>>> I find it strange to be arguing for worse is better, but
>>>>>>>
>>>>>>> On Mon, Apr 2, 2018 at 11:34 PM, Varada Kari (System Engineer)
>>>>>>> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>>>>>>> Yes, for internal data movement across pools. I am not too particular
>>>>>>>> about using the current implementation; if tiering V2 solves this better,
>>>>>>>> I will be interested in using it.
>>>>>>>> The current problem is transferring object/bucket lifecycle policies
>>>>>>>> to rados so it can move the data around.
>>>>>>> The problem is simplified when RGW moves the data around within as
>>>>>>> well as across clusters.  As you note below...
>>>>>>>
>>>>>>>> I am not sure if this needs a separate policy engine at the RGW layer
>>>>>>>> to transcode these policies into tiering ops that move the data to a
>>>>>>>> different pool.
>>>>>>>> We would also have to track/indicate that an object has moved to a
>>>>>>>> different pool, and either bring it back or do a proxy read.
>>>>>>>> I am thinking about this mostly from the angle of object lifecycle
>>>>>>>> management in RGW.
>>>>>>> You want to support this anyway.
>>>>>>>
>>>>>>>>> Especially since you're discussing moving data across clusters, and
>>>>>>>>> RGW is already maintaining a number of indexes and things (eg, head
>>>>>>>>> objects), I think it's probably best to have RGW maintain metadata
>>>>>>>>> about the "real" location of uploaded objects.
>>>>>>>>> -Greg
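
One way to experiment with that bookkeeping from the client side is user-defined metadata on the head object; everything below (the metadata key, its value, the names) is hypothetical and not an existing RGW mechanism, just a sketch of the idea:

import boto3

s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8000',
                  aws_access_key_id='ACCESS_KEY',
                  aws_secret_access_key='SECRET_KEY')

# Hypothetical bookkeeping: record where the object's data really lives
# as user metadata on the (small) head object, so a reader can decide
# whether to proxy the read or pull the data back first.
s3.copy_object(
    Bucket='videos',
    Key='raw/recording.mp4',
    CopySource={'Bucket': 'videos', 'Key': 'raw/recording.mp4'},
    Metadata={'tier-location': 'cold-pool'},
    MetadataDirective='REPLACE',
)

head = s3.head_object(Bucket='videos', Key='raw/recording.mp4')
print(head['Metadata'].get('tier-location'))  # -> 'cold-pool'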
>>>>>>>>>
>>>>>>>> As one more policy on the object, we could archive the object to
>>>>>>>> a different cluster. Here we don't want to overload rados, but instead use
>>>>>>>> RGW cloud sync or multisite to sync the data to a different cluster.
>>>>>>>> Once we start integrating bucket/object policies with lifecycle
>>>>>>>> management and tiering, it will be interesting to explore how long an
>>>>>>>> object should stay in the same pool versus a different pool or a
>>>>>>>> different cluster.
>>>>>>>> Varada
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Matt Benjamin
>>>>>>> Red Hat, Inc.
>>>>>>> 315 West Huron Street, Suite 140A
>>>>>>> Ann Arbor, Michigan 48103
>>>>>>>
>>>>>>> http://www.redhat.com/en/technologies/storage
>>>>>>>
>>>>>>> tel.  734-821-5101
>>>>>>> fax.  734-769-8938
>>>>>>> cel.  734-216-5309
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Nagarrajan Raghunathan
>>>>>
>>>>>
>>>>>
>>>
>>>
>>> --
>>>
>>> Matt Benjamin
>>> Red Hat, Inc.
>>> 315 West Huron Street, Suite 140A
>>> Ann Arbor, Michigan 48103
>>>
>>> http://www.redhat.com/en/technologies/storage
>>>
>>> tel.  734-821-5101
>>> fax.  734-769-8938
>>> cel.  734-216-5309
>
>


