Re: Policy based object tiering in RGW

On Wed, Apr 4, 2018 at 5:31 PM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
> Hi Folks,
>
> inline
>
> On Tue, Apr 3, 2018 at 3:39 AM, Varada Kari (System Engineer)
> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>> The granularity of a policy is the bucket (a group of objects). S3
>> supports user policies and bucket policies; Ceph supports a subset of
>> bucket policies in Luminous. If we can carry additional headers per
>> object, maybe we can handle per-object policies as well, but I am not
>> sure how much work that would be.
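>>
>> For reference, a minimal sketch (in boto3) of setting one of the
>> bucket policies Luminous understands a subset of; the endpoint,
>> credentials and user name below are made up:
>>
>>   import json
>>   import boto3
>>
>>   # Hypothetical RGW endpoint and credentials.
>>   s3 = boto3.client(
>>       "s3",
>>       endpoint_url="http://rgw.example.com:8080",
>>       aws_access_key_id="ACCESS",
>>       aws_secret_access_key="SECRET",
>>   )
>>
>>   # Grant another RGW user read access to every object in "videos".
>>   policy = {
>>       "Version": "2012-10-17",
>>       "Statement": [{
>>           "Effect": "Allow",
>>           "Principal": {"AWS": ["arn:aws:iam:::user/other-user"]},
>>           "Action": ["s3:GetObject"],
>>           "Resource": ["arn:aws:s3:::videos/*"],
>>       }],
>>   }
>>   s3.put_bucket_policy(Bucket="videos", Policy=json.dumps(policy))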
>
> User policy is being worked on.  Expect PR(s) soon.  I don't think AWS
> defines this sort of object policy, but if you have a well-formed
> proposal, we seem to have a good foundation for implementing new
> grammar and actions.
>
Yeah, AWS doesn't have object policies. I am mostly coming from the
case where we have a big object (maybe a video, etc.) and want to
associate a policy with that object, to delete it or move it out once
the expiration policy kicks in. I am mostly thinking about a
non-prefix-based implementation, working around the prefix limitation
by adding appropriate headers; we might have to do additional work to
read them and make a decision. But I don't have a well-formed document
for it right now; we are still working on it.
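
As a rough sketch of the idea (the header names below are
hypothetical; nothing in RGW implements them today), the per-object
policy could ride along as ordinary S3 user metadata:

    import boto3

    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

    # Upload a big object with hypothetical per-object policy headers.
    with open("movie.mp4", "rb") as f:
        s3.put_object(
            Bucket="videos",
            Key="movie.mp4",
            Body=f,
            Metadata={"expire-after-days": "30",
                      "tier-target": "cold-pool"},
        )

    # A policy engine would later read the headers back and decide
    # whether to expire the object or move it to another tier.
    meta = s3.head_object(Bucket="videos", Key="movie.mp4")["Metadata"]
    print(meta.get("expire-after-days"), meta.get("tier-target"))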

>>
>> For the migration, there can be a window like scrub, which can be
>> user-specified/modified. The criteria for moving the data would come
>> from the policies set on the bucket. A policy has to specify what you
>> want to do, like moving to a different tier or a different cluster,
>> plus some associated values. This might require passing some
>> additional headers (specific to this) to the policy engine, with
>> decisions taken based on them.
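>>
>> Something like the standard S3 lifecycle grammar could carry this; a
>> sketch (the "COLD" storage class, and how it maps onto a pool or a
>> remote cluster, is hypothetical):
>>
>>   import boto3
>>
>>   s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")
>>
>>   # "Move objects to a colder tier after 30 days, expire after a year."
>>   s3.put_bucket_lifecycle_configuration(
>>       Bucket="videos",
>>       LifecycleConfiguration={
>>           "Rules": [{
>>               "ID": "tier-out-old-videos",
>>               "Filter": {"Prefix": ""},
>>               "Status": "Enabled",
>>               "Transitions": [{"Days": 30, "StorageClass": "COLD"}],
>>               "Expiration": {"Days": 365},
>>           }],
>>       },
>>   )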
>
> I think it goes without saying, we'd be delighted to have your team
> participate in ongoing RGW QoS development.  I think it would be a big
> win to attend some upstream standup meetings--there are as many as 4
> per week that you could join.  On the one hand, what you're aspiring
> to sounds like a reasonable fit with the (d)mclock work in progress,
> but if you see a need for alternate strategies, this is a great time
> to help define how that could fit together in the code.
>
Sure. Could you please add me (varadaraja.kari@xxxxxxxxxxxx), Vishal
<vishal.kanaujia@xxxxxxxxxxxx> and Abhishek
<abhishek.varshney@xxxxxxxxxxxx> to the standups?

Thanks,
Varada
> Matt
>
>>
>> I haven't thought about attaching priorities to these tasks. That
>> probably belongs in a separate discussion on QoS, and it can be
>> interpreted differently per use case.
>> From the RGW end, my thinking is to add QoS/throttles per user. A
>> user can be guaranteed x% of the cluster's bandwidth/resources per
>> unit of time, whether for GET, PUT, DELETE, CREATE BUCKET, or any
>> background ops like tiering or migrating the data. This is mostly
>> about multi-tenancy in RGW and guaranteeing something for each user.
>> But when we bring the discussion to the OSD, it might mean
>> guaranteeing that a certain number of ops (including recovery and
>> user IO) always complete within a given time, even in a degraded
>> state. Again, this is just my thinking. There are some
>> implementations already in progress based on dmclock, but I haven't
>> tested them yet.
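>>
>> To make the throttling idea concrete, a toy per-user token bucket
>> (real work would presumably build on dmclock rather than anything
>> like this):
>>
>>   import time
>>
>>   class UserThrottle:
>>       """Toy token bucket: a user gets `rate` ops/sec, burst `burst`."""
>>
>>       def __init__(self, rate: float, burst: float):
>>           self.rate, self.burst = rate, burst
>>           self.tokens, self.last = burst, time.monotonic()
>>
>>       def allow(self) -> bool:
>>           # Refill in proportion to elapsed time, capped at the burst.
>>           now = time.monotonic()
>>           self.tokens = min(self.burst,
>>                             self.tokens + (now - self.last) * self.rate)
>>           self.last = now
>>           if self.tokens >= 1.0:
>>               self.tokens -= 1.0
>>               return True   # admit the op (GET, PUT, tiering, ...)
>>           return False      # reject or queue it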
>>
>> Varada
>>
>> On Tue, Apr 3, 2018 at 11:51 AM, nagarrajan raghunathan
>> <nagu.raghu99@xxxxxxxxx> wrote:
>>> Hi,
>>>     For example, suppose I have a cluster with video files, and the
>>> cluster has continuous reads and writes. When we apply the policy,
>>> will it apply to each object or to a group of objects? Also, when
>>> would the migration happen, i.e. during a user-defined maintenance
>>> window or at frequent intervals? Would it be required to associate a
>>> priority with tiering based on the objects' hits?
>>>
>>> Thanks
>>>
>>>
>>> On Tue, Apr 3, 2018 at 10:26 AM, Varada Kari (System Engineer)
>>> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>>>
>>>> Sure. I was wondering whether this could be simplified using the
>>>> existing functionality in RADOS. But I agree: writing a better
>>>> policy engine and using the RADOS constructs to achieve the tiering
>>>> would be the ideal approach.
>>>>
>>>> Varada
>>>>
>>>> On Tue, Apr 3, 2018 at 9:38 AM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>>>> > I find it strange to be arguing for worse is better, but
>>>> >
>>>> > On Mon, Apr 2, 2018 at 11:34 PM, Varada Kari (System Engineer)
>>>> > <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>>> >> Yes, for internal data movement across pools. I am not too
>>>> >> particular about using the current implementation; if tiering v2
>>>> >> solves this better, I will be interested in using it. The current
>>>> >> problem is transferring object/bucket lifecycle policies to RADOS
>>>> >> for moving the data around.
>>>> >
>>>> > The problem is simplified when RGW moves the data around within as
>>>> > well as across clusters.  As you note below...
>>>> >
>>>> >> I am not sure whether this needs a different policy engine at the
>>>> >> RGW layer to transcode these policies into tiering ops that move
>>>> >> the data to a different pool. We would also have to track/indicate
>>>> >> that an object has moved to a different pool, and then either
>>>> >> bring it back or do a proxy read. I am thinking mostly from the
>>>> >> perspective of object lifecycle management in RGW.
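>>>> >>
>>>> >> Roughly, the head object could carry a location record like this
>>>> >> sketch (all names hypothetical):
>>>> >>
>>>> >>   from dataclasses import dataclass
>>>> >>   from typing import Optional
>>>> >>
>>>> >>   @dataclass
>>>> >>   class ObjLocation:
>>>> >>       """Hypothetical record kept alongside the head object."""
>>>> >>       pool: str                      # pool holding the data now
>>>> >>       cluster: Optional[str] = None  # set once archived remotely
>>>> >>
>>>> >>   def choose_read_path(loc, local_pool):
>>>> >>       # Local read, cross-pool proxy read, or cross-cluster
>>>> >>       # proxy read, based on the recorded location.
>>>> >>       if loc.cluster is not None:
>>>> >>           return "proxy read from cluster " + loc.cluster
>>>> >>       if loc.pool != local_pool:
>>>> >>           return "proxy read from pool " + loc.pool
>>>> >>       return "local read"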
>>>> >>
>>>> >
>>>> > You want to support this anyway.
>>>> >
>>>> >>>
>>>> >>> Especially since you're discussing moving data across clusters, and
>>>> >>> RGW is already maintaining a number of indexes and things (e.g., head
>>>> >>> objects), I think it's probably best to have RGW maintain metadata
>>>> >>> about the "real" location of uploaded objects.
>>>> >>> -Greg
>>>> >>>
>>>> >> As one more policy on the object, we can support archiving the
>>>> >> object to a different cluster. Here we don't want to overload
>>>> >> RADOS, but instead use RGW cloud sync or multisite to sync the
>>>> >> data to a different cluster. When we start integrating
>>>> >> bucket/object policies with lifecycle management and tiering, it
>>>> >> will be interesting to explore how long an object should stay in
>>>> >> the same pool versus a different pool or a different cluster.
>>>> >> Varada
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Nagarrajan Raghunathan
>>>
>>>
>>>
>
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel.  734-821-5101
> fax.  734-769-8938
> cel.  734-216-5309


