Re: Policy based object tiering in RGW

Hi Folks,

inline

On Tue, Apr 3, 2018 at 3:39 AM, Varada Kari (System Engineer)
<varadaraja.kari@xxxxxxxxxxxx> wrote:
> Granularity of policy is the bucket (a group of objects). S3 supports
> user policies and bucket policies; Ceph supports a subset of bucket
> policies in Luminous. If we can have additional headers per object,
> maybe we can handle per-object policies as well, but I am not sure how
> much work that is.

User policy is being worked on.  Expect PR(s) soon.  I don't think AWS
defines this sort of object policy, but if you have a well-formed
proposal, we seem to have a good foundation for implementing new
grammar and actions.
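For reference, bucket-granularity tiering is usually expressed in S3
terms as a lifecycle transition rule.  A minimal sketch against an RGW
endpoint using boto3 follows; the endpoint, credentials, bucket name,
prefix, and the 30-day/StorageClass values are placeholders, and
whether a given StorageClass target is honoured depends on the RGW
release and its placement/storage-class configuration:

import boto3

# Hypothetical RGW endpoint and credentials -- placeholders only.
s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:8000',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

# Bucket-level lifecycle rule: transition objects under "videos/"
# to a colder storage class after 30 days.
s3.put_bucket_lifecycle_configuration(
    Bucket='media',
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'tier-videos-after-30d',
            'Filter': {'Prefix': 'videos/'},
            'Status': 'Enabled',
            'Transitions': [{'Days': 30, 'StorageClass': 'STANDARD_IA'}],
        }],
    },
)

Anything finer-grained than the per-rule prefix/tag filter would need
the kind of per-object headers discussed above.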

>
> For the migration, there can be a window like scrub, which can be
> user-specified/modified. The criteria for moving the data would come
> from policies set on the bucket. Policies have to specify what we want
> to do, such as moving to a different tier or a different cluster, and
> some associated values. This might require passing some additional
> headers (specific to this) to the policy engine, with decisions taken
> based on them.

I think it goes without saying that we'd be delighted to have your team
participate in ongoing RGW QoS development.  It would be a big win to
attend some upstream standup meetings--there are as many as 4 per week
that you could join.  What you're aspiring to sounds like a reasonable
fit with the (d)mclock work in progress, but if you see a need for
alternate strategies, this is a great time to help define how that
could fit together in the code.

Matt

>
> I haven't thought about attaching priorities to these tasks. That is
> probably better treated as a separate discussion on QoS, and it can be
> interpreted differently depending on the use case.
> From the RGW end, my thinking is to add QoS/throttles per user. A user
> can be guaranteed to consume x% of bandwidth/resources per unit of
> time in the cluster, whether that is GET, PUT, DELETE, CREATE BUCKET,
> or any background op such as tiering or migrating the data. This is
> mostly about multi-tenancy in RGW and guaranteeing something for each
> user. But when we bring the discussion down to the OSD, it becomes
> guaranteeing that a certain number of ops (including recovery and user
> IO) always complete within a given time, even in a degraded state.
> Again, this is just my thinking. There are implementations already in
> progress based on dmclock, but I haven't tested them yet.
>
> Varada
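
To make the per-user guarantee above concrete, here is an illustrative
sketch only -- not dmclock and not anything in RGW today; the names and
numbers are invented.  A simple token bucket per user caps how much
bandwidth each tenant can consume per unit of time:

import time

class TokenBucket:
    """Illustrative per-user throttle: 'rate' bytes/sec, bursts up to 'burst'."""
    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True          # serve the GET/PUT now
        return False             # defer or queue the request

# One bucket per user/tenant; background ops (tiering, migration) could
# get their own bucket so they cannot starve client IO.
throttles = {'alice': TokenBucket(rate=50 * 2**20, burst=100 * 2**20)}

dmclock generalizes this with per-client reservation/weight/limit
parameters rather than a flat rate, which is what makes it attractive
for mixing client IO with recovery at the OSD.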
>
> On Tue, Apr 3, 2018 at 11:51 AM, nagarrajan raghunathan
> <nagu.raghu99@xxxxxxxxx> wrote:
>> Hi,
>>     For example, if I have a cluster with video files, and say the
>> cluster has continuous reads and writes: when we apply the policy,
>> will it apply to each object or to a group of objects? Also, when
>> would the migration happen, i.e. during a user-defined maintenance
>> window or at frequent intervals? Would it be required to associate a
>> priority with tiering based on object hits?
>>
>> Thanks
>>
>>
>> On Tue, Apr 3, 2018 at 10:26 AM, Varada Kari (System Engineer)
>> <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>>
>>> Sure. I was thinking about whether this could be simplified using
>>> the existing functionality in RADOS. But I agree: writing a better
>>> policy engine and using the RADOS constructs to achieve the tiering
>>> would be the ideal approach.
>>>
>>> Varada
>>>
>>> On Tue, Apr 3, 2018 at 9:38 AM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>>> > I find it strange to be arguing for worse is better, but
>>> >
>>> > On Mon, Apr 2, 2018 at 11:34 PM, Varada Kari (System Engineer)
>>> > <varadaraja.kari@xxxxxxxxxxxx> wrote:
>>> >> Yes, for internal data movement across pools. I am not too
>>> >> particular about using the current implementation; if tiering V2
>>> >> solves this better, I will be interested in using it.
>>> >> The current problem is transferring object/bucket lifecycle
>>> >> policies to RADOS for moving the data around.
>>> >
>>> > The problem is simplified when RGW moves the data around within as
>>> > well as across clusters.  As you note below...
>>> >
>>> >> I am not sure if this needs a separate policy engine at the RGW
>>> >> layer to transcode these policies into tiering ops that move the
>>> >> data to a different pool.
>>> >> And we would have to track/indicate that an object has moved to a
>>> >> different pool, and either bring it back or do a proxy read.
>>> >> I am thinking mostly of object lifecycle management from the RGW
>>> >> side.
>>> >>
>>> >
>>> > You want to support this anyway.
>>> >
>>> >>>
>>> >>> Especially since you're discussing moving data across clusters, and
>>> >>> RGW is already maintaining a number of indexes and things (eg, head
>>> >>> objects), I think it's probably best to have RGW maintain metadata
>>> >>> about the "real" location of uploaded objects.
>>> >>> -Greg
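
A hypothetical sketch of what that head-object metadata and read path
could look like (all names and fields are invented for illustration):
the head object records where the data really lives, and a read either
serves it directly, proxies it from the cold pool, or triggers a
restore from the remote cluster.

from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    HOT_POOL = 1       # default data pool
    COLD_POOL = 2      # colder pool in the same cluster
    REMOTE = 3         # archived to another cluster (cloud sync / multisite)

@dataclass
class HeadObject:
    bucket: str
    key: str
    tier: Tier
    location: str      # pool name or remote endpoint/bucket

def read(head, rados_read, proxy_read, schedule_restore):
    """Sketch of a tier-aware read path; the callables are placeholders."""
    if head.tier is Tier.HOT_POOL:
        return rados_read(head.location, head.key)
    if head.tier is Tier.COLD_POOL:
        # proxy read: serve from the cold pool without promoting
        return proxy_read(head.location, head.key)
    # archived remotely: kick off a restore and tell the client to retry
    schedule_restore(head)
    raise LookupError("object archived; restore scheduled")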
>>> >>>
>>> >> As one more policy on the object, we can have archiving of the
>>> >> object to a different cluster. Here we don't want to overload
>>> >> RADOS, but instead use RGW cloud sync or multisite to sync the
>>> >> data to a different cluster.
>>> >> When we start integrating bucket/object policies with lifecycle
>>> >> management and tiering, it will be interesting to explore how long
>>> >> an object should stay in the same pool versus a different pool or
>>> >> a different cluster.
>>> >> Varada
>>> >>>>
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Matt Benjamin
>>> > Red Hat, Inc.
>>> > 315 West Huron Street, Suite 140A
>>> > Ann Arbor, Michigan 48103
>>> >
>>> > http://www.redhat.com/en/technologies/storage
>>> >
>>> > tel.  734-821-5101
>>> > fax.  734-769-8938
>>> > cel.  734-216-5309
>>
>>
>>
>>
>> --
>> Regards,
>> Nagarrajan Raghunathan
>>
>>
>>



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309


