Re: Policy based object tiering in RGW

On 04/04/2018 09:19 AM, Varada Kari (System Engineer) wrote:
On Wed, Apr 4, 2018 at 5:31 PM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
Hi Folks,

inline

On Tue, Apr 3, 2018 at 3:39 AM, Varada Kari (System Engineer)
<varadaraja.kari@xxxxxxxxxxxx> wrote:
Granularity of policy is the bucket (a group of objects). S3 supports user
policies and bucket policies; Ceph supports a subset of bucket policies
in Luminous. If we can have additional headers per object, maybe we
can handle per-object policies also, but I am not sure how much work
that is.
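For reference, a minimal sketch of the kind of bucket policy Luminous
accepts, via boto3; the endpoint, bucket name, and user ARN below are
placeholders:

    import json
    import boto3

    # Placeholder endpoint/credentials; any RGW with bucket policy
    # support (Luminous+) should accept a policy of this shape.
    s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8000')

    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": ["arn:aws:iam:::user/reader"]},
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::videos/*"],
        }],
    }
    s3.put_bucket_policy(Bucket='videos', Policy=json.dumps(policy))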
User policy is being worked on.  Expect PR(s) soon.  I don't think AWS
defines this sort of object policy, but if you have a well-formed
proposal, we seem to have a good foundation for implementing new
grammar and actions.

Yeah, AWS doesn't have object policies. Where I'm coming from is: if we
have a big object (maybe a video, etc.), we can associate a policy with
the object to delete it or move it out once the expiration policy kicks
in. And I am mostly thinking about a non-prefix-based implementation,
getting around prefixes by adding appropriate headers; we might have to
do additional work to read them and make a decision. But I don't have a
well-formed document for it right now, we are still working on it.

Lifecycle filters based on object tags could be a good alternative to prefixes, and should be pretty easy to implement.
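As a sketch, this is what a tag-filtered rule looks like in the AWS S3
API via boto3; whether RGW's implementation would match this shape
exactly is an assumption here:

    import boto3

    s3 = boto3.client('s3', endpoint_url='http://rgw.example.com:8000')

    # Expire objects tagged class=archive after 30 days; the filter
    # matches on the object tag alone, with no prefix involved.
    s3.put_bucket_lifecycle_configuration(
        Bucket='videos',
        LifecycleConfiguration={
            'Rules': [{
                'ID': 'expire-archived',
                'Status': 'Enabled',
                'Filter': {'Tag': {'Key': 'class', 'Value': 'archive'}},
                'Expiration': {'Days': 30},
            }],
        },
    )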

For the migration, there can be a window like scrub's, which can be
user-specified/modified. The criteria for moving the data would come
from policies set on the bucket. Policies have to specify what we want
to do, like moving to a different tier or a different cluster, and some
associated values. This might require passing some additional headers
(specific to this) to the policy engine, with decisions taken based on
them.
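Sketching the shape such a policy might take, on top of the AWS
'Transitions' grammar; the storage-class name and any cross-cluster
target are hypothetical, since no such grammar exists in RGW yet:

    # Hypothetical tiering rule reusing the AWS 'Transitions' grammar.
    # 'COLD_POOL' is made up; a cross-cluster target would need an
    # RGW-specific extension to the rule schema.
    tier_rule = {
        'ID': 'tier-cold-videos',
        'Status': 'Enabled',
        'Filter': {'Tag': {'Key': 'class', 'Value': 'cold'}},
        'Transitions': [{'Days': 90, 'StorageClass': 'COLD_POOL'}],
    }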
I think it goes without saying that we'd be delighted to have your team
participate in ongoing RGW QoS development. I think it would be a big
win to attend some upstream standup meetings--there are as many as 4
per week that you could join. What you're aspiring to sounds like a
reasonable fit with the (d)mclock work in progress, but if you see a
need for alternate strategies, this is a great time to help define how
that could fit together in the code.

Sure. Could you please add me (varadaraja.kari@xxxxxxxxxxxx), Vishal
<vishal.kanaujia@xxxxxxxxxxxx>
and Abhishek <abhishek.varshney@xxxxxxxxxxxx> to the standups?

Thanks,
Varada
Matt

I haven't thought about attaching priorities to these tasks; that might
be better split out into a separate discussion on QoS, and it can be
interpreted differently per use case.
From the RGW end, my thinking is to add QoS/throttles per user. A user
can be guaranteed to consume x% of bandwidth/resources per unit of time
in the cluster, whether that is GET, PUT, DELETE, or CREATE BUCKET, or
any background ops like tiering or migrating the data. This is mostly
about multi-tenancy in RGW and guaranteeing something for each user.
But when we bring the discussion to the OSD, it becomes about
guaranteeing that a certain number of ops (including recovery and user
IO) always complete within a given time, even in a degraded state.
Again, this is my thinking. There are some implementations already in
progress based on dmclock, but I haven't tested them yet.
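For anyone unfamiliar with dmclock: each client gets a (reservation,
weight, limit) triple and every request is tagged against those rates.
A toy sketch of the tagging idea, not the actual ceph dmclock library:

    import time

    class MClockClient:
        # Toy mClock tagging: reservation = minimum rate, weight =
        # proportional share, limit = maximum rate. Illustrative only.
        def __init__(self, reservation, weight, limit):
            self.r, self.w, self.l = reservation, weight, limit
            self.r_tag = self.w_tag = self.l_tag = 0.0

        def tag_request(self):
            now = time.monotonic()
            # Each tag advances by the inverse of its rate; idle
            # clients snap forward to 'now' so they cannot bank credit.
            self.r_tag = max(self.r_tag + 1.0 / self.r, now)
            self.w_tag = max(self.w_tag + 1.0 / self.w, now)
            self.l_tag = max(self.l_tag + 1.0 / self.l, now)
            return (self.r_tag, self.w_tag, self.l_tag)

    # A scheduler first serves requests whose reservation tags are due,
    # then orders the rest by weight tag, skipping clients whose limit
    # tag is still in the future.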

Varada

On Tue, Apr 3, 2018 at 11:51 AM, nagarrajan raghunathan
<nagu.raghu99@xxxxxxxxx> wrote:
Hi,
     For example, if I have a cluster with video files, and say the
cluster has continuous reads and writes: when we apply the policy, will
it apply to each object or to a group of objects? Also, when would the
migration happen, i.e., during a user-defined maintenance window or at
frequent intervals? Would it be required to associate a priority with
tiering based on object hits?

Thanks


On Tue, Apr 3, 2018 at 10:26 AM, Varada Kari (System Engineer)
<varadaraja.kari@xxxxxxxxxxxx> wrote:
Sure. I was thinking about whether this could be simplified using the
existing functionality in rados. But I agree: writing a better policy
engine and using the rados constructs to achieve the tiering would be
the ideal thing to do.

Varada

On Tue, Apr 3, 2018 at 9:38 AM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
I find it strange to be arguing for "worse is better", but

On Mon, Apr 2, 2018 at 11:34 PM, Varada Kari (System Engineer)
<varadaraja.kari@xxxxxxxxxxxx> wrote:
Yes, for internal data movement across pools. I am not too particular
about using the current implementation; if tiering V2 solves this
better, I will be interested in using it.
The current problem is transferring object/bucket lifecycle policies
to rados for moving the data around.
The problem is simplified when RGW moves the data around within as
well as across clusters. As you note below...

I am not sure if this needs a separate policy engine at the RGW layer
to transcode these policies into tiering ops that move the data to a
different pool.
We also have to track/indicate that an object has moved to a different
pool, and then either bring it back or do a proxy read.
I am thinking mostly of object lifecycle management from the RGW side.

You want to support this anyway.

Especially since you're discussing moving data across clusters, and
RGW is already maintaining a number of indexes and things (e.g., head
objects), I think it's probably best to have RGW maintain metadata
about the "real" location of uploaded objects.
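To make that concrete, a hypothetical sketch of the kind of location
record a head object could carry once the tail has been tiered away;
every field name here is made up for illustration:

    # Hypothetical location record stored with the head object; none
    # of these field names come from RGW.
    moved_record = {
        'state': 'transitioned',
        'target': {
            'kind': 'pool',          # or 'cluster' for cross-site
            'pool': 'cold-ec-pool',
        },
        'transitioned_at': '2018-04-04T09:19:00Z',
        'read_mode': 'proxy',        # proxy the read vs. promote back
    }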
-Greg

As one more policy on the object, we can have archiving of the object
to a different cluster. Here we don't want to overload rados, but
rather use RGW cloud sync or multisite to sync the data to a different
cluster.
When we start integrating bucket/object policies into lifecycle
management and tiering, it will be interesting to explore how long to
keep an object in the same pool versus a different pool or a different
cluster.
Varada





--
Regards,
Nagarrajan Raghunathan





--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309