Re: How Rados gateway stripes and combines object data

Prasad Bhalerao <prasadbhalerao1983@xxxxxxxxx> · Thu, 23 Nov 2017 14:12:11 +0530

Hello Wido,

Thank you for the response.
Sorry for not framing the questions with the correct words. By "ceph" I meant the RADOS Gateway; I used the word "ceph" generically. My apologies!

I understand that RGW stripes the object as per the stripe size, but why does it then divide each individual stripe into smaller chunks as per the configured chunk size (max_chunk_size)? What benefit do we get by doing this?
I mean, is it not enough just to split the object into stripes? Why break the stripes into even smaller chunks?

If an object is divided into a series of smaller units (for performance benefit), how does RGW return the complete object when a GET request is made?
How does RGW know which chunks belong to the requested object, and how does it combine them?

Where does it store the IDs/numbers of the subsequent stripes so that it can form a complete object from its smaller chunks?

Thanks,
Prasad

On Thu, Nov 23, 2017 at 1:45 PM, Wido den Hollander <wido@xxxxxxxx> wrote:

> On 23 November 2017 at 7:43, Prasad Bhalerao <prasadbhalerao1983@xxxxxxxxx> wrote:

>
> Hi,
>
> I am new to CEPH and I have some questions regarding it.
>
> Could you please help out?
>

Yes, but keep in mind a lot of these questions have been answered previously.

> What is the default value of rgw_stripe_size and max_chunk_size?
>

RGW stripes in 4MB RADOS objects.
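
As a rough illustration of what 4 MB striping means for the number of RADOS objects, here is a minimal Python sketch. It assumes a fixed 4 MiB stripe size, as stated above. `stripe_layout` is a hypothetical helper written for this example, not RGW code, and it ignores RGW-specific details of how the pieces are named and stored.

```python
MIB = 1024 * 1024


def stripe_layout(object_size, stripe_size=4 * MIB):
    """Return a list of (index, offset, length) tuples, one per stripe-sized piece.

    Hypothetical helper: it only does the arithmetic of cutting a byte range
    into fixed-size pieces; the last piece may be shorter than stripe_size.
    """
    if object_size < 0 or stripe_size <= 0:
        raise ValueError("object_size must be >= 0 and stripe_size must be > 0")
    pieces = []
    offset = 0
    index = 0
    while offset < object_size:
        length = min(stripe_size, object_size - offset)
        pieces.append((index, offset, length))
        offset += length
        index += 1
    return pieces


if __name__ == "__main__":
    for index, offset, length in stripe_layout(10 * MIB):
        print(f"piece {index}: offset={offset} length={length}")
```

Under this assumption a 10 MiB upload maps to three pieces (4 MiB, 4 MiB, 2 MiB), while a 100 KB object fits in a single piece, so there is nothing to recombine for small objects.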

> What is the default size of an object stored in the Ceph Storage Cluster? Is it
> dependent on stripe size or chunk size?
>

There is none; the application on top of RADOS decides how large the stripes are.

> What is a bucket with respect to RGW? How should one decide the name of a
> bucket? Does creating too many buckets (a different bucket per request)
> create a performance problem?
>

Please refer to the Amazon S3 model; that will explain what buckets are.

You can create millions of buckets without having a performance impact.

> Why does CEPH first stripe the data into a series of stripes and then again
> divide these stripes into smaller chunks? Is striping data into stripes not
> enough?
>

It doesn't. RGW stripes over RADOS; RADOS itself doesn't stripe data.

> If an object is divided into a series of smaller units (for performance
> benefit), how does CEPH return the complete object when a GET request is
> made?
>
> Where does it store the ids/numbers of subsequent stripes to form a
> complete object from its smaller chunks?
>
> Does striping a small object (e.g. 100 KB to 4 MB) create a performance
> overhead, as CEPH has to read all the chunks related to this object and then
> combine them into one single object before it returns it? Isn't it too much
> optimization for handling smaller objects?
>
> Does librados (the Ceph native APIs) also perform data striping if used for
> storing data into a CEPH cluster?
>

No, RADOS itself does not stripe objects.
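
Because RADOS does not stripe, an application that talks to librados directly and wants striping has to split and reassemble the data itself. The sketch below shows the general idea in Python, with a plain dict standing in for a RADOS pool. The naming scheme (`<name>.<index>`), the manifest record and both function names are assumptions made for this example; it is not how RGW stores its metadata.

```python
def put_striped(store, name, data, stripe_size):
    """Split data into stripe_size pieces and store them under computed names.

    `store` is any dict-like object; here it stands in for a RADOS pool.
    A small manifest records what is needed to find the pieces again.
    """
    if stripe_size <= 0:
        raise ValueError("stripe_size must be > 0")
    count = 0
    for offset in range(0, len(data), stripe_size):
        store[f"{name}.{count:08d}"] = data[offset:offset + stripe_size]
        count += 1
    store[f"{name}.manifest"] = {
        "size": len(data),
        "stripe_size": stripe_size,
        "count": count,
    }


def get_striped(store, name):
    """Read the manifest, fetch every piece in order and join them."""
    manifest = store[f"{name}.manifest"]
    parts = [store[f"{name}.{i:08d}"] for i in range(manifest["count"])]
    data = b"".join(parts)
    if len(data) != manifest["size"]:
        raise ValueError("reassembled size does not match manifest")
    return data


if __name__ == "__main__":
    pool = {}  # hypothetical in-memory stand-in for a pool
    payload = bytes(range(256)) * 10
    put_striped(pool, "obj", payload, stripe_size=1000)
    assert get_striped(pool, "obj") == payload
```

The point of the manifest is that piece names do not need to be listed one by one: a base name plus a count (or a size and stripe size) is enough to compute them on read.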

Wido

> Thanks,
> Prasad
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
