Re: RGW: Implement S3 storage class feature

Yehuda Sadeh-Weinraub <ysadehwe@xxxxxxxxxx> · Wed, 21 Jun 2017 09:37:59 -0700

On Wed, Jun 21, 2017 at 8:50 AM, Daniel Gryniewicz <dang@xxxxxxxxxx> wrote:
> On 06/21/2017 11:14 AM, Yehuda Sadeh-Weinraub wrote:
>>
>> On Wed, Jun 21, 2017 at 7:46 AM, Daniel Gryniewicz <dang@xxxxxxxxxx> wrote:
>>>
>>> On 06/21/2017 10:04 AM, Matt Benjamin wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> Looks very coherent.
>>>>
>>>> My main question is about...
>>>>
>>>> ----- Original Message -----
>>>>>
>>>>>
>>>>> From: "Jiaying Ren" <mikulely@xxxxxxxxx>
>>>>> To: "Yehuda Sadeh-Weinraub" <ysadehwe@xxxxxxxxxx>
>>>>> Cc: "ceph-devel" <ceph-devel@xxxxxxxxxxxxxxx>
>>>>> Sent: Wednesday, June 21, 2017 7:39:24 AM
>>>>> Subject: RGW: Implement S3 storage class feature
>>>>>
>>>>
>>>>>
>>>>> * Todo List
>>>>>
>>>>> + the head of rgw-object should only contain the metadata of
>>>>>   rgw-object; the first chunk of rgw-object data should be stored in
>>>>>   the same pool as the tail of rgw-object
>>>>
>>>>
>>>>
>>>> Is this always desirable?
>>>>
>>>
>>> Well, unless the head pool happens to have the correct storage class, it's
>>> necessary.  And I'd guess that verification of this is complicated, although
>>> maybe not.
>>>
>>> Maybe we can use the head pool if it has >= the correct storage class?
>>>
>> My original thinking was that when we reassign an object to a new
>> placement, we only touch its tail, which is incompatible with keeping
>> data in the head.
>> However, thinking about it some more I don't see why we need to have
>> this limitation, so it's probably possible to keep the data in the
>> head at first, and later modify the object so that the data is in the
>> tail (the object's head will need to be rewritten anyway because we
>> modify the manifest).
>> I think that the decision whether we keep data in the head could be a
>> property of the zone. In any case, once an object is created changing
>> this property will only affect newly created objects, and old objects
>> could still be read correctly. Having data in the head is an
>> optimization that supposedly reduces small-object latency, and I
>> still think it's useful in a mixed-pool situation. The thought is
>> that the bulk of the data will be at the tail anyway. However, we
>> recently changed the default head size from 512k to 4M, so this might
>> not be true any more. Anyhow, I favour having this as a configurable
>> option (which should be simple to add).
>>
>> Yehuda
>>
>
>
> I would be strongly against keeping data in the head when the head is in a
> lower-level storage class.  That means that the entire object is violating
> the constraints of the storage class.
>
> Of course, having the head in a lower storage class (data or not) is
> probably a violation.  Maybe we'd have to require that all heads go in the
> highest storage class.
>

I'd keep it simple. Note that all objects' heads in a bucket need to
reside in the same pool, otherwise we won't be able to locate them
(unless we start searching).
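
To illustrate the constraint, here is a minimal sketch in Python. All of
the names in it (BucketPlacement, keep_data_in_head, the pool names) are
hypothetical and are not actual RGW types or config options. It only
shows that the head location depends on the bucket and key alone, while
the pool for the first data chunk depends on the proposed configurable.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


# Hypothetical model for illustration only; not RGW's real data structures.
@dataclass
class BucketPlacement:
    head_pool: str                  # single pool holding every head in the bucket
    tail_pools: Dict[str, str]      # storage class -> pool for tail data
    keep_data_in_head: bool = True  # the proposed configurable


def head_location(bucket: BucketPlacement, key: str) -> Tuple[str, str]:
    """The head is found from the bucket and key alone, with no searching."""
    return (bucket.head_pool, key)


def first_chunk_pool(bucket: BucketPlacement, storage_class: str) -> str:
    """Pool that receives the first data chunk of a newly written object."""
    if bucket.keep_data_in_head:
        return bucket.head_pool
    return bucket.tail_pools[storage_class]
```

With keep_data_in_head turned off, the head carries only metadata and the
manifest, and the first chunk lands in the tail pool of the requested
storage class.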

Yehuda