Re: [GSoC] Queries regarding the Project

Spandan Kumar Sahu <spandankumarsahu@xxxxxxxxx> · Mon, 27 Mar 2017 21:15:17 +0530

Thanks for the explanation. That cleared up my confusion.

On Mon, Mar 27, 2017 at 7:17 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 27 Mar 2017, Spandan Kumar Sahu wrote:
>> When the CRUSH algorithm detects that an object cannot be placed in the PG
>> that was drawn, does CRUSH not draw another PG, and keep drawing until it
>> finds one that can be written to?
>
> Nope.  The assignment of objects to PGs is a simple deterministic hash
> function and happens before CRUSH.  (CRUSH then maps pgids to OSDs.)
>
> If we did something like you suggest, there would be many locations
> where an object could exist.  You'd have to look in multiple locations to
> conclude it doesn't exist, and you'd have to deal with the range of race
> conditions that result from, say, OSDs transitioning from full to
> non-full or back while you're deciding where to write or whether an
> object exists.
>
> sage
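
The two-step placement described above can be sketched roughly as follows.
This is a toy model, not Ceph's code: SHA-256 stands in for the real object
hash, and a simple highest-score ranking stands in for CRUSH. The names
object_to_pg, pg_to_osds and locate are hypothetical, chosen only for
illustration.

```python
import hashlib


def _h(*parts):
    """Stable 64-bit hash of the given parts (stand-in, not Ceph's hash)."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def object_to_pg(object_name, pg_num):
    """Step 1: object name -> PG id. Depends only on the name and pg_num."""
    return _h("obj", object_name) % pg_num


def pg_to_osds(pg_id, osd_ids, replicas=3):
    """Step 2: PG id -> ordered list of OSDs (toy replacement for CRUSH).

    Each OSD gets a pseudo-random score for this PG; the top `replicas`
    scores win. How full an OSD is plays no part in the choice.
    """
    ranked = sorted(osd_ids, key=lambda osd: _h("pg", pg_id, osd), reverse=True)
    return ranked[:replicas]


def locate(object_name, pg_num, osd_ids, replicas=3):
    """Return (pg_id, osds) for an object; there is exactly one answer."""
    pg = object_to_pg(object_name, pg_num)
    return pg, pg_to_osds(pg, osd_ids, replicas)
```

Because locate() takes no input about free space, every client computes the
same single location for a given object, so there is never a second place to
look and no race between "where do I write" and "does it exist".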
>
>
>
>>
>> On Mon, Mar 27, 2017 at 2:34 PM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>> > On Fri, Mar 24, 2017 at 8:08 PM, Spandan Kumar Sahu
>> > <spandankumarsahu@xxxxxxxxx> wrote:
>> >> I understand that we can't write to objects that belong to that
>> >> particular PG (the one with at least one full OSD). But a storage
>> >> pool can have multiple PGs, and some of them must have only non-full
>> >> OSDs. Through those PGs, we can write to the OSDs which are not full.
>> >
>> > But we cannot impose the restriction on the client that only a subset
>> > of the PGs in the given pool can be written to.
>> >
>> >>
>> >> Did I understand it correctly?
>> >>
>> >>
>> >> On Fri, Mar 24, 2017 at 1:01 PM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>> >>> Hi Spandan,
>> >>>
>> >>> Please do not email me privately; use the public mailing list instead,
>> >>> which allows other developers to help you if I am unable to do
>> >>> so. It also means that you can start interacting with the rest of the
>> >>> community instead of only me (barely useful).
>> >>>
>> >>> On Fri, Mar 24, 2017 at 2:38 PM, Spandan Kumar Sahu
>> >>> <spandankumarsahu@xxxxxxxxx> wrote:
>> >>>> Hi
>> >>>>
>> >>>> I couldn't figure out why this is happening:
>> >>>>
>> >>>> "...Because once any of the storage device assigned to a storage pool is
>> >>>> full, the whole pool is not writeable anymore, even there is abundant space
>> >>>> in other devices."
>> >>>> -- Ceph GSoC Project Ideas (Smarter reweight-by-utilisation)
>> >>>>
>> >>>> I went through this[1] paper on CRUSH, and according to what I understand,
>> >>>> CRUSH pseudo-randomly chooses a device based on weights which can reflect
>> >>>> various parameters like the amount of space available.
>> >>>
>> >>> CRUSH is a variant of consistent hashing. Ceph cannot automatically
>> >>> choose *another* OSD which is not chosen by CRUSH, even if that OSD is
>> >>> not full and has abundant space.
>> >>>
>> >>>>
>> >>>> What I don't understand is how this stops a pool with abundant space on
>> >>>> other devices from getting selected if one of its devices is full. Sure,
>> >>>> the chances of getting selected might decrease if one device is full, but
>> >>>> how does it completely prevent writing to the pool?
>> >>>
>> >>> If a PG is served by three OSDs and any of them is full, how can we
>> >>> continue creating or writing objects that belong to that PG?
>> >>>
>> >>>
>> >>> --
>> >>> Regards
>> >>> Kefu Chai
>> >>
>> >>
>> >>
>> >> --
>> >> Spandan Kumar Sahu
>> >> IIT Kharagpur
>> >
>> >
>> >
>> > --
>> > Regards
>> > Kefu Chai
>>
>>
>>
>> --
>> Spandan Kumar Sahu
>> IIT Kharagpur
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>

-- 
Spandan Kumar Sahu
IIT Kharagpur