Thanks for the explanation. That cleared up the confusion.

On Mon, Mar 27, 2017 at 7:17 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 27 Mar 2017, Spandan Kumar Sahu wrote:
>> When the CRUSH algorithm detects that an object cannot be placed in a PG
>> that came through the draw, does CRUSH not draw another PG, until the
>> PG drawn can be written to?
>
> Nope. The assignment of objects to PGs is a simple deterministic hash
> function and happens before CRUSH. (CRUSH then maps pgids to OSDs.)
>
> If we did something like you suggest then there would be many locations
> where an object could exist, you'd have to look in multiple locations to
> conclude it doesn't exist, and you'd have to deal with the range of race
> conditions that result from that due to, say, OSDs transitioning from full
> to non-full or back while you're deciding where to write or whether an
> object exists.
>
> sage
>
>>
>> On Mon, Mar 27, 2017 at 2:34 PM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>> > On Fri, Mar 24, 2017 at 8:08 PM, Spandan Kumar Sahu
>> > <spandankumarsahu@xxxxxxxxx> wrote:
>> >> I understand that we can't write to objects which belong to the
>> >> particular PG (the one having at least one full OSD). But a storage
>> >> pool can have multiple PGs, and some of them must have only non-full
>> >> OSDs. Through those PGs, we can write to the OSDs which are not full.
>> >
>> > But we cannot impose the restriction on the client that only a subset
>> > of the PGs of the given pool can be written.
>> >
>> >>
>> >> Did I understand it correctly?
>> >>
>> >>
>> >> On Fri, Mar 24, 2017 at 1:01 PM, kefu chai <tchaikov@xxxxxxxxx> wrote:
>> >>> Hi Spandan,
>> >>>
>> >>> Please do not email me privately; instead use the public mailing list,
>> >>> which allows other developers to provide you help if I am unable to do
>> >>> so. It also means that you can start interacting with the rest of the
>> >>> community instead of only me (barely useful).
>> >>>
>> >>> On Fri, Mar 24, 2017 at 2:38 PM, Spandan Kumar Sahu
>> >>> <spandankumarsahu@xxxxxxxxx> wrote:
>> >>>> Hi
>> >>>>
>> >>>> I couldn't figure out why this is happening:
>> >>>>
>> >>>> "...Because once any of the storage device assigned to a storage pool is
>> >>>> full, the whole pool is not writeable anymore, even there is abundant space
>> >>>> in other devices."
>> >>>> -- Ceph GSoC Project Ideas (Smarter reweight-by-utilisation)
>> >>>>
>> >>>> I went through this[1] paper on CRUSH, and according to what I understand,
>> >>>> CRUSH pseudo-randomly chooses a device based on weights which can reflect
>> >>>> various parameters like the amount of space available.
>> >>>
>> >>> CRUSH is a variant of consistent hashing. Ceph cannot automatically
>> >>> choose *another* OSD which is not chosen by CRUSH, even if that OSD is
>> >>> not full and has abundant space.
>> >>>
>> >>>>
>> >>>> What I don't understand is, how will it stop a pool having abundant space on
>> >>>> other devices from getting selected, if one of its devices is full? Sure,
>> >>>> the chances of getting selected might decrease if one device is full, but
>> >>>> how does it completely prevent writing to the pool?
>> >>>
>> >>> If a PG is served by three OSDs and any of them is full, how can we
>> >>> continue creating/writing to objects which belong to that PG?
>> >>>
>> >>>
>> >>> --
>> >>> Regards
>> >>> Kefu Chai
>> >>
>> >>
>> >> --
>> >> Spandan Kumar Sahu
>> >> IIT Kharagpur
>> >
>> >
>> > --
>> > Regards
>> > Kefu Chai
>>
>>
>> --
>> Spandan Kumar Sahu
>> IIT Kharagpur
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Spandan Kumar Sahu
IIT Kharagpur
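
For anyone reading this thread later, the two-step placement Sage describes (hash the object name to a PG, then map that PG to OSDs) can be sketched roughly as below. This is only an illustration, not Ceph's code: SHA-256 and a simple rendezvous-hashing ranking stand in for Ceph's own object hash and for CRUSH, and the names stable_hash, object_to_pg, pg_to_osds and can_write are hypothetical. The point it tries to show is that the object-to-PG step never looks at OSD state, so a full OSD cannot cause a "redraw".

```python
import hashlib
from typing import List, Set


def stable_hash(*parts: str) -> int:
    """Deterministic 64-bit hash of the given strings (a stand-in, not Ceph's hash)."""
    digest = hashlib.sha256("/".join(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def object_to_pg(object_name: str, pg_num: int) -> int:
    """Step 1: object name -> PG id. A pure hash; it knows nothing about OSDs."""
    return stable_hash("obj", object_name) % pg_num


def pg_to_osds(pg_id: int, osd_ids: List[int], replicas: int = 3) -> List[int]:
    """Step 2: PG id -> OSDs. Rendezvous hashing is used here in place of CRUSH."""
    ranked = sorted(
        osd_ids,
        key=lambda osd: stable_hash("pg", str(pg_id), str(osd)),
        reverse=True,
    )
    return ranked[:replicas]


def can_write(object_name: str, pg_num: int, osd_ids: List[int],
              full_osds: Set[int]) -> bool:
    """A write is refused if any OSD serving the object's PG is full.

    There is no fallback to another PG or another OSD: the object's
    location is fixed by the two deterministic steps above.
    """
    pg_id = object_to_pg(object_name, pg_num)
    return not any(osd in full_osds for osd in pg_to_osds(pg_id, osd_ids))


if __name__ == "__main__":
    osds = list(range(6))
    pg = object_to_pg("my-object", 8)
    serving = pg_to_osds(pg, osds)
    print("PG:", pg, "OSDs:", serving)
    print("writable, no full OSDs:", can_write("my-object", 8, osds, set()))
    print("writable, one serving OSD full:",
          can_write("my-object", 8, osds, {serving[0]}))
```

Because every client computes the same PG and the same OSD set for a given object name, there is exactly one place to look for an object. Allowing a redraw when an OSD is full would break that property, which is the source of the lookup and race problems mentioned above.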