----- Original Message -----
> From: "Samuel Just" <sjust@xxxxxxxxxx>
> To: "Gregory Farnum" <gfarnum@xxxxxxxxxx>
> Cc: "GuangYang" <yguang11@xxxxxxxxxxx>, ceph-devel@xxxxxxxxxxxxxxx
> Sent: Wednesday, 28 October, 2015 7:05:42 AM
> Subject: Re: PG: all requests stuck when acting set < min_size
>
> Actually, we really can't accept reads below min_size and still keep
> the properties we want it to have. Suppose we have 3 osds (a, b, and
> c) which see writes 0...1000. min_size is 2. If a and b are then
> powered off only having committed up to 900 (therefore the client
> could only have seen up to 900 commit), then c would be able to serve
> reads based on updates up to 1000 with a and b stopped (no way to know
> a and b only committed to 900). If c then stops and a and b are
> restarted, they would begin serving reads and writes only based on
> commits up to 900 even though we would have exposed the writes up to
> 1000 to the client.

If a and/or b then accept a write you have a recipe for split-brain and
no one wants to see that in Ceph.

Cheers,
Brad

> -Sam
>
> On Tue, Oct 27, 2015 at 12:47 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > On Tue, Oct 27, 2015 at 11:47 AM, GuangYang <yguang11@xxxxxxxxxxx> wrote:
> >> Hi there,
> >> Is there any reason we stuck read only requests as well for a PG when the
> >> acting set size is less than min_size?
> >
> > A few.
> > The most important reason: PGs don't have any concept of a read-only
> > mode in the code. They are "active" or not, and an active PG handles
> > writes. (The full flags and other things which block writes but allow
> > reads are at the OSD level, not the PG level, and are handled when ops
> > come in before they reach the PG.) Allowing read requests against a PG
> > to complete even when we aren't taking writes on a per-PG level would
> > take some doing.
> >
> > Also: it would be weird from several different levels. We'd need to
> > keep track of client streams because we wouldn't want to let through a
> > read that is ordered after a write. How would we handle the memory
> > pressure implied by that? While I can imagine it being useful for some
> > stuff like RGW reads, in general making data available for read but
> > not write is a pretty complicated thing to explain to users -- how do
> > we expose that in a useful way?
> > -Greg
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
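
For anyone reading this thread later, Sam's a/b/c scenario can be sketched as a toy model. This is only an illustration under stated assumptions, not how Ceph peering works: the names Replica, readable_version and run_scenario are made up for this sketch, and "the newest version held by any live replica" is a deliberate simplification of what a surviving set would serve.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Replica:
    """Hypothetical stand-in for an OSD holding one PG replica."""
    name: str
    committed: int  # highest write version this replica has committed
    up: bool = True


def readable_version(replicas: List[Replica], min_size: int,
                     enforce_min_size: bool) -> Optional[int]:
    """Version a read would observe, or None if the PG blocks the read.

    Assumption of this toy model: the live replicas serve the newest
    version any of them holds, because they cannot know what the
    stopped replicas committed.
    """
    alive = [r for r in replicas if r.up]
    if not alive:
        return None
    if enforce_min_size and len(alive) < min_size:
        return None  # below min_size: block reads as well as writes
    return max(r.committed for r in alive)


def run_scenario(enforce_min_size: bool) -> Tuple[Optional[int], Optional[int]]:
    """Replay the thread's example: writes 0...1000, min_size = 2."""
    a = Replica("a", committed=900)
    b = Replica("b", committed=900)
    c = Replica("c", committed=1000)
    pg = [a, b, c]

    # a and b are powered off; only c is left (1 < min_size).
    a.up = b.up = False
    first_read = readable_version(pg, 2, enforce_min_size)

    # c then stops, a and b are restarted (2 >= min_size).
    c.up = False
    a.up = b.up = True
    second_read = readable_version(pg, 2, enforce_min_size)
    return first_read, second_read


if __name__ == "__main__":
    print("reads allowed below min_size:", run_scenario(False))
    print("reads blocked below min_size:", run_scenario(True))
```

With reads allowed below min_size, the first read in this model observes 1000 and the second observes 900, so the client sees state go backwards. With min_size enforced, the first read blocks and the client never observes anything past 900.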