On Tue, Aug 4, 2015 at 10:03 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Tue, 4 Aug 2015, Yehuda Sadeh-Weinraub wrote:
>> On Tue, Aug 4, 2015 at 9:55 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> >> One solution that I can think of is to determine before the read/write
>> >> whether the pg we're about to access is healthy (or has been unhealthy for a
>> >> short period of time), and if not to cancel the request before sending the
>> >> operation. This could mitigate the problem you're seeing at the expense of
>> >> availability in some cases. We'd need to have a way to query pg health
>> >> through librados which we don't have right now afaik.
>> >> Sage / Sam, does that make sense, and/or possible?
>> >
>> > This seems mostly impossible because we don't know ahead of time which
>> > PG(s) a request is going to touch (it'll generally be a lot of them)?
>> >
>>
>> Barring pgls() and such, each rados request that radosgw produces will
>> only touch a single pg, right?
>
> Oh, yeah. I thought you meant before each RGW request. If it's at the
> rados level then yeah, you could avoid stuck pgs, although I think a
> better approach would be to make the OSD reply with -EAGAIN in that case
> so that you know the op didn't happen. There would still be cases (though
> more rare) where you weren't sure if the op happened or not (e.g., when
> you send to osd A, it goes down, you resend to osd B, and then you get
> EAGAIN/timeout).

If done on the client side then we should only make it apply to the
first request sent. Is it actually a problem if the osd triggered the
error?

>
> What would you do when you get that failure/timeout, though? Is it
> practical to abort the rgw request handling completely?
>

It should be like any error that happens through the transaction
(e.g., client disconnection).
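
To make the cases discussed above concrete, here is a minimal sketch of
how a client might classify the outcome of a single rados op. It is
plain Python, not librados: Outcome, Attempt, classify() and
handle_rgw_request() are hypothetical names, and the rule that -EAGAIN
from the OSD means "the op was not applied" is the behaviour proposed in
this thread, not something the OSD is known to do today. The model only
considers success, -EAGAIN, -ETIMEDOUT, and a missing reply.

```python
import errno
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    APPLIED = "applied"          # the op definitely happened
    NOT_APPLIED = "not applied"  # the op definitely did not happen
    UNKNOWN = "unknown"          # the op may or may not have happened


@dataclass
class Attempt:
    """One send of a rados op to one OSD (hypothetical model, not a librados type)."""
    osd: str
    # 0 on success, a negative errno on failure, or None if no reply
    # arrived (for example, the OSD went down before answering).
    result: Optional[int]


def classify(attempts: List[Attempt]) -> Outcome:
    """Decide what the client can conclude from the send history of one op."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    last = attempts[-1].result
    if last == 0:
        return Outcome.APPLIED
    # Assumed semantics: -EAGAIN means the OSD refused the op without
    # applying it. That settles the question only when no earlier send
    # could have been applied, i.e. this was the first request sent.
    if last == -errno.EAGAIN and len(attempts) == 1:
        return Outcome.NOT_APPLIED
    # Timeouts, missing replies, and -EAGAIN after a resend are ambiguous.
    return Outcome.UNKNOWN


def handle_rgw_request(attempts: List[Attempt]) -> str:
    """Abort on anything but success, like any other mid-transaction error."""
    outcome = classify(attempts)
    if outcome is Outcome.APPLIED:
        return "continue"
    return "abort (" + outcome.value + ")"
```

With this model, a single attempt that returns -EAGAIN classifies as
NOT_APPLIED, while the "send to osd A, it goes down, resend to osd B,
get EAGAIN" sequence classifies as UNKNOWN. Either way the sketch aborts
the rgw request, in the same way as a client disconnection would.
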
Yehuda