Re: radosgw - stuck ops

Yehuda Sadeh-Weinraub <ysadehwe@xxxxxxxxxx> · Tue, 4 Aug 2015 10:14:06 -0700

On Tue, Aug 4, 2015 at 10:03 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Tue, 4 Aug 2015, Yehuda Sadeh-Weinraub wrote:
>> On Tue, Aug 4, 2015 at 9:55 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> >> One solution that I can think of is to determine before the read/write
>> >> whether the pg we're about to access is healthy (or has been unhealthy for a
>> >> short period of time), and if not to cancel the request before sending the
>> >> operation. This could mitigate the problem you're seeing at the expense of
>> >> availability in some cases. We'd need to have a way to query pg health
>> >> through librados which we don't have right now afaik.
>> >> Sage / Sam, does that make sense, and is it possible?
>> >
>> > This seems mostly impossible because we don't know ahead of time which
>> > PG(s) a request is going to touch (it'll generally be a lot of them)?
>> >
>>
>> Barring pgls() and such, each rados request that radosgw produces will
>> only touch a single pg, right?
>
> Oh, yeah.  I thought you meant before each RGW request.  If it's at the
> rados level then yeah, you could avoid stuck pgs, although I think a
> better approach would be to make the OSD reply with -EAGAIN in that case
> so that you know the op didn't happen.  There would still be cases (though
> more rare) where you weren't sure if the op happened or not (e.g., when
> you send to osd A, it goes down, you resend to osd B, and then you get
> EAGAIN/timeout).

If the check is done on the client side, then it should apply only to
the first send of a request, not to resends. Is it actually a problem
if the OSD is the one that triggers the error?
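
To make the client-side variant concrete, here is a rough sketch of
the "first send only" rule. It is only an illustration: Op,
pg_is_healthy and send are hypothetical names, and as noted earlier
in the thread librados has no PG health query right now.

```python
import errno


class Op:
    """A single rados op aimed at one PG (hypothetical stand-in)."""

    def __init__(self, pg_id):
        self.pg_id = pg_id
        self.attempts = 0  # how many times this op has been sent


def submit(op, pg_is_healthy, send):
    """Send op, unless this is its first send and the PG looks unhealthy.

    pg_is_healthy and send are assumed callables, not librados APIs.
    """
    if op.attempts == 0 and not pg_is_healthy(op.pg_id):
        # Nothing went out on the wire, so the op cannot have happened.
        return -errno.EAGAIN
    # Resends skip the check: the op may already have been applied.
    op.attempts += 1
    return send(op)
```

Skipping the check on resends keeps the early error unambiguous: it is
only returned when the op was never sent at all.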

>
> What would you do when you get that failure/timeout, though?  Is it
> practical to abort the rgw request handling completely?
>

It should be handled like any other error that occurs during the
transaction (e.g., a client disconnection).
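
Roughly, and only as a sketch: the request handler stops at the first
rados op that returns an error and takes the same abort path whatever
the cause. RequestAborted and handle_request are made-up names for
illustration, not actual rgw code.

```python
import errno


class RequestAborted(Exception):
    """Raised when any step of a request fails (hypothetical)."""

    def __init__(self, err):
        super().__init__(f"request aborted: errno {err}")
        self.err = err


def handle_request(steps):
    """Run each step in order; abort on the first negative return code.

    Each step is a callable returning 0 on success or -errno on failure.
    """
    done = 0
    for step in steps:
        ret = step()
        if ret < 0:
            # -EAGAIN from a stuck PG takes the same path as any other error.
            raise RequestAborted(-ret)
        done += 1
    return done
```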

Yehuda