Re: radosgw - stuck ops

Sage Weil <sweil@xxxxxxxxxx> · Tue, 4 Aug 2015 10:03:30 -0700 (PDT)

On Tue, 4 Aug 2015, Yehuda Sadeh-Weinraub wrote:
> On Tue, Aug 4, 2015 at 9:55 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> >> One solution that I can think of is to determine before the read/write
> >> whether the pg we're about to access is healthy (or has been unhealthy for a
> >> short period of time), and if not to cancel the request before sending the
> >> operation. This could mitigate the problem you're seeing at the expense of
> >> availability in some cases. We'd need to have a way to query pg health
> >> through librados which we don't have right now afaik.
> >> Sage / Sam, does that make sense, and/or possible?
> >
> > This seems mostly impossible because we don't know ahead of time which
> > PG(s) a request is going to touch (it'll generally be a lot of them)?
> >
> 
> Barring pgls() and such, each rados request that radosgw produces will
> only touch a single pg, right?

Oh, yeah.  I thought you meant before each RGW request.  If it's at the
rados level then yeah, you could avoid stuck pgs, although I think a
better approach would be to make the OSD reply with -EAGAIN in that case
so that you know the op didn't happen.  There would still be cases (though
rarer) where you couldn't be sure whether the op happened or not (e.g.,
you send to osd A, it goes down, you resend to osd B, and then you get
-EAGAIN or a timeout).
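
A rough sketch of that distinction, in Python.  Nothing here is real
librados or OSD code: classify(), the Outcome names, and the use of None
for "the target OSD went down before replying" are all made up for
illustration, and it assumes the proposed behavior where -EAGAIN from the
OSD means the op was definitely not applied.

```python
import errno
from enum import Enum


class Outcome(Enum):
    APPLIED = "applied"          # some attempt was acknowledged
    NOT_APPLIED = "not_applied"  # every attempt was explicitly refused
    UNKNOWN = "unknown"          # at least one attempt has no definite answer


def classify(results):
    """Classify one logical op from the results of its (re)sends, in order.

    Each entry is 0 (acked), a negative errno, or None (hypothetical marker
    for "target OSD went down before replying").
    """
    if not results:
        raise ValueError("op was never sent")
    if any(r == 0 for r in results):
        return Outcome.APPLIED
    # Assumption: -EAGAIN is only returned when the op was not applied.
    if all(r == -errno.EAGAIN for r in results):
        return Outcome.NOT_APPLIED
    # A lost reply or a timeout anywhere leaves the outcome ambiguous.
    return Outcome.UNKNOWN


# Refused outright: safe to fail the request or retry later.
print(classify([-errno.EAGAIN]))
# Sent to osd A (went down, no reply), resent to osd B, refused there.
print(classify([None, -errno.EAGAIN]))
```

Only the all-refused case is safe to treat as "didn't happen"; once any
attempt goes unanswered, a later -EAGAIN or timeout can't rule out that
the earlier send was applied.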

What would you do when you get that failure/timeout, though?  Is it 
practical to abort the rgw request handling completely?

sage