Re: Blocked requests after "osd in"

Christian Kauhaus <kc@xxxxxxxxxxxxxxx> · Thu, 10 Dec 2015 10:03:51 +0100

On 10.12.2015 at 06:38, Robert LeBlanc wrote:
> I noticed this a while back and did some tracing. As soon as the PGs
> are read in by the OSD (very limited amount of housekeeping done), the
> OSD is set to the "in" state so that peering with other OSDs can
> happen and the recovery process can begin. The problem is that when
> the OSD is "in", the clients also see that and start sending requests
> to the OSD before it has had a chance to get its bearings and is
> actually able to service them. After discussion with some of the
> developers, there is no easy way around this other than letting the
> PGs recover to other OSDs and then bringing in the OSDs after recovery
> (a ton of data movement).

Many thanks for your detailed analysis. It's a bit disappointing that there
seems to be no easy way around this. Any work to improve the situation would
be much appreciated.

In the meantime, I'll be experimenting with pre-seeding the VFS cache to speed
things up at least a little bit.
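
For illustration, a minimal sketch of what such pre-seeding could look like,
assuming it means walking the OSD's data directory and stat'ing every entry so
that the kernel's dentry and inode caches are warm before the OSD starts. The
function name warm_vfs_cache and the example path are hypothetical; this does
not read file contents and is not a tested fix for the blocked requests.

```python
import os


def warm_vfs_cache(root):
    """Walk `root` and stat every entry; return the number of entries touched.

    Assumption: stat'ing each file and directory is enough to pull its
    dentry and inode into the kernel caches. File data is not read.
    """
    touched = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        # lstat-style call; does not follow symlinks
                        entry.stat(follow_symlinks=False)
                        touched += 1
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                    except OSError:
                        # entry vanished or is unreadable; skip it
                        continue
        except OSError:
            # directory missing or unreadable; skip it
            continue
    return touched
```

It could be called before starting the daemon, e.g.
warm_vfs_cache("/var/lib/ceph/osd/ceph-0"), where the path is only an assumed
example of an OSD data directory. Whether the cache stays warm long enough
depends on memory pressure on the host.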

Regards

Christian

-- 
Dipl-Inf. Christian Kauhaus <>< · kc@xxxxxxxxxxxxxxx · +49 345 219401-0
Flying Circus Internet Operations GmbH · http://flyingcircus.io
Forsterstraße 29 · 06112 Halle (Saale) · Deutschland
HR Stendal 21169 · Managing directors: Christian Theune, Christian Zagrodnick
