Troubleshooting hanging storage backend whenever there is any cluster change

Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx> · Fri, 12 Oct 2018 23:27:23 +0200

Hi David,

On 12.10.2018 at 15:59, David Turner wrote:
> The number of PGs per OSD does not change unless the OSDs are marked out.  You
> have noout set, so that doesn't change at all during this test.  All of
> your PGs peered quickly at the beginning and then were active+undersized
> the rest of the time, you never had any blocked requests, and you always
> had 100MB/s+ client IO.  I didn't see anything wrong with your cluster
> to indicate that your clients had any problems whatsoever accessing data.
> 
> Can you confirm that you saw the same problems while you were running
> those commands?  The next possibility is that a client isn't getting an
> updated OSD map indicating that the host and its OSDs are down, so it is
> stuck trying to communicate with host7.  That would point to the client
> being unable to communicate with the Mons, maybe?
Maybe, but what about this status:
'PG_AVAILABILITY Reduced data availability: pgs peering'

See the log here: https://pastebin.com/wxUKzhgB

PG_AVAILABILITY is noted at timestamps [2018-10-12 12:16:15.403394] and
[2018-10-12 12:17:40.072655].

And why do the Ceph docs say:

Data availability is reduced, meaning that the cluster is unable to
service potential read or write requests for some data in the cluster.
Specifically, one or more PGs is in a state that does not allow IO
requests to be serviced. Problematic PG states include peering, stale,
incomplete, and the lack of active (if those conditions do not clear
quickly).
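
As a rough illustration of the documentation text quoted above, the following
sketch classifies a PG state string by the conditions that text names. The
function name `may_block_io` and the idea of treating the listed states as a
simple blocklist are assumptions made for illustration, not Ceph's own logic.

```python
# State names are taken from the documentation text quoted above; treating
# them as a plain blocklist is an assumption made for this sketch.
PROBLEM_STATES = {"peering", "stale", "incomplete"}


def may_block_io(pg_state: str) -> bool:
    """Return True if a PG state string such as 'active+undersized' matches
    one of the problematic conditions named in the quoted documentation."""
    flags = set(pg_state.split("+"))
    # Problematic: the PG is not active, or it carries one of the listed flags.
    return "active" not in flags or bool(flags & PROBLEM_STATES)


for state in ("active+undersized", "peering", "stale+active+clean"):
    print(state, may_block_io(state))
```

Under this reading, `active+undersized` is not flagged while `peering` is,
which matches the docs' caveat that peering only matters if it does not clear
quickly.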

Greets,
Stefan
> 
> On Fri, Oct 12, 2018 at 8:35 AM Nils Fahldieck - Profihost AG
> <n.fahldieck@xxxxxxxxxxxx> wrote:
> 
>     Hi, in our `ceph.conf` we have:
> 
>       mon_max_pg_per_osd = 300
> 
>     While the host is offline (9 OSDs down):
> 
>       4352 PGs * 3 / 62 OSDs ~ 210 PGs per OSD
> 
>     If all OSDs are online:
> 
>       4352 PGs * 3 / 71 OSDs ~ 183 PGs per OSD
> 
>     ... so this doesn't seem to be the issue.
> 
>     If I understood you correctly, that's what you meant. If I got you wrong,
>     would you mind pointing me to one of those threads you mentioned?
> 
>     Thanks :)
> 
>     On 12.10.2018 at 14:03, Burkhard Linke wrote:
>     > Hi,
>     >
>     >
>     > On 10/12/2018 01:55 PM, Nils Fahldieck - Profihost AG wrote:
>     >> I rebooted a Ceph host and logged `ceph status` & `ceph health detail`
>     >> every 5 seconds. During this I encountered 'PG_AVAILABILITY Reduced data
>     >> availability: pgs peering'. At the same time some VMs hung as described
>     >> before.
>     >
>     > Just a wild guess... you have 71 OSDs and about 4500 PGs with size=3.
>     > That is 13500 PG instances overall, resulting in ~190 PGs per OSD under
>     > normal circumstances.
>     >
>     > If one host is down and the PGs have to re-peer, you might reach the
>     > limit of 200 PGs per OSD on some of the OSDs, resulting in stuck peering.
>     >
>     > You can try to raise this limit. There are several threads on the
>     > mailing list about this.
>     >
>     > Regards,
>     > Burkhard
>     >
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
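
The per-OSD figures discussed in the quoted thread can be reproduced with a
few lines of Python. This is only a sketch: `pgs_per_osd` is a hypothetical
helper, it computes a plain average (real placement is not perfectly even, so
individual OSDs may sit above it), and the value 200 is the limit Burkhard
guessed at, not a setting confirmed anywhere in the thread.

```python
def pgs_per_osd(pg_count: int, replica_size: int, osd_count: int) -> float:
    """Average number of PG replicas per OSD, assuming a perfectly even spread."""
    if osd_count <= 0:
        raise ValueError("osd_count must be positive")
    return pg_count * replica_size / osd_count


CONFIGURED_LIMIT = 300  # mon_max_pg_per_osd from the ceph.conf quoted in the thread
SUSPECTED_LIMIT = 200   # the limit Burkhard guessed might apply (an assumption)

for label, osds in (("all OSDs up", 71), ("one host down", 62)):
    avg = pgs_per_osd(4352, 3, osds)
    print(f"{label}: {avg:.1f} PGs/OSD, "
          f"over {SUSPECTED_LIMIT}: {avg > SUSPECTED_LIMIT}, "
          f"over {CONFIGURED_LIMIT}: {avg > CONFIGURED_LIMIT}")
```

With the numbers from the thread the averages come out at roughly 183.9 and
210.6, so the one-host-down case would exceed a limit of 200 but stays well
below the configured 300. As David notes, with noout set the PGs are not
remapped at all, so the second figure is a worst-case estimate rather than
what the cluster actually saw.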