osd.19 is a BlueStore OSD on a healthy 2TB SSD. The log of osd.19 is here:
https://pastebin.com/raw/6DWwhS0A

On 13.10.2018 at 21:20, Stefan Priebe - Profihost AG wrote:
> Hi David,
>
> I think this should be the problem - from a new log from today:
>
> 2018-10-13 20:57:20.367326 mon.a [WRN] Health check update: 4 osds down
> (OSD_DOWN)
> ...
> 2018-10-13 20:57:41.268674 mon.a [WRN] Health check update: Reduced data
> availability: 3 pgs peering (PG_AVAILABILITY)
> ...
> 2018-10-13 20:58:08.684451 mon.a [WRN] Health check failed: 1 osds down
> (OSD_DOWN)
> ...
> 2018-10-13 20:58:22.841210 mon.a [WRN] Health check failed: Reduced data
> availability: 8 pgs inactive (PG_AVAILABILITY)
> ...
> 2018-10-13 20:58:47.570017 mon.a [WRN] Health check update: Reduced data
> availability: 5 pgs inactive (PG_AVAILABILITY)
> ...
> 2018-10-13 20:58:49.142108 osd.19 [WRN] Monitor daemon marked osd.19
> down, but it is still running
> 2018-10-13 20:58:53.750164 mon.a [WRN] Health check update: Reduced data
> availability: 3 pgs inactive (PG_AVAILABILITY)
> ...
>
> So there is a timeframe of > 90 s where PGs are inactive and unavailable -
> this would at least explain the stalled I/O to me?
>
> Greets,
> Stefan
>
>
> On 12.10.2018 at 15:59, David Turner wrote:
>> The PGs per OSD do not change unless the OSDs are marked out. You have
>> noout set, so that doesn't change at all during this test. All of your
>> PGs peered quickly at the beginning and then were active+undersized the
>> rest of the time, you never had any blocked requests, and you always had
>> 100 MB/s+ client IO. I didn't see anything wrong with your cluster to
>> indicate that your clients had any problems whatsoever accessing data.
>>
>> Can you confirm that you saw the same problems while you were running
>> those commands? The next thing would be that possibly a client isn't
>> getting an updated OSD map to indicate that the host and its OSDs are
>> down, and it's stuck trying to communicate with host7. That would
>> indicate a potential problem with the client being unable to communicate
>> with the mons, maybe? Have you completely ruled out any network problems
>> between all nodes and all of the IPs in the cluster? What does your
>> client log show during these times?
>>
>> On Fri, Oct 12, 2018 at 8:35 AM Nils Fahldieck - Profihost AG
>> <n.fahldieck@xxxxxxxxxxxx> wrote:
>>
>> Hi, in our `ceph.conf` we have:
>>
>> mon_max_pg_per_osd = 300
>>
>> While the host is offline (9 OSDs down):
>>
>> 4352 PGs * 3 / 62 OSDs ~ 210 PGs per OSD
>>
>> If all OSDs are online:
>>
>> 4352 PGs * 3 / 71 OSDs ~ 183 PGs per OSD
>>
>> ... so this doesn't seem to be the issue.
>>
>> If I understood you right, that's what you meant. If I got you wrong,
>> would you mind pointing me to one of those threads you mentioned?
>>
>> Thanks :)
>>
>> On 12.10.2018 at 14:03, Burkhard Linke wrote:
>> > Hi,
>> >
>> >
>> > On 10/12/2018 01:55 PM, Nils Fahldieck - Profihost AG wrote:
>> >> I rebooted a Ceph host and logged `ceph status` & `ceph health detail`
>> >> every 5 seconds. During this I encountered 'PG_AVAILABILITY Reduced
>> >> data availability: pgs peering'. At the same time some VMs hung as
>> >> described before.
>> >
>> > Just a wild guess... you have 71 OSDs and about 4500 PGs with size=3,
>> > i.e. 13500 PG instances overall, resulting in ~190 PGs per OSD under
>> > normal circumstances.
>> >
>> > If one host is down and the PGs have to re-peer, you might reach the
>> > limit of 200 PGs per OSD on some of the OSDs, resulting in stuck
>> > peering.
>> >
>> > You can try to raise this limit. There are several threads on the
>> > mailing list about this.
>> >
>> > Regards,
>> > Burkhard
>> >
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
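For reference, a minimal sketch of how the mon_max_pg_per_osd limit
discussed above could be checked and raised. The value 400 and the daemon
name mon.a are illustrative only; depending on the Ceph release, the
persistent setting belongs in ceph.conf on the monitor hosts (or in
`ceph config set` on Mimic and later):

  # Per-OSD PG counts are shown in the PGS column:
  ceph osd df

  # Value currently in effect on a monitor (run on that mon's host):
  ceph daemon mon.a config get mon_max_pg_per_osd

  # Raise the limit on the running mons; also add
  # "mon_max_pg_per_osd = 400" to ceph.conf so it survives a restart:
  ceph tell mon.* injectargs '--mon_max_pg_per_osd=400'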