Hello Pavel,

I don't have all that much info (fairly new to Ceph), but we are facing a similar issue. If the cluster is fairly idle we get slow requests; if I'm backfilling a new node there are no slow requests. Same X540 network cards, but Ceph 12.2.5 on Ubuntu 16.04 with the 4.4.0 kernel, and LACP with VLANs for the Ceph front/back-end networks. I'm not sure it's the same issue, but if you want me to run any tests, let me know.

Kind regards,
Glen Baars

-----Original Message-----
From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> On Behalf Of Xavier Trilla
Sent: Tuesday, 17 July 2018 6:16 AM
To: Pavel Shub <pavel@xxxxxxxxxxxx>; Ceph Users <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: intermittent slow requests on idle ssd ceph clusters

Hi Pavel,

Any strange messages in dmesg, syslog, etc.?

I would recommend profiling the kernel with perf and checking which calls are consuming the most CPU. We had several problems like the one you are describing; one of them, for example, was fixed by increasing vm.min_free_kbytes to 4 GB.

Also, what does the sys usage look like if you run top on the machines hosting the OSDs?

Best regards,
Xavier Trilla P.
Clouding.io

A cloud server with SSDs, redundant and available in under 30 seconds? Try it now at Clouding.io!

-----Original Message-----
From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> On Behalf Of Pavel Shub
Sent: Monday, 16 July 2018 23:52
To: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
Subject: intermittent slow requests on idle ssd ceph clusters

Hello folks,

We've been having issues with slow requests cropping up on practically idle Ceph clusters. From what I can tell, the requests are hanging while waiting for subops, and the OSD on the other end receives the request minutes later! In the op below, it started waiting for subops at 12:09:51 and the subop was only completed at 12:14:28.

{
    "description": "osd_op(client.903117.0:569924 6.391 6:89ed76f2:::%2fraster%2fv5%2fes%2f16%2f36320%2f24112:head [writefull 0~2072] snapc 0=[] ondisk+write+known_if_redirected e5777)",
    "initiated_at": "2018-07-05 12:09:51.191419",
    "age": 326.651167,
    "duration": 276.977834,
    "type_data": {
        "flag_point": "commit sent; apply or cleanup",
        "client_info": {
            "client": "client.903117",
            "client_addr": "10.20.31.234:0/1433094386",
            "tid": 569924
        },
        "events": [
            {
                "time": "2018-07-05 12:09:51.191419",
                "event": "initiated"
            },
            {
                "time": "2018-07-05 12:09:51.191471",
                "event": "queued_for_pg"
            },
            {
                "time": "2018-07-05 12:09:51.191538",
                "event": "reached_pg"
            },
            {
                "time": "2018-07-05 12:09:51.191877",
                "event": "started"
            },
            {
                "time": "2018-07-05 12:09:51.192135",
                "event": "waiting for subops from 11"
            },
            {
                "time": "2018-07-05 12:09:51.192599",
                "event": "op_commit"
            },
            {
                "time": "2018-07-05 12:09:51.192616",
                "event": "op_applied"
            },
            {
                "time": "2018-07-05 12:14:28.169018",
                "event": "sub_op_commit_rec from 11"
            },
            {
                "time": "2018-07-05 12:14:28.169164",
                "event": "commit_sent"
            },
            {
                "time": "2018-07-05 12:14:28.169253",
                "event": "done"
            }
        ]
    }
},
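(For anyone who wants to pull the same data on their own cluster: this is the per-op event dump from the OSD admin socket, something along the lines of

    ceph daemon osd.<id> dump_historic_ops      # recently completed / slow ops, like the one above
    ceph daemon osd.<id> dump_ops_in_flight     # ops that are still pending

run on the host where the OSD lives; <id> is a placeholder for the OSD number.)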
Below is what I assume is the corresponding request on osd.11; it seems to receive the network request ~4 minutes later:

2018-07-05 12:14:28.058552 7fb75ee0e700 20 osd.11 5777 share_map_peer 0x562b61bca000 already has epoch 5777
2018-07-05 12:14:28.167247 7fb75de0c700 10 osd.11 5777 new session 0x562cc23f0200 con=0x562baaa0e000 addr=10.16.15.28:6805/3218
2018-07-05 12:14:28.167282 7fb75de0c700 10 osd.11 5777 session 0x562cc23f0200 osd.20 has caps osdcap[grant(*)] 'allow *'
2018-07-05 12:14:28.167291 7fb75de0c700  0 -- 10.16.16.32:6817/3808 >> 10.16.15.28:6805/3218 conn(0x562baaa0e000 :6817 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 20 vs existing csq=19 existing_state=STATE_STANDBY
2018-07-05 12:14:28.167322 7fb7546d6700  2 osd.11 5777 ms_handle_reset con 0x562baaa0e000 session 0x562cc23f0200
2018-07-05 12:14:28.167546 7fb75de0c700 10 osd.11 5777 session 0x562b62195c00 osd.20 has caps osdcap[grant(*)] 'allow *'

This is an all-SSD cluster with minimal load, and all hardware checks return good values. The cluster is currently running the latest Ceph Mimic (13.2.0), but we have also experienced this on Luminous 12.2.2 and 12.2.5. I'm starting to think this is a potential network driver issue: we're currently running kernel 4.14.15, and when we updated to the latest 4.17 the slow requests seemed to occur more frequently. The network cards we run are 10G Intel X540.

Does anyone know how I can debug this further?

Thanks,
Pavel
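P.S. In case it helps to be concrete about what I could gather next, I assume something along these lines would be a reasonable starting point (the interface name is a placeholder, and osd.11 is just the sub-op target from the dump above):

    ethtool -S <iface> | egrep -i 'drop|err|pause|discard'   # NIC counters on both OSD hosts
    ethtool -k <iface>                                       # offload settings currently in use
    ceph tell osd.11 injectargs '--debug_ms 1'               # messenger-level logging on the OSDs involved

and then comparing the send/receive timestamps for the sub-op in both OSD logs. Happy to run anything else people think is useful.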