Hi,
although I've read many threads mentioning these two OSD config
options, I didn't know what to expect from them. But since you explicitly
referred to the slow requests, I decided to give it a try and changed
'osd op queue cut off' to "high" ('osd op queue' was already "wpq").
We've had two deep-scrub cycles since then, half of our PGs have been
deep-scrubbed, and I haven't seen a single slow request.
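For reference, a rough sketch of how the change can be applied and
verified (assuming ceph.conf-based configuration; exact steps may
differ per release, and as far as I know the options only take effect
after an OSD restart):

[osd]
osd op queue = wpq
osd op queue cut off = high

ceph daemon osd.0 config get osd_op_queue
ceph daemon osd.0 config get osd_op_queue_cut_off

On Mimic or newer the centralized config should also work, e.g.
'ceph config set osd osd_op_queue_cut_off high', but I haven't
verified that myself.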
So thanks again for pointing that out!
Regards,
Eugen
Quoting Andrej Filipcic <andrej.filipcic@xxxxxx>:
Hi,
all our slow request issues were solved with:
[osd]
osd op queue = wpq
osd op queue cut off = high
Before the change we even had requests several hours old; since the
change it rarely gets above 30 s, even under the heaviest loads,
e.g. >100 IOPS per HDD.
Regards,
Andrej
On 6/10/20 12:12 PM, Eugen Block wrote:
Hi,
we see this message almost daily, although in our case it's almost
expected. We run a nightly compile job within a CephFS subtree, and
the OSDs (HDDs with RocksDB on SSD) are saturated during those jobs.
The deep-scrubs, which also run during the night, have a significant
impact as well, and the cluster reports slow requests, but since that
happens outside our working hours we can live with it (for now).
You write that the OSDs are on SSDs; is that true for both the data
and the metadata pool?
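Just as a pointer (commands may differ slightly between releases):
you can check which CRUSH rule and device class each pool actually
uses with

ceph osd pool ls detail
ceph osd crush rule dump
ceph osd df tree

and to see what the MDS is currently waiting on, the admin socket of
the active MDS can be queried, e.g. (mds.<name> is a placeholder for
your daemon name):

ceph daemon mds.<name> dump_ops_in_flight
ceph daemon mds.<name> objecter_requests

The latter shows the requests the MDS has outstanding against the
OSDs of the metadata pool.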
Regards,
Eugen
Quoting locallocal <locallocal@xxxxxxx>:
Hi guys,
we have a Ceph cluster running Luminous 12.2.13, and recently we
encountered a problem. Here is some log information:
2020-06-08 12:33:52.706070 7f4097e2d700 0 log_channel(cluster)
log [WRN] : slow request 30.518930 seconds old, received at
2020-06-08 12:33:22.186924:
client_request(client.48978906:941633993 create
#0x100028cab8a/.filename 2020-06-08 12:33:22.197434 caller_uid=0,
caller_gid=0{}) currently submit entry: journal_and_reply
...
2020-06-08 13:12:17.826727 7f4097e2d700 0 log_channel(cluster)
log [WRN] : slow request 2220.991833 seconds old, received at
2020-06-08 12:35:16.764233:
client_request(client.42390705:788369155 create
#0x1000224f999/.filename 2020-06-08 12:35:16.774553 caller_uid=0,
caller_gid=0{}) currently submit entry: journal_and_reply
It looks like the MDS can't flush its journal to the OSDs of the
metadata pool, but those OSDs are SSDs and their load is very low.
Because of this, clients can't mount and the MDS can't trim its log.
Has anyone encountered this problem? Please help!
--
_____________________________________________________________
prof. dr. Andrej Filipcic, E-mail: Andrej.Filipcic@xxxxxx
Department of Experimental High Energy Physics - F9
Jozef Stefan Institute, Jamova 39, P.o.Box 3000
SI-1001 Ljubljana, Slovenia
Tel.: +386-1-477-3674 Fax: +386-1-477-3166
-------------------------------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx