Re: slow requests are blocked > 32 sec. Implicated osds 0, 2, 3, 4, 5 (REQUEST_SLOW)

Hello Robert,
I did not make any changes, so I'm still using the prio queue.
Regards

On Mon, Jun 10, 2019 at 17:44, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
I'm glad it's working. To be clear, did you switch to wpq, or is it still the prio queue?

Sent from a mobile device, please excuse any typos.

On Mon, Jun 10, 2019, 4:45 AM BASSAGET Cédric <cedric.bassaget.ml@xxxxxxxxx> wrote:
An update from 12.2.9 to 12.2.12 seems to have fixed the problem!

On Mon, Jun 10, 2019 at 12:25, BASSAGET Cédric <cedric.bassaget.ml@xxxxxxxxx> wrote:
Hi Robert,
Before doing anything on my prod env, I generated read/write load on the Ceph cluster using fio.
On my newest cluster, release 12.2.12, I could not trigger the REQUEST_SLOW warning, even when my OSD disk usage went above 95% (fio run from 4 different hosts).

On my prod cluster, release 12.2.9, as soon as I run fio on a single host, I see a lot of REQUEST_SLOW warning messages, but "iostat -xd 1" does not show more than 5-10% usage on the disks...

On Mon, Jun 10, 2019 at 10:12, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
On Mon, Jun 10, 2019 at 1:00 AM BASSAGET Cédric <cedric.bassaget.ml@xxxxxxxxx> wrote:
Hello Robert,
My disks did not reach 100% during the last warning; they climbed to 70-80% usage. But I do see the rrqm/wrqm counters increasing...

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util

sda               0.00     4.00    0.00   16.00     0.00   104.00    13.00     0.00    0.00    0.00    0.00   0.00   0.00
sdb               0.00     2.00    1.00 3456.00     8.00 25996.00    15.04     5.76    1.67    0.00    1.67   0.03   9.20
sdd               4.00     0.00 41462.00 1119.00 331272.00  7996.00    15.94    19.89    0.47    0.48    0.21   0.02  66.00

dm-0              0.00     0.00 6825.00  503.00 330856.00  7996.00    92.48     4.00    0.55    0.56    0.30   0.09  66.80
dm-1              0.00     0.00    1.00 1129.00     8.00 25996.00    46.02     1.03    0.91    0.00    0.91   0.09  10.00


sda is my system disk (SAMSUNG   MZILS480HEGR/007  GXL0); sdb and sdd are my OSDs.
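
For anyone who wants to compare the two OSD disks in a sample like this, here is a minimal Python sketch. It assumes the sysstat extended column layout shown above; the rows are copied from that sample (the dm-* rows are left out), and the function names are only illustrative.

```python
# Column header and rows copied from the iostat sample above.
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz "
          "await r_await w_await svctm %util")
ROWS = """
sda 0.00 4.00 0.00 16.00 0.00 104.00 13.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 2.00 1.00 3456.00 8.00 25996.00 15.04 5.76 1.67 0.00 1.67 0.03 9.20
sdd 4.00 0.00 41462.00 1119.00 331272.00 7996.00 15.94 19.89 0.47 0.48 0.21 0.02 66.00
"""


def parse_iostat(header, rows):
    """Return {device: {column: value}} for each non-empty row."""
    columns = header.split()[1:]  # drop the "Device:" label
    stats = {}
    for line in rows.strip().splitlines():
        fields = line.split()
        stats[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return stats


def summarize(stats):
    """Per device: total IOPS, share of IOPS that are reads, and %util."""
    summary = {}
    for dev, s in stats.items():
        iops = s["r/s"] + s["w/s"]
        read_share = s["r/s"] / iops if iops else 0.0
        summary[dev] = {"iops": iops, "read_share": read_share, "util": s["%util"]}
    return summary


if __name__ == "__main__":
    for dev, row in summarize(parse_iostat(HEADER, ROWS)).items():
        print(f"{dev}: {row['iops']:.0f} IOPS, "
              f"{row['read_share']:.0%} reads, {row['util']:.1f}% util")
```

On this sample, sdd is serving almost only reads while sdb is serving almost only writes, which is one way to see the imbalance between the two OSDs.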

Would "osd op queue = wpq" help in this case?
Regards

Your disk times look okay, just a lot more unbalanced than I would expect. I'd give wpq a try; I use it all the time. Just be sure to include the op_cutoff setting as well, or it doesn't have much effect. Let me know how it goes.
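
As a rough illustration of setting both options together, the Python sketch below checks a ceph.conf-style text for the two settings in its [osd] section. The sample fragment is hypothetical: "wpq" comes from this thread, but the cut-off value "high" is an assumption, since no value is given here. The helper names are illustrative too. Ceph accepts option names written with spaces or underscores, so the check normalizes them.

```python
import configparser

# Hypothetical ceph.conf fragment. "wpq" is from the thread; the cut-off
# value "high" is an assumption (the thread does not name a value).
SAMPLE_CONF = """
[osd]
osd op queue = wpq
osd op queue cut off = high
"""

WANTED = ("osd_op_queue", "osd_op_queue_cut_off")


def osd_queue_settings(conf_text):
    """Return the op queue settings found in the [osd] section."""
    # Interpolation off and strict off, so '%' characters and duplicate
    # keys in a real config file do not raise.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section("osd"):
        return {}
    # Treat "osd op queue" and "osd_op_queue" as the same option name.
    normalized = {key.replace(" ", "_"): value
                  for key, value in parser.items("osd")}
    return {name: normalized[name] for name in WANTED if name in normalized}


def missing_settings(conf_text):
    """List the wanted option names that the [osd] section does not set."""
    found = osd_queue_settings(conf_text)
    return [name for name in WANTED if name not in found]


if __name__ == "__main__":
    print(osd_queue_settings(SAMPLE_CONF))
    print("missing:", missing_settings(SAMPLE_CONF))
```

As far as I know, both options are only read when an OSD starts, so the OSDs would need a restart after the change.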
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
