Re: High iowait on OSD node

I'm using bcache for all 12 HDDs, cached on the 2 SSDs, with NVMe for the journals. I started that around the middle of December; before then you can see much higher await. (Also, some months ago I replaced all the 2TB disks with 6TB ones and added ceph4 and ceph5.)

Here's my iostat data in Ganglia:

just raw per-disk await:
    http://www.brockmann-consult.de/ganglia/graph_all_periods.php?title=&vl=&x=&n=&hreg[]=ceph.*&mreg[]=sd[a-z]_await&gtype=line&glegend=show&aggregate=1
per-host max await:
    http://www.brockmann-consult.de/ganglia/graph_all_periods.php?title=&vl=&x=&n=&hreg[]=ceph.*&mreg[]=max_await&gtype=line&glegend=show&aggregate=1

strangely aggregated data (my "max" metric is the maximum over a host's disks, but Ganglia then averages it across disks/hosts or something, so it's not really a max):
    http://www.brockmann-consult.de/ganglia/graph_all_periods.php?c=ceph&m=network_report&r=week&s=by%20name&hc=4&mc=2&st=1501155678&g=disk_wait_report&z=large

Or, to explore and make your own graphs, start here: http://www.brockmann-consult.de/ganglia/

I didn't find any existing Ganglia plugins for this, so I wrote my own: once a minute they take a 30-second average from iostat and store it. So when you see a number like 400 in my data, it could have been a steady 400 for the whole 30 seconds, or 4000 for 3 seconds and 0 for the other 27 averaged together; and the remaining 30 seconds of each minute are missing from the data entirely.
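For reference, the core of the sampling idea is roughly this (a sketch, not my actual plugin; the 'await' column position and the gmetric call are assumptions, and column layout varies by sysstat version):

#!/usr/bin/env python
# Sketch of the iostat -> Ganglia sampling described above: one
# 30s-averaged sample per run, meant to be cron'd once a minute.
# Assumes sysstat's iostat and Ganglia's gmetric are installed.
import subprocess

# "iostat -dxk 30 2": the first report is the since-boot average,
# the second is averaged over the 30s interval -- keep that one.
out = subprocess.check_output(['iostat', '-dxk', '30', '2']).decode()
last_report = out.split('Device')[-1]      # crude split on the header
for line in last_report.splitlines()[1:]:
    fields = line.split()
    if len(fields) < 10 or not fields[0].startswith('sd'):
        continue
    dev, await_ms = fields[0], fields[9]   # 'await' column; position varies
    subprocess.call(['gmetric', '--name', '%s_await' % dev,
                     '--value', await_ms, '--type', 'float',
                     '--units', 'ms'])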

In my data, sda, sdb, sdc on ceph1,2,3 are probably always the SSDs, and sdm, sdn on ceph4,5 are currently the SSDs (and may once have been sda, sdb; rebooting sometimes shuffles the names). Not ideal, but I'm not sure how to fix it... maybe a udev rule to give the SSDs stable names. To check which names are the SSDs on a given boot, see the sketch below.
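(A quick sketch for that check, assuming the usual sysfs layout:)

#!/usr/bin/env python
# Report which sdX devices are non-rotational (i.e. SSDs) right now.
import glob

for path in sorted(glob.glob('/sys/block/sd*/queue/rotational')):
    dev = path.split('/')[3]                 # /sys/block/<dev>/queue/rotational
    ssd = open(path).read().strip() == '0'   # 0 = non-rotational
    print('%s: %s' % (dev, 'SSD' if ssd else 'HDD'))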

Also note that I found the deadline scheduler gives much lower iowait and latency than CFQ, though not necessarily more throughput or IOPS... you could test that. Keep in mind, though, that without CFQ some of Ceph's I/O priority settings may not apply (or maybe that's no longer relevant since Jewel?). Switching schedulers is just a sysfs write; see the sketch below.
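(A minimal sketch; run as root, and note it doesn't persist across reboots:)

#!/usr/bin/env python
# Select the deadline elevator on all sd* devices for this boot only.
# For persistence, use a kernel cmdline option or a udev rule instead.
import glob

for path in glob.glob('/sys/block/sd*/queue/scheduler'):
    with open(path, 'w') as f:
        f.write('deadline')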

PS: use a fixed-width font for your iostat output and it's much more readable in mail clients that support HTML... see below, where I've reformatted it.

On 07/27/17 05:48, John Petrini wrote:
Hello list,

Just curious if anyone has ever seen this behavior and might have some ideas on how to troubleshoot it. 

We're seeing very high iowait in iostat across all OSDs on a single OSD host. It's very spiky, dropping to zero and then shooting up as high as 400 in some cases. Despite this, it does not seem to be having a major impact on the performance of the cluster as a whole.

Some more details:
3x OSD nodes (Dell R730s): 24 cores @ 2.6GHz, 256GB RAM, 20x 1.2TB 10K SAS OSDs per node.

We're running Ceph Hammer.

Here's the output of iostat. Note that this is from a period when the cluster is not very busy, but you can still see high spikes on a few OSDs. It's much worse under high load.

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.50     0.00     6.00    24.00     0.00    8.00    0.00    8.00   8.00   0.40
sdb               0.00     0.00    0.00   60.00     0.00   808.00    26.93     0.00    0.07    0.00    0.07   0.03   0.20
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00   67.00     0.00  1010.00    30.15     0.01    0.09    0.00    0.09   0.09   0.60
sde               0.00     0.00    0.00   93.00     0.00   868.00    18.67     0.00    0.04    0.00    0.04   0.04   0.40
sdf               0.00     0.00    0.00   57.50     0.00   572.00    19.90     0.00    0.03    0.00    0.03   0.03   0.20
sdg               0.00     1.00    0.00    3.50     0.00    22.00    12.57     0.75   16.00    0.00   16.00   2.86   1.00
sdh               0.00     0.00    1.50   25.50     6.00   458.50    34.41     2.03   75.26    0.00   79.69   3.04   8.20
sdi               0.00     0.00    0.00   30.50     0.00   384.50    25.21     2.36   77.51    0.00   77.51   3.28  10.00
sdj               0.00     1.00    1.50  105.00     6.00   925.75    17.50    10.85  101.84    8.00  103.18   2.35  25.00
sdl               0.00     0.00    2.00    0.00   320.00     0.00   320.00     0.01    3.00    3.00    0.00   2.00   0.40
sdk               0.00     1.00    0.00   55.00     0.00   334.50    12.16     7.92  136.91    0.00  136.91   2.51  13.80
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdo               0.00     0.00    1.00    0.00     4.00     0.00     8.00     0.00    4.00    4.00    0.00   4.00   0.40
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdq               0.50     0.00  756.00    0.00 93288.00     0.00   246.79     1.47    1.95    1.95    0.00   1.17  88.60
sdr               0.00     0.00    1.00    0.00     4.00     0.00     8.00     0.00    4.00    4.00    0.00   4.00   0.40
sds               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdt               0.00     0.00    0.00   36.50     0.00   643.50    35.26     3.49   95.73    0.00   95.73   2.63   9.60
sdu               0.00     0.00    0.00   21.00     0.00   323.25    30.79     0.78   37.24    0.00   37.24   2.95   6.20
sdv               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdw               0.00     0.00    0.00   31.00     0.00   689.50    44.48     2.48   80.06    0.00   80.06   3.29  10.20
sdx               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.50     0.00     6.00    24.00     0.00    8.00    0.00    8.00   8.00   0.40
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00





-- 

--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@xxxxxxxxxxxxxxxxxxxx
Internet: http://www.brockmann-consult.de
--------------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
