I've just added another server (same specs) with one OSD and the behavior is the same: poor performance, with cur MB/s dropping to 0.
I checked the network with iperf3 and found no issues, so it does not look like a server problem, since I get the same behavior with two different servers.
What else could it be?
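For what it's worth, a quick way to rule out the new disk itself (rather than the network) would be to benchmark that OSD directly and compare per-OSD latencies; osd.5 here is just the new OSD shown in the tree below:

ceph tell osd.5 bench    # writes ~1 GiB of 4 MiB objects through osd.5 and reports MB/s
ceph osd perf            # per-OSD commit/apply latency, to spot a consistently slow device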
ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-1 3.44714 - 588G 80693M 509G 0 0 - root default
-9 0.57458 - 588G 80693M 509G 13.39 1.13 - host osd01
5 hdd 0.57458 1.00000 588G 80693M 509G 13.39 1.13 64 osd.5
-7 1.14899 - 1176G 130G 1046G 11.06 0.94 - host osd02
0 hdd 0.57500 1.00000 588G 70061M 519G 11.63 0.98 50 osd.0
1 hdd 0.57500 1.00000 588G 63200M 526G 10.49 0.89 41 osd.1
-3 1.14899 - 1176G 138G 1038G 11.76 1.00 - host osd03
2 hdd 0.57500 1.00000 588G 68581M 521G 11.38 0.96 48 osd.2
3 hdd 0.57500 1.00000 588G 73185M 516G 12.15 1.03 53 osd.3
-4 0.57458 - 0 0 0 0 0 - host osd04
4 hdd 0.57458 0 0 0 0 0 0 0 osd.4
2018-04-10 15:11:58.542507 min lat: 0.0201432 max lat: 13.9308 avg lat: 0.466235
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
40 16 1294 1278 127.785 0 - 0.466235
41 16 1294 1278 124.668 0 - 0.466235
42 16 1294 1278 121.7 0 - 0.466235
43 16 1294 1278 118.87 0 - 0.466235
44 16 1302 1286 116.896 6.4 0.0302793 0.469203
45 16 1395 1379 122.564 372 0.312525 0.51994
46 16 1458 1442 125.377 252 0.0387492 0.501892
47 16 1458 1442 122.709 0 - 0.501892
48 16 1458 1442 120.153 0 - 0.501892
49 16 1458 1442 117.701 0 - 0.501892
50 16 1522 1506 120.466 64 0.137913 0.516969
51 16 1522 1506 118.104 0 - 0.516969
52 16 1522 1506 115.833 0 - 0.516969
53 16 1522 1506 113.648 0 - 0.516969
54 16 1522 1506 111.543 0 - 0.516969
55 16 1522 1506 109.515 0 - 0.516969
56 16 1522 1506 107.559 0 - 0.516969
57 16 1522 1506 105.672 0 - 0.516969
58 16 1522 1506 103.851 0 - 0.516969
Total time run: 58.927431
Total reads made: 1522
Read size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 103.314
Average IOPS: 25
Stddev IOPS: 35
Max IOPS: 106
Min IOPS: 0
Average Latency(s): 0.618812
Max latency(s): 13.9308
Min latency(s): 0.0201432
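For reference, the read run above comes from rados bench; the exact invocation is not shown here, but it would have been along these lines (the pool name "rbd" is assumed, 16 concurrent ops as in the output):

rados bench -p rbd 60 write --no-cleanup   # seed objects for the read test
rados bench -p rbd 60 seq -t 16            # sequential 4 MB reads, as above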
iperf3 -c 192.168.0.181 -i1 -t 10
Connecting to host 192.168.0.181, port 5201
[ 4] local 192.168.0.182 port 57448 connected to 192.168.0.181 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 1.15 GBytes 9.92 Gbits/sec 0 830 KBytes
[ 4] 1.00-2.00 sec 1.15 GBytes 9.90 Gbits/sec 0 830 KBytes
[ 4] 2.00-3.00 sec 1.15 GBytes 9.91 Gbits/sec 0 918 KBytes
[ 4] 3.00-4.00 sec 1.15 GBytes 9.90 Gbits/sec 0 918 KBytes
[ 4] 4.00-5.00 sec 1.15 GBytes 9.90 Gbits/sec 0 918 KBytes
[ 4] 5.00-6.00 sec 1.15 GBytes 9.90 Gbits/sec 0 918 KBytes
[ 4] 6.00-7.00 sec 1.15 GBytes 9.90 Gbits/sec 0 918 KBytes
[ 4] 7.00-8.00 sec 1.15 GBytes 9.90 Gbits/sec 0 918 KBytes
[ 4] 8.00-9.00 sec 1.15 GBytes 9.90 Gbits/sec 0 918 KBytes
[ 4] 9.00-10.00 sec 1.15 GBytes 9.91 Gbits/sec 0 918 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec 0 sender
[ 4] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec receiver
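Note that iperf3 only proves raw TCP throughput between two hosts; if jumbo frames happen to be configured on the cluster network, it may also be worth confirming the MTU actually passes end to end, for example (address reused from the test above, 8972 bytes = 9000 minus IP/ICMP headers):

ping -M do -s 8972 -c 3 192.168.0.181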
On Tue, 10 Apr 2018 at 08:49, Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
Hi,

Thanks for providing guidance.

VD 0 is the SSD drive. Many people suggested not enabling WB for the SSD so that the cache can be used for the HDDs, where it is needed more.

Setup is 3 identical DELL R620 servers (OSD01, OSD02, OSD04): 10 Gb separate networks, 600 GB Enterprise HDD, 320 GB Enterprise SSD. Bluestore, with separate WAL/DB on SSD (1 GB partition for WAL, 30 GB for DB).

With 2 OSDs per server and only OSD01 and OSD02, performance is as expected (no gaps in cur MB/s). Adding one OSD from OSD04 tanks performance (lots of gaps with cur MB/s 0). See below.

ceph -s
  cluster:
    id:     1e98e57a-ef41-4327-b88a-dd2531912632
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

WITH OSD04

ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 2.87256 root default
-7 1.14899 host osd02
0 hdd 0.57500 osd.0 up 1.00000 1.00000
1 hdd 0.57500 osd.1 up 1.00000 1.00000
-3 1.14899 host osd03
2 hdd 0.57500 osd.2 up 1.00000 1.00000
3 hdd 0.57500 osd.3 up 1.00000 1.00000
-4 0.57458 host osd04
4 hdd 0.57458 osd.4 up 1.00000 1.00000

2018-04-10 08:37:08.111037 min lat: 0.0128562 max lat: 13.1623 avg lat: 0.528273
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
100 16 3001 2985 119.388 90 0.0169507 0.528273
101 16 3029 3013 119.315 112 0.0410565 0.524325
102 16 3029 3013 118.145 0 - 0.524325
103 16 3029 3013 116.998 0 - 0.524325
104 16 3029 3013 115.873 0 - 0.524325
105 16 3071 3055 116.37 42 0.0888923 0.54832
106 16 3156 3140 118.479 340 0.0162464 0.535244
107 16 3156 3140 117.372 0 - 0.535244
108 16 3156 3140 116.285 0 - 0.535244
109 16 3156 3140 115.218 0 - 0.535244
110 16 3156 3140 114.171 0 - 0.535244
111 16 3156 3140 113.142 0 - 0.535244
112 16 3156 3140 112.132 0 - 0.535244
113 16 3156 3140 111.14 0 - 0.535244
114 16 3156 3140 110.165 0 - 0.535244
115 16 3156 3140 109.207 0 - 0.535244
116 16 3230 3214 110.817 29.6 0.0169969 0.574856
117 16 3311 3295 112.639 324 0.0704851 0.565529
118 16 3311 3295 111.684 0 - 0.565529
119 16 3311 3295 110.746 0 - 0.565529
2018-04-10 08:37:28.112886 min lat: 0.0128562 max lat: 14.7293 avg lat: 0.565529
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
120 16 3311 3295 109.823 0 - 0.565529
121 16 3311 3295 108.915 0 - 0.565529
122 16 3311 3295 108.022 0 - 0.565529
Total time run: 122.568983
Total writes made: 3312
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 108.086
Stddev Bandwidth: 121.191
Max bandwidth (MB/sec): 520
Min bandwidth (MB/sec): 0
Average IOPS: 27
Stddev IOPS: 30
Max IOPS: 130
Min IOPS: 0
Average Latency(s): 0.591771
Stddev Latency(s): 1.74753
Max latency(s): 14.7293
Min latency(s): 0.0128562

AFTER ceph osd down osd.4; ceph osd out osd.4

ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 2.87256 root default
-7 1.14899 host osd02
0 hdd 0.57500 osd.0 up 1.00000 1.00000
1 hdd 0.57500 osd.1 up 1.00000 1.00000
-3 1.14899 host osd03
2 hdd 0.57500 osd.2 up 1.00000 1.00000
3 hdd 0.57500 osd.3 up 1.00000 1.00000
-4 0.57458 host osd04
4 hdd 0.57458 osd.4 up 0 1.00000

2018-04-10 08:46:55.193642 min lat: 0.0156532 max lat: 2.5884 avg lat: 0.310681
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
100 16 5144 5128 205.097 220 0.0372222 0.310681
101 16 5196 5180 205.126 208 0.421245 0.310908
102 16 5232 5216 204.526 144 0.543723 0.311544
103 16 5271 5255 204.055 156 0.465998 0.312394
104 16 5310 5294 203.593 156 0.483188 0.313355
105 16 5357 5341 203.444 188 0.0313209 0.313267
106 16 5402 5386 203.223 180 0.517098 0.313714
107 16 5457 5441 203.379 220 0.0277359 0.313288
108 16 5515 5499 203.644 232 0.470556 0.313203
109 16 5565 5549 203.611 200 0.564713 0.313173
110 16 5606 5590 203.25 164 0.0223166 0.313596
111 16 5659 5643 203.329 212 0.0231103 0.313597
112 16 5703 5687 203.085 176 0.033348 0.314018
113 16 5757 5741 203.199 216 1.53862 0.313991
114 16 5798 5782 202.855 164 0.4711 0.314511
115 16 5852 5836 202.969 216 0.0350226 0.31424
116 16 5912 5896 203.288 240 0.0253188 0.313657
117 16 5964 5948 203.328 208 0.0223623 0.313562
118 16 6024 6008 203.639 240 0.174245 0.313531
119 16 6070 6054 203.473 184 0.712498 0.313582
2018-04-10 08:47:15.195873 min lat: 0.0154679 max lat: 2.5884 avg lat: 0.313564
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
120 16 6120 6104 203.444 200 0.0351212 0.313564
Total time run: 120.551897
Total writes made: 6120
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 203.066
Stddev Bandwidth: 43.8329
Max bandwidth (MB/sec): 480
Min bandwidth (MB/sec): 128
Average IOPS: 50
Stddev IOPS: 10
Max IOPS: 120
Min IOPS: 32
Average Latency(s): 0.314959
Stddev Latency(s): 0.379298
Max latency(s): 2.5884
Min latency(s): 0.0154679

On Tue, 10 Apr 2018 at 07:58, Kai Wagner <kwagner@xxxxxxxx> wrote:

Is this just from one server or from all servers? Just wondering why VD
0 is using WriteThrough compared to the others. If that's the setup for
the OSDs, you already have a cache setup problem.
On 10.04.2018 13:44, Mohamad Gebai wrote:
> megacli -LDGetProp -cache -Lall -a0
>
> Adapter 0-VD 0(target id: 0): Cache Policy:WriteThrough,
> ReadAheadNone, Direct, Write Cache OK if bad BBU
> Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAdaptive,
> Cached, No Write Cache if bad BBU
> Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAdaptive,
> Cached, No Write Cache if bad BBU
> Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAdaptive,
> Cached, No Write Cache if bad BBU
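If the goal were to make VD 0 match the other virtual disks, the policy can be changed on the controller; roughly like this (VD/adapter numbers are taken from the output above, and the exact option spelling can differ between megacli builds, so verify with megacli -h first):

megacli -LDSetProp WB -L0 -a0        # switch VD 0 to WriteBack
megacli -LDGetProp -cache -L0 -a0    # verify the new cache policy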
--
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com