Thanks for the suggestion, but unfortunately having the same number of OSDs per host did not solve the issue.
Here is the result with 2 OSDs per server across 3 servers - identical servers and OSD configuration:
[root@osd01 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 4.02173 root default
-9 1.14917 host osd01
5 hdd 0.57458 osd.5 up 1.00000 1.00000
6 hdd 0.57458 osd.6 up 1.00000 1.00000
-7 1.14899 host osd02
0 hdd 0.57500 osd.0 up 1.00000 1.00000
1 hdd 0.57500 osd.1 up 1.00000 1.00000
-3 1.14899 host osd03
2 hdd 0.57500 osd.2 up 1.00000 1.00000
3 hdd 0.57500 osd.3 up 1.00000 1.00000
-4 0.57458 host osd04
4 hdd 0.57458 osd.4 up 0 1.00000
[root@osd01 ~]# ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-1 4.02173 - 1176G 89108M 1089G 0 0 - root default
-9 1.14917 - 1176G 89108M 1089G 7.39 1.02 - host osd01
5 hdd 0.57458 1.00000 588G 44498M 544G 7.39 1.02 47 osd.5
6 hdd 0.57458 1.00000 588G 44610M 544G 7.40 1.02 46 osd.6
-7 1.14899 - 1176G 84472M 1094G 7.01 0.96 - host osd02
0 hdd 0.57500 1.00000 588G 42290M 547G 7.02 0.97 35 osd.0
1 hdd 0.57500 1.00000 588G 42182M 547G 7.00 0.96 37 osd.1
-3 1.14899 - 1176G 89320M 1089G 7.41 1.02 - host osd03
2 hdd 0.57500 1.00000 588G 45370M 544G 7.53 1.04 50 osd.2
3 hdd 0.57500 1.00000 588G 43950M 545G 7.29 1.00 41 osd.3
-4 0.57458 - 0 0 0 0 0 - host osd04
4 hdd 0.57458 0 0 0 0 0 0 0 osd.4
TOTAL 4118G 287G 3830G 7.27
MIN/MAX VAR: 0.96/1.04 STDDEV: 0.20
[root@osd01 ~]# rados bench -p rbd 120 write --no-cleanup && rados bench -p rbd 120 seq
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 120 seconds or 0 objects
Object prefix: benchmark_data_osd01.tor.medavail.net_83835
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 52 36 143.993 144 0.01749 0.0387657
2 16 52 36 71.9932 0 - 0.0387657
3 16 62 46 61.3276 20 0.0241346 0.254428
4 16 104 88 87.9915 168 0.0135851 0.646529
5 16 121 105 83.9918 68 0.0152886 0.551564
6 16 131 115 76.6591 40 0.0174347 0.517888
7 16 131 115 65.7078 0 - 0.517888
8 16 152 136 67.9934 42 0.0178455 0.674487
9 16 209 193 85.7693 228 0.0202116 0.640473
10 16 216 200 79.992 28 0.0172787 0.619349
11 16 229 213 77.4468 52 0.0160566 0.674538
12 16 229 213 70.9929 0 - 0.674538
13 16 229 213 65.532 0 - 0.674538
14 16 263 247 70.5645 45.3333 0.127854 0.734526
15 16 272 256 68.26 36 0.044047 0.772968
16 16 282 266 66.4934 40 0.055596 0.753213
17 16 298 282 66.3464 64 0.0185164 0.906061
18 16 303 287 63.7714 20 0.0163462 0.907965
19 16 350 334 70.3088 188 0.0320304 0.907601
2018-04-11 08:46:46.521478 min lat: 0.0135851 max lat: 9.31766 avg lat: 0.807083
On Wed, 11 Apr 2018 at 01:57, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
> ceph osd df tree
> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
> -1 3.44714 - 588G 80693M 509G 0 0 - root default
> -9 0.57458 - 588G 80693M 509G 13.39 1.13 - host osd01
> 5 hdd 0.57458 1.00000 588G 80693M 509G 13.39 1.13 64 osd.5
> -7 1.14899 - 1176G 130G 1046G 11.06 0.94 - host osd02
> 0 hdd 0.57500 1.00000 588G 70061M 519G 11.63 0.98 50 osd.0
> 1 hdd 0.57500 1.00000 588G 63200M 526G 10.49 0.89 41 osd.1
> -3 1.14899 - 1176G 138G 1038G 11.76 1.00 - host osd03
> 2 hdd 0.57500 1.00000 588G 68581M 521G 11.38 0.96 48 osd.2
> 3 hdd 0.57500 1.00000 588G 73185M 516G 12.15 1.03 53 osd.3
> -4 0.57458 - 0 0 0 0 0 - host osd04
> 4 hdd 0.57458 0 0 0 0 0 0 0 osd.4
> By adding new hosts with half the OSDs of the existing hosts, you
> imbalance your CRUSH map.
> osd.4 and osd.5 do double the work compared with the existing hosts if
> your failure domain is host.
> k
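To make the quoted point concrete, here is a rough sketch (illustrative numbers, not taken from the cluster above): with replica size 3, failure domain = host, and only three hosts taking data (osd04 is out), every PG must place one copy on each host, so a host's single OSD carries as many PG copies as two OSDs share on a neighbouring host.

```python
# Illustrative sketch of per-OSD PG load when the failure domain is host
# and replica size equals the number of usable hosts. PG count and the
# host layout below are assumptions for the example, not cluster data.
pgs = 64                                      # PGs in the pool (hypothetical)
hosts = {"osd01": 1, "osd02": 2, "osd03": 2}  # OSDs per host

# Each host receives exactly one copy of every PG, split among its OSDs,
# so per-OSD load depends only on how many OSDs the host has.
per_osd = {host: pgs / n_osds for host, n_osds in hosts.items()}
print(per_osd)  # {'osd01': 64.0, 'osd02': 32.0, 'osd03': 32.0}
```

The lone OSD on osd01 ends up with twice the PGs of each OSD on the two-OSD hosts, which matches the skew visible in the quoted `ceph osd df tree` output.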
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com