Horace Ng
To: "Paul Emmerich" <paul.emmerich@xxxxxxxx>, "horace" <horace@xxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Thursday, May 24, 2018 3:46:59 PM
Subject: Re: SSD-primary crush rule doesn't work as intended
It will also only work reliably if you use a single-level tree
structure with failure domain "host". If you want, say, separate
data-center failure domains, you need extra steps to make sure an
SSD host and an HDD host do not get selected from the same DC.
I have done such a layout, so it is possible (see my older posts), but you need to be careful when you construct the additional trees that are needed to force the correct selections.
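As a rough sketch of the idea (bucket names are placeholders here, not my actual layout): give each choice its own root so CRUSH cannot land both picks in one DC, e.g.

rule ssd-hybrid-two-dc {
        id 3
        type replicated
        min_size 2
        max_size 3
        # SSD primary taken only from the SSD tree of one DC
        step take dc1-ssd
        step chooseleaf firstn 1 type host
        step emit
        # HDD replicas taken only from the HDD tree of the other DC
        step take dc2-hdd
        step chooseleaf firstn -1 type host
        step emit
}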
In reality, however, even if you force all reads to the SSDs using
primary affinity, you will soon run out of write IOPS on the HDDs.
To keep up with the SSDs you will need so many HDDs for an
average workload that you will not save any money.
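(For reference, primary affinity is a per-OSD value between 0 and 1; an OSD with 0 is not used as primary when it can be avoided. On the cluster below that would be something like:

# ceph osd primary-affinity osd.18 1.0
# ceph osd primary-affinity osd.0 0

for an SSD and an HDD OSD respectively.)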
Regards,
Peter
You can't mix HDDs and SSDs in a server if you want to use such a rule.
The next selection step after "emit" can't know which server was selected previously.
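You can see this for yourself by running the rule through crushtool against the compiled map (crushmap.bin is just an example file name; rule 2 is the id of the ssd-hybrid rule below):

# crushtool -i crushmap.bin --test --rule 2 --num-rep 3 --show-mappings

This prints the OSD sets the rule produces, including the mappings that put two OSDs on the same host.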
Paul
2018-05-23 11:02 GMT+02:00 Horace <horace@xxxxxxxxx>:
To add to the info: I have a slightly modified rule that takes advantage of the new storage classes.
rule ssd-hybrid {
        id 2
        type replicated
        min_size 1
        max_size 10
        step take default class ssd
        step chooseleaf firstn 1 type host
        step emit
        step take default class hdd
        step chooseleaf firstn -1 type host
        step emit
}
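The pool was then pointed at the rule with something like (Luminous syntax; older releases use crush_ruleset instead of crush_rule):

# ceph osd pool set ssd-hybrid crush_rule ssd-hybrid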
Regards,
Horace Ng
----- Original Message -----
From: "horace" <horace@xxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Wednesday, May 23, 2018 3:56:20 PM
Subject: SSD-primary crush rule doesn't work as intended
I've set up the rule according to the doc, but some of the PGs are still being assigned to the same host.
http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/
rule ssd-primary {
        ruleset 5
        type replicated
        min_size 5
        max_size 10
        step take ssd
        step chooseleaf firstn 1 type host
        step emit
        step take platter
        step chooseleaf firstn -1 type host
        step emit
}
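For reference, the rule went in through the usual crush map edit cycle from that page (file names are just examples):

# ceph osd getcrushmap -o crushmap.bin
# crushtool -d crushmap.bin -o crushmap.txt
  ...add the rule above to crushmap.txt...
# crushtool -c crushmap.txt -o crushmap.new
# ceph osd setcrushmap -i crushmap.new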
Crush tree:
[root@ceph0 ~]# ceph osd crush tree
ID CLASS WEIGHT   TYPE NAME
-1       58.63989 root default
-2       19.55095     host ceph0
 0   hdd  2.73000         osd.0
 1   hdd  2.73000         osd.1
 2   hdd  2.73000         osd.2
 3   hdd  2.73000         osd.3
12   hdd  4.54999         osd.12
15   hdd  3.71999         osd.15
18   ssd  0.20000         osd.18
19   ssd  0.16100         osd.19
-3       19.55095     host ceph1
 4   hdd  2.73000         osd.4
 5   hdd  2.73000         osd.5
 6   hdd  2.73000         osd.6
 7   hdd  2.73000         osd.7
13   hdd  4.54999         osd.13
16   hdd  3.71999         osd.16
20   ssd  0.16100         osd.20
21   ssd  0.20000         osd.21
-4       19.53799     host ceph2
 8   hdd  2.73000         osd.8
 9   hdd  2.73000         osd.9
10   hdd  2.73000         osd.10
11   hdd  2.73000         osd.11
14   hdd  3.71999         osd.14
17   hdd  4.54999         osd.17
22   ssd  0.18700         osd.22
23   ssd  0.16100         osd.23
# ceph pg ls-by-pool ssd-hybrid
27.8 1051 0 0 0 0 4399733760 1581 1581 active+clean 2018-05-23 06:20:56.088216 27957'185553 27959:368828 [23,1,11] 23 [23,1,11] 23 27953'182582 2018-05-23 06:20:56.088172 27843'162478 2018-05-20 18:28:20.118632
With osd.23 and osd.11 being assigned to the same host.
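(In the tree above both osd.23 and osd.11 sit under host ceph2; "ceph osd find 23" and "ceph osd find 11" confirm the same crush_location.)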
Regards,
Horace Ng
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90