Hi all,
I am looking at a way to scale performance and usable space by using something like primary affinity to effectively run 3x replication across 1 primary SSD OSD and 2 replicated HDD OSDs.
If we take this to production, we would keep a fairly close 1:2 SSD:HDD ratio, but first I want to experiment with it on my non-production cluster.
Jewel 10.2.6; Ubuntu 16.04; 4.4 kernel (some nodes on 4.8); 2x 10G Ethernet to each node.
First of all, is this even a valid architecture decision? Obviously the SSDs will wear out faster, but with proper endurance ratings (3-5 DWPD) and an eye kept on them, this should boost performance considerably compared to spinning disks, while letting me deliver that performance at roughly half the cost of all-flash for the capacity levels I am trying to hit.
Currently in CRUSH I have an all-HDD root and an all-SSD root for this Ceph cluster.
I assume that I cannot simply cross-pollinate these two root-level CRUSH hierarchies to do quick performance benchmarks?
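If it is possible, I imagine it would be a single rule that takes from both roots, something like the sketch below. This is untested on my part, and the rule name "hybrid", the ruleset number, and the pool name/PG counts are just placeholders I made up for illustration:

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt

# added to crushmap.txt: one copy from the ssd root, remaining copies from the default (HDD) root
rule hybrid {
        ruleset 5
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 1 type host
        step emit
        step take default
        step chooseleaf firstn -1 type host
        step emit
}

$ crushtool -c crushmap.txt -o crushmap.new
$ ceph osd setcrushmap -i crushmap.new
$ ceph osd pool create hybrid-test 128 128 replicated hybrid
$ ceph osd pool set hybrid-test size 3

If a rule like that works, I would expect the SSD copy to end up as the acting primary anyway, since it is emitted first, which is part of what I want to verify.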
And I am assuming that to actually make this work in production, I would set primary affinity to 1 for any SSD OSD that I want the acting copy of my PGs to live on, set primary affinity to 0 for my HDD OSDs, and CRUSH would sort that out while still getting all of the data to size=3.
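For reference, the knobs I believe are involved look like this (the OSD IDs are just examples pulled from my tree below, and my understanding is that Jewel still wants the mon flag enabled before primary-affinity can be changed):

$ ceph tell mon.* injectargs '--mon_osd_allow_primary_affinity=true'
# ('mon osd allow primary affinity = true' in ceph.conf to make it persistent)

# prefer an SSD OSD as primary (osd.24 is one of my SSD OSDs)
$ ceph osd primary-affinity osd.24 1.0

# keep an HDD OSD from being chosen as primary (osd.0 is one of my HDD OSDs)
$ ceph osd primary-affinity osd.0 0.0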
Does anyone have this running in production?
Anyone have any comments/concerns/issues with this?
Any comparisons between this and cache-tiering?
Workload is pretty simple, mostly RADOS object store, with CephFS as well.
In testing we found that the 8TB HDDs were not very conducive to our workloads; things improved with more scale, but were still very slow (even with NVMe journals).
And for the record, these are the Seagate Enterprise Capacity drives, so PMR, not SMR (ST8000NM0065).
So I am trying to find the easiest way to test/benchmark the feasibility of this hybrid/primary-affinity architecture in the lab, to get a better understanding before moving forward.
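My rough plan is just rados bench runs against a pool using the hybrid rule versus pools on the pure-SSD and pure-HDD rules, along these lines (pool name carried over from the sketch above; durations and thread counts are arbitrary):

# 60s of 4 MB writes, keeping the objects around for the read tests
$ rados bench -p hybrid-test 60 write -b 4194304 -t 16 --no-cleanup

# sequential and random reads against what was just written
$ rados bench -p hybrid-test 60 seq -t 16
$ rados bench -p hybrid-test 60 rand -t 16

# remove the benchmark objects afterwards
$ rados -p hybrid-test cleanup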
Any insight is appreciated.
Thanks,
Reed
$ ceph osd tree
 ID    WEIGHT TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
-13  52.37358 root ssd
-11  52.37358     rack ssd.rack2
-14  17.45700         host ceph00
 24   1.74599             osd.24                 up  1.00000          1.00000
 25   1.74599             osd.25                 up  1.00000          1.00000
 26   1.74599             osd.26                 up  1.00000          1.00000
 27   1.74599             osd.27                 up  1.00000          1.00000
 28   1.74599             osd.28                 up  1.00000          1.00000
 29   1.74599             osd.29                 up  1.00000          1.00000
 30   1.74599             osd.30                 up  1.00000          1.00000
 31   1.74599             osd.31                 up  1.00000          1.00000
 32   1.74599             osd.32                 up  1.00000          1.00000
 33   1.74599             osd.33                 up  1.00000          1.00000
-15  17.45700         host ceph01
 34   1.74599             osd.34                 up  1.00000          1.00000
 35   1.74599             osd.35                 up  1.00000          1.00000
 36   1.74599             osd.36                 up  1.00000          1.00000
 37   1.74599             osd.37                 up  1.00000          1.00000
 38   1.74599             osd.38                 up  1.00000          1.00000
 39   1.74599             osd.39                 up  1.00000          1.00000
 40   1.74599             osd.40                 up  1.00000          1.00000
 41   1.74599             osd.41                 up  1.00000          1.00000
 42   1.74599             osd.42                 up  1.00000          1.00000
 43   1.74599             osd.43                 up  1.00000          1.00000
-16  17.45958         host ceph02
 45   1.74599             osd.45                 up  1.00000          1.00000
 46   1.74599             osd.46                 up  1.00000          1.00000
 47   1.74599             osd.47                 up  1.00000          1.00000
 48   1.74599             osd.48                 up  1.00000          1.00000
 49   1.74599             osd.49                 up  1.00000          1.00000
 50   1.74599             osd.50                 up  1.00000          1.00000
 51   1.74599             osd.51                 up  1.00000          1.00000
 52   1.74599             osd.52                 up  1.00000          1.00000
 53   1.74599             osd.53                 up  1.00000          1.00000
 44   1.74570             osd.44                 up  1.00000          1.00000
-10         0 rack default.rack2
-12         0     chassis default.rack2.U16
 -1 174.51492 root default
 -2  21.81000     host node24
  0   7.26999         osd.0                      up  1.00000          1.00000
  8   7.26999         osd.8                      up  1.00000          1.00000
 16   7.26999         osd.16                     up  1.00000          1.00000
 -3  21.81000     host node25
  1   7.26999         osd.1                      up  1.00000          1.00000
  9   7.26999         osd.9                      up  1.00000          1.00000
 17   7.26999         osd.17                     up  1.00000          1.00000
 -4  21.81999     host node26
 10   7.26999         osd.10                     up  1.00000          1.00000
 18   7.27499         osd.18                     up  1.00000          1.00000
  2   7.27499         osd.2                      up  1.00000          1.00000
 -5  21.81499     host node27
  3   7.26999         osd.3                      up  1.00000          1.00000
 11   7.26999         osd.11                     up  1.00000          1.00000
 19   7.27499         osd.19                     up  1.00000          1.00000
 -6  21.81499     host node28
  4   7.26999         osd.4                      up  1.00000          1.00000
 12   7.26999         osd.12                     up  1.00000          1.00000
 20   7.27499         osd.20                     up  1.00000          1.00000
 -7  21.81499     host node29
  5   7.26999         osd.5                      up  1.00000          1.00000
 13   7.26999         osd.13                     up  1.00000          1.00000
 21   7.27499         osd.21                     up  1.00000          1.00000
 -8  21.81499     host node30
  6   7.26999         osd.6                      up  1.00000          1.00000
 14   7.26999         osd.14                     up  1.00000          1.00000
 22   7.27499         osd.22                     up  1.00000          1.00000
 -9  21.81499     host node31
  7   7.26999         osd.7                      up  1.00000          1.00000
 15   7.26999         osd.15                     up  1.00000          1.00000
 23   7.27499         osd.23                     up  1.00000          1.00000