SSD Primary Affinity

Hi all,

I am looking at a way to scale both performance and usable space by using something like primary affinity to get effective 3x replication with one primary copy on an SSD OSD and two replicas on HDD OSDs.

In production we would keep a fairly close 1:2 SSD:HDD ratio, but first I'm looking to experiment on part of my non-production cluster.
Jewel 10.2.6; Ubuntu 16.04; 4.4 kernel, with some nodes on 4.8; 2x10G Ethernet to each node.

First of all, is this even a valid architecture decision? Obviously the SSDs will wear faster, but with proper endurance ratings (3-5 DWPD) and some monitoring, it should boost performance considerably compared to spinning disks, while letting me deliver that performance at roughly half the cost of going all-flash for the capacity I'm aiming to hit.

Currently in CRUSH I have an all-HDD root and an all-SSD root for this Ceph cluster.

Am I right in assuming that I cannot simply cross-pollinate these two root-level CRUSH trees as they stand to do quick performance benchmarks?
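
If mixing them does require a custom rule, what I had in mind for the lab is the "ssd-primary" style rule from the CRUSH documentation: a single replicated rule that emits once from the ssd root for the primary and again from the default (HDD) root for the remaining replicas. A rough sketch against my tree, untested, with the ruleset number, PG counts, and the bench-hybrid pool name just placeholders:

rule ssd-primary {
        ruleset 5
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 1 type host
        step emit
        step take default
        step chooseleaf firstn -1 type host
        step emit
}

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt    # add the rule above to crushmap.txt
$ crushtool -c crushmap.txt -o crushmap.new
$ ceph osd setcrushmap -i crushmap.new
$ ceph osd pool create bench-hybrid 256 256 replicated ssd-primary

With size=3 that should put the first (primary) copy under the ssd root and the remaining two under default.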

And I am assuming that if I wanted to actually make this work in production, I would set primary affinity to 1 on the SSD OSDs where I want the primary copy of each PG to live, set primary affinity to 0 on the HDD OSDs, and Ceph would sort out primary selection from there while still replicating everything to size=3.
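
If so, I think the knob-turning would look something like the sketch below. As I understand it, primary affinity is ignored until the mon flag is enabled, and it only influences which OSD in a PG's acting set becomes primary, so the pool's CRUSH rule still has to land copies on both the SSD and HDD OSDs. The OSD IDs are just the HDD ones from my tree below:

# enable primary affinity (off by default on Jewel):
$ ceph tell mon.\* injectargs '--mon_osd_allow_primary_affinity=true'
# and persist it in ceph.conf under [mon]:
#   mon osd allow primary affinity = true

# leave the SSD OSDs at the default affinity of 1, and drop the HDD
# OSDs (osd.0 through osd.23 here) to 0 so they only serve as replicas:
$ for id in $(seq 0 23); do ceph osd primary-affinity osd.$id 0; done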

Does anyone have this running in production?
Anyone have any comments/concerns/issues with this?
Any comparisons between this and cache-tiering?

The workload is pretty simple: mostly RADOS object store, with some CephFS as well.
In testing we found the 8TB HDDs were not very well suited to our workloads; things improved with more scale, but were still very slow (even with NVMe journals).
And for the record, these are the Seagate Enterprise Capacity drives, so PMR, not SMR (ST8000NM0065).

So I'm trying to find the easiest way to test/benchmark the feasibility of this hybrid/primary-affinity architecture in the lab and get a better understanding before moving forward.
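
My current plan is just to create a pool per layout (all-HDD, all-SSD, and the hybrid rule above) and run rados bench against each from a client node, something along these lines (pool name and run lengths are arbitrary):

$ rados bench -p bench-hybrid 60 write --no-cleanup
$ rados bench -p bench-hybrid 60 seq
$ rados bench -p bench-hybrid 60 rand
$ rados -p bench-hybrid cleanup

If there's a better way to exercise the primary-affinity path specifically, I'm open to suggestions.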

Any insight is appreciated.

Thanks,

Reed


$ ceph osd tree
ID  WEIGHT    TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
-13  52.37358 root ssd
-11  52.37358     rack ssd.rack2
-14  17.45700         host ceph00
 24   1.74599             osd.24                 up  1.00000          1.00000
 25   1.74599             osd.25                 up  1.00000          1.00000
 26   1.74599             osd.26                 up  1.00000          1.00000
 27   1.74599             osd.27                 up  1.00000          1.00000
 28   1.74599             osd.28                 up  1.00000          1.00000
 29   1.74599             osd.29                 up  1.00000          1.00000
 30   1.74599             osd.30                 up  1.00000          1.00000
 31   1.74599             osd.31                 up  1.00000          1.00000
 32   1.74599             osd.32                 up  1.00000          1.00000
 33   1.74599             osd.33                 up  1.00000          1.00000
-15  17.45700         host ceph01
 34   1.74599             osd.34                 up  1.00000          1.00000
 35   1.74599             osd.35                 up  1.00000          1.00000
 36   1.74599             osd.36                 up  1.00000          1.00000
 37   1.74599             osd.37                 up  1.00000          1.00000
 38   1.74599             osd.38                 up  1.00000          1.00000
 39   1.74599             osd.39                 up  1.00000          1.00000
 40   1.74599             osd.40                 up  1.00000          1.00000
 41   1.74599             osd.41                 up  1.00000          1.00000
 42   1.74599             osd.42                 up  1.00000          1.00000
 43   1.74599             osd.43                 up  1.00000          1.00000
-16  17.45958         host ceph02
 45   1.74599             osd.45                 up  1.00000          1.00000
 46   1.74599             osd.46                 up  1.00000          1.00000
 47   1.74599             osd.47                 up  1.00000          1.00000
 48   1.74599             osd.48                 up  1.00000          1.00000
 49   1.74599             osd.49                 up  1.00000          1.00000
 50   1.74599             osd.50                 up  1.00000          1.00000
 51   1.74599             osd.51                 up  1.00000          1.00000
 52   1.74599             osd.52                 up  1.00000          1.00000
 53   1.74599             osd.53                 up  1.00000          1.00000
 44   1.74570             osd.44                 up  1.00000          1.00000
-10         0 rack default.rack2
-12         0     chassis default.rack2.U16
 -1 174.51492 root default
 -2  21.81000     host node24
  0   7.26999         osd.0                      up  1.00000          1.00000
  8   7.26999         osd.8                      up  1.00000          1.00000
 16   7.26999         osd.16                     up  1.00000          1.00000
 -3  21.81000     host node25
  1   7.26999         osd.1                      up  1.00000          1.00000
  9   7.26999         osd.9                      up  1.00000          1.00000
 17   7.26999         osd.17                     up  1.00000          1.00000
 -4  21.81999     host node26
 10   7.26999         osd.10                     up  1.00000          1.00000
 18   7.27499         osd.18                     up  1.00000          1.00000
  2   7.27499         osd.2                      up  1.00000          1.00000
 -5  21.81499     host node27
  3   7.26999         osd.3                      up  1.00000          1.00000
 11   7.26999         osd.11                     up  1.00000          1.00000
 19   7.27499         osd.19                     up  1.00000          1.00000
 -6  21.81499     host node28
  4   7.26999         osd.4                      up  1.00000          1.00000
 12   7.26999         osd.12                     up  1.00000          1.00000
 20   7.27499         osd.20                     up  1.00000          1.00000
 -7  21.81499     host node29
  5   7.26999         osd.5                      up  1.00000          1.00000
 13   7.26999         osd.13                     up  1.00000          1.00000
 21   7.27499         osd.21                     up  1.00000          1.00000
 -8  21.81499     host node30
  6   7.26999         osd.6                      up  1.00000          1.00000
 14   7.26999         osd.14                     up  1.00000          1.00000
 22   7.27499         osd.22                     up  1.00000          1.00000
 -9  21.81499     host node31
  7   7.26999         osd.7                      up  1.00000          1.00000
 15   7.26999         osd.15                     up  1.00000          1.00000
 23   7.27499         osd.23                     up  1.00000          1.00000
