extract disk usage stats from running ceph cluster

Hi,

We would like to replace the current Seagate ST4000NM0034 HDDs in our Ceph cluster with SSDs. Before doing that, we would like to check the typical usage of our current drives over the past years, so we can select the best (price/performance/endurance) SSD to replace them with.

I am trying to extract this info from the fields "Blocks received from initiator" / "Blocks sent to initiator", as these are the fields smartctl reports for these Seagate disks. But the numbers seem strange, so I would like to ask for feedback here.

Three identical nodes, 8 OSDs per node, all 4 TB ST4000NM0034 (Filestore) HDDs with SSD-based journals:

root@node1:~# ceph osd crush tree
ID CLASS WEIGHT   TYPE NAME
-1       87.35376 root default
-2       29.11688     host node1
 0   hdd  3.64000         osd.0
 1   hdd  3.64000         osd.1
 2   hdd  3.63689         osd.2
 3   hdd  3.64000         osd.3
12   hdd  3.64000         osd.12
13   hdd  3.64000         osd.13
14   hdd  3.64000         osd.14
15   hdd  3.64000         osd.15
-3       29.12000     host node2
 4   hdd  3.64000         osd.4
 5   hdd  3.64000         osd.5
 6   hdd  3.64000         osd.6
 7   hdd  3.64000         osd.7
16   hdd  3.64000         osd.16
17   hdd  3.64000         osd.17
18   hdd  3.64000         osd.18
19   hdd  3.64000         osd.19
-4       29.11688     host node3
 8   hdd  3.64000         osd.8
 9   hdd  3.64000         osd.9
10   hdd  3.64000         osd.10
11   hdd  3.64000         osd.11
20   hdd  3.64000         osd.20
21   hdd  3.64000         osd.21
22   hdd  3.64000         osd.22
23   hdd  3.63689         osd.23
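
For reference, the /dev/sdX device behind each OSD (the "device" column in the spreadsheet further down) can also be looked up from Ceph itself. A rough sketch, assuming "ceph osd metadata" on this release exposes "hostname" and "devices" fields and that jq is installed:

#!/bin/bash
# Print the backing block device(s) for every OSD in the cluster.
# Sketch only: assumes "ceph osd metadata" reports "hostname" and
# "devices" fields (Luminous and later) and that jq is installed.
for id in $(ceph osd ls); do
    meta=$(ceph osd metadata "$id")
    host=$(echo "$meta" | jq -r '.hostname')
    devs=$(echo "$meta" | jq -r '.devices')
    echo "osd.$id  host=$host  devices=$devs"
done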

We are looking at the numbers from smartctl, basing our calculations on this output for each individual OSD:
Vendor (Seagate) cache information
  Blocks sent to initiator = 3783529066
  Blocks received from initiator = 3121186120
  Blocks read from cache and sent to initiator = 545427169
  Number of read and write commands whose size <= segment size = 93877358
  Number of read and write commands whose size > segment size = 2290879
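
Per disk, the calculation boils down to something like the sketch below (it assumes the data disks show up as /dev/sd? and that "smartctl -a" prints exactly the two "initiator" lines shown above):

#!/bin/bash
# Sketch: read the two vendor cache counters from every /dev/sd? disk
# and turn them into a read/write percentage. Assumes SAS drives whose
# "smartctl -a" output contains the "initiator" lines shown above;
# disks without those lines (e.g. the journal SSDs) are skipped.
for dev in /dev/sd?; do
    out=$(smartctl -a "$dev")
    sent=$(echo "$out" | awk -F '=' '/Blocks sent to initiator/ {gsub(/ /,"",$2); print $2}')
    recv=$(echo "$out" | awk -F '=' '/Blocks received from initiator/ {gsub(/ /,"",$2); print $2}')
    if [ -z "$sent" ] || [ -z "$recv" ]; then continue; fi
    total=$((sent + recv))
    awk -v d="$dev" -v s="$sent" -v r="$recv" -v t="$total" \
        'BEGIN {printf "%s sent=%s received=%s read=%.2f%% write=%.2f%%\n",
                d, s, r, 100*s/t, 100*r/t}'
done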

I created the following spreadsheet:

	blocks sent	blocks received	total blocks
	to initiator	from initiator	(calculated)	read%	write%		device
node1
osd0	905060564	1900663448	2805724012	32,26%	67,74%		sda
osd1	2270442418	3756215880	6026658298	37,67%	62,33%		sdb
osd2	3531938448	3940249192	7472187640	47,27%	52,73%		sdc
osd3	2824808123	3130655416	5955463539	47,43%	52,57%		sdd
osd12	1956722491	1294854032	3251576523	60,18%	39,82%		sdg
osd13	3410188306	1265443936	4675632242	72,94%	27,06%		sdh
osd14	3765454090	3115079112	6880533202	54,73%	45,27%		sdi
osd15	2272246730	2218847264	4491093994	50,59%	49,41%		sdj
							
node2							
osd4	3974937107	740853712	4715790819	84,29%	15,71%		sda
osd5	1181377668	2109150744	3290528412	35,90%	64,10%		sdb
osd6	1903438106	608869008	2512307114	75,76%	24,24%		sdc
osd7	3511170043	724345936	4235515979	82,90%	17,10%		sdd
osd16	2642731906	3981984640	6624716546	39,89%	60,11%		sdg
osd17	3994977805	3703856288	7698834093	51,89%	48,11%		sdh
osd18	3992157229	2096991672	6089148901	65,56%	34,44%		sdi
osd19	279766405	1053039640	1332806045	20,99%	79,01%		sdj
							
node3							
osd8	3711322586	234696960	3946019546	94,05%	5,95%		sda
osd9	1203912715	3132990000	4336902715	27,76%	72,24%		sdb
osd10	912356010	1681434416	2593790426	35,17%	64,83%		sdc
osd11	810488345	2626589896	3437078241	23,58%	76,42%		sdd
osd20	1506879946	2421596680	3928476626	38,36%	61,64%		sdg
osd21	2991526593	7525120		2999051713	99,75%	0,25%		sdh
osd22	29560337	3226114552	3255674889	0,91%	99,09%		sdi
osd23	2019195656	2563506320	4582701976	44,06%	55,94%		sdj

But as can be seen above, this results in some very strange numbers: for node3/osd21, node2/osd19 and node3/osd8, for example, the read/write ratios seem very unlikely.

So we are probably doing something wrong in our logic here.

Can someone explain what we are doing wrong? And is it possible to obtain stats like these from Ceph directly? Does Ceph keep historical stats like the above?
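
For example: would something along the lines of the sketch below give comparable numbers, or do those perf counters only cover the time since the OSD daemon was last (re)started? (The counter names are a guess on my part and may differ between releases.)

#!/bin/bash
# Sketch, run on an OSD node: dump the client read/write byte counters
# of every local OSD via the admin socket. The counter names
# (.osd.op_r_out_bytes / .osd.op_w_in_bytes) are an assumption and may
# vary per Ceph release; the values only cover the daemon's uptime.
for sock in /var/run/ceph/ceph-osd.*.asok; do
    id=$(basename "$sock" .asok | cut -d. -f2)
    dump=$(ceph daemon "osd.$id" perf dump)
    rb=$(echo "$dump" | jq '.osd.op_r_out_bytes')
    wb=$(echo "$dump" | jq '.osd.op_w_in_bytes')
    echo "osd.$id  read_bytes=$rb  write_bytes=$wb"
done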

MJ
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


