Re: extract disk usage stats from running ceph cluster

Try from the admin node:
 
ceph osd df
ceph osd status
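
For example (a rough sketch; the --format flag assumes a reasonably recent ceph CLI):

ceph osd df                       # per-OSD weight, %USE and PG count
ceph osd status                   # per-OSD host, used/avail and current rd/wr activity
ceph osd df --format json-pretty  # machine-readable, e.g. for collecting snapshots over time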
Thanks, Joe
 

>>> <ceph@xxxxxxxxxx> 2/10/2020 10:44 AM >>>
Hello MJ,

Perhaps your PGs are unbalanced?

ceph osd df tree
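
For example (a rough sketch; %USE and PGS are the column names in recent releases and may differ on older ones):

ceph osd df tree
# a wide spread in the %USE or PGS columns between the OSDs usually
# points at an uneven PG distribution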

Greetz
Mehmet 

On 10 February 2020 14:58:25 CET, lists <lists@xxxxxxxxxxxxx> wrote:
>Hi,
>
>We would like to replace the current Seagate ST4000NM0034 HDDs in our
>ceph cluster with SSDs, and before doing that, we would like to check
>out the typical usage of our current drives over the past years, so we
>can select the best (price/performance/endurance) SSD to replace them
>with.
>
>I am trying to extract this info from the fields "Blocks received from
>initiator" / "Blocks sent to initiator", as these are the fields
>smartctl reports for the Seagate disks. But the numbers seem strange,
>and I would like to request feedback here.
>
>Three nodes, all equal, 8 OSDs per node, all 4TB ST4000NM0034 
>(filestore) HDDs with SSD-based journals:
>
>> root@node1:~# ceph osd crush tree
>> ID CLASS WEIGHT   TYPE NAME
>> -1       87.35376 root default
>> -2       29.11688     host node1
>>  0   hdd  3.64000         osd.0
>>  1   hdd  3.64000         osd.1
>>  2   hdd  3.63689         osd.2
>>  3   hdd  3.64000         osd.3
>> 12   hdd  3.64000         osd.12
>> 13   hdd  3.64000         osd.13
>> 14   hdd  3.64000         osd.14
>> 15   hdd  3.64000         osd.15
>> -3       29.12000     host node2
>>  4   hdd  3.64000         osd.4
>>  5   hdd  3.64000         osd.5
>>  6   hdd  3.64000         osd.6
>>  7   hdd  3.64000         osd.7
>> 16   hdd  3.64000         osd.16
>> 17   hdd  3.64000         osd.17
>> 18   hdd  3.64000         osd.18
>> 19   hdd  3.64000         osd.19
>> -4       29.11688     host node3
>>  8   hdd  3.64000         osd.8
>>  9   hdd  3.64000         osd.9
>> 10   hdd  3.64000         osd.10
>> 11   hdd  3.64000         osd.11
>> 20   hdd  3.64000         osd.20
>> 21   hdd  3.64000         osd.21
>> 22   hdd  3.64000         osd.22
>> 23   hdd  3.63689         osd.23
>
>We are looking at the numbers from smartctl, basing our calculations on
>the following output for each individual OSD:
>> Vendor (Seagate) cache information
>>   Blocks sent to initiator = 3783529066
>>   Blocks received from initiator = 3121186120
>>   Blocks read from cache and sent to initiator = 545427169
>>   Number of read and write commands whose size <= segment size = 93877358
>>   Number of read and write commands whose size > segment size = 2290879
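>
>This is roughly how we pull those two counters per disk and compute the
>split (a rough sketch; device names and the exact smartctl wording may
>differ per drive/firmware):
>
>> for dev in sda sdb sdc sdd sdg sdh sdi sdj; do
>>   smartctl -a /dev/$dev | awk -v d=$dev '
>>     /Blocks sent to initiator/       { sent = $NF }
>>     /Blocks received from initiator/ { recv = $NF }
>>     END { total = sent + recv
>>           printf "%s read%%=%.2f write%%=%.2f\n", d, 100*sent/total, 100*recv/total }'
>> done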
>
>I created the following spreadsheet:
>
>>          blocks sent     blocks received   total blocks
>>          to initiator    from initiator    calculated      read%    write%    aka
>> node1
>> osd0      905060564      1900663448        2805724012      32.26%   67.74%    sda
>> osd1     2270442418      3756215880        6026658298      37.67%   62.33%    sdb
>> osd2     3531938448      3940249192        7472187640      47.27%   52.73%    sdc
>> osd3     2824808123      3130655416        5955463539      47.43%   52.57%    sdd
>> osd12    1956722491      1294854032        3251576523      60.18%   39.82%    sdg
>> osd13    3410188306      1265443936        4675632242      72.94%   27.06%    sdh
>> osd14    3765454090      3115079112        6880533202      54.73%   45.27%    sdi
>> osd15    2272246730      2218847264        4491093994      50.59%   49.41%    sdj
>>
>> node2
>> osd4     3974937107       740853712        4715790819      84.29%   15.71%    sda
>> osd5     1181377668      2109150744        3290528412      35.90%   64.10%    sdb
>> osd6     1903438106       608869008        2512307114      75.76%   24.24%    sdc
>> osd7     3511170043       724345936        4235515979      82.90%   17.10%    sdd
>> osd16    2642731906      3981984640        6624716546      39.89%   60.11%    sdg
>> osd17    3994977805      3703856288        7698834093      51.89%   48.11%    sdh
>> osd18    3992157229      2096991672        6089148901      65.56%   34.44%    sdi
>> osd19     279766405      1053039640        1332806045      20.99%   79.01%    sdj
>>
>> node3
>> osd8     3711322586       234696960        3946019546      94.05%    5.95%    sda
>> osd9     1203912715      3132990000        4336902715      27.76%   72.24%    sdb
>> osd10     912356010      1681434416        2593790426      35.17%   64.83%    sdc
>> osd11     810488345      2626589896        3437078241      23.58%   76.42%    sdd
>> osd20    1506879946      2421596680        3928476626      38.36%   61.64%    sdg
>> osd21    2991526593         7525120        2999051713      99.75%    0.25%    sdh
>> osd22      29560337      3226114552        3255674889       0.91%   99.09%    sdi
>> osd23    2019195656      2563506320        4582701976      44.06%   55.94%    sdj
>
>But as can be seen above, this results in some very strange numbers:
>for example node3/osd21, node2/osd19 and node3/osd8 look quite
>unlikely.
>
>So we are probably doing something wrong in our logic here.
>
>Can someone explain what we're doing wrong, and is it possible to
>obtain stats like these from ceph directly as well? Does ceph keep
>historical stats like the above?
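>
>(Something along these lines is what we are hoping for, if such
>counters exist at all; the counter names below are a guess, and they
>presumably reset whenever an OSD restarts, so they would not cover the
>full lifetime of the drives:)
>
>> # run on the node hosting the OSD, via its admin socket
>> ceph daemon osd.0 perf dump | grep -E '"op_(in|out)_bytes"'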
>
>MJ

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


