Re: extract disk usage stats from running ceph cluster

Hi Muhammad,

Yes, that tool helps! Thank you for pointing it out!

With a combination of openSeaChest_Info and smartctl I was able to extract the following stats from our cluster (a rough sketch of how we collected them follows the tables), and the numbers are very surprising to me. I hope someone here can explain what we see below:

node1	AnnualWrkload	Read	Written		Power On Hours	
osd0	 93.14		318.79	  19.48		31815.65	
osd1	 94.38		322.67	  20.11		31815.42	
osd2	 41.08		 38.95	  11.33		10722.47	new disk
osd3	 94.56		323.98	  19.45		31815.35	
osd12	124.20		340.11	  20.09		25406.73	
osd13	112.43		308.18	  17.88		25405.72	
osd14	120.67		330.96	  19.01		25405.65	
osd15	105.59		287.78	  18.45		25405.90	
ssd journal		  0.46	1643.58		31813.00	
					
node2					
osd4	697.75		2390	 151.23		31864.88	(2.39PB)
osd5	677.74		2320	 144.94		31864.68	(2.32PB)
osd6	687.13		2340	 157.11		31865.05	(2.34PB)
osd7	619.19		2100	 151.08		31864.67	(2.10PB)
osd16	827.57		2260	 142.81		25405.93	(2.26PB)
osd17	996.03		2720	 167.97		25405.87	(2.72PB)
osd18	809.36		2210	 137.96		25405.82	(2.21PB)
osd19	844.06		2300	 146.84		25405.90	(2.30PB)
ssd journal		0.46	1637.60		31862.00	
					
node3					
osd8	 75.30		258.79	  14.67		31813.67	
osd9	 77.30		264.87	  15.85		31813.68	
osd10	 82.32		282.43	  16.53		31813.60	
osd11	 82.26		282.72	  16.01		31813.73	
osd20	 96.86		265.25	  15.65		25404.37	
osd21	 93.18		256.11	  14.12		25404.22	
osd22	108.43		298.29	  16.15		25404.23	
osd23	 30.80		 33.61	  10.78		12625.07	new disk
ssd journal		  0.46	1644.83		31811.00	
AnnualWrkload = Annualized Workload Rate (TB/year)
Read = Total Bytes Read (TB)
Written = Total Bytes Written (TB)
Power On Hours = total hours the drive has been powered on
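
For reference, this is roughly how we collected the per-drive values. It is only a sketch: the device names are examples for our hosts, and the exact openSeaChest_Info/smartctl flags and field names can differ per tool version and drive vendor.

for d in /dev/sd{a..h}; do
    echo "== $d =="
    # lifetime workload counters as reported by the drive
    openSeaChest_Info -d "$d" -i | grep -iE 'Annualized Workload|Total Bytes (Read|Written)'
    # power-on hours (SATA attribute 9, or "Accumulated power on time" on SAS)
    smartctl -a "$d" | grep -iE 'Power_On_Hours|Accumulated power on time'
done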

From the numbers above, it seems the OSDs on node2 are used INCREDIBLY much more than those on the other two nodes. The drives on node2 even report their totals in PB, while the other nodes report in TB (we converted the PB values to TB using https://www.gbmb.org/pb-to-tb, to make sure there are no conversion errors).
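
A decimal/binary mix-up cannot explain it either: even if osd4's 2.39 "PB" were actually binary PiB, that would still be roughly 2691 TB, about eight times what the node1/node3 drives report. For example:

awk 'BEGIN { printf "2.39 PB  = %.0f TB\n", 2.39 * 1000 }'         # decimal: 2390 TB
awk 'BEGIN { printf "2.39 PiB = %.0f TB\n", 2.39 * 2^50 / 1e12 }'  # binary:  ~2691 TB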

However, SSD journal usage across the three nodes looks similar.

All OSDs have the same weight:
root@node2:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       87.35376 root default
-2       29.11688     host pm1
 0   hdd  3.64000         osd.0      up  1.00000 1.00000
 1   hdd  3.64000         osd.1      up  1.00000 1.00000
 2   hdd  3.63689         osd.2      up  1.00000 1.00000
 3   hdd  3.64000         osd.3      up  1.00000 1.00000
12   hdd  3.64000         osd.12     up  1.00000 1.00000
13   hdd  3.64000         osd.13     up  1.00000 1.00000
14   hdd  3.64000         osd.14     up  1.00000 1.00000
15   hdd  3.64000         osd.15     up  1.00000 1.00000
-3       29.12000     host pm2
 4   hdd  3.64000         osd.4      up  1.00000 1.00000
 5   hdd  3.64000         osd.5      up  1.00000 1.00000
 6   hdd  3.64000         osd.6      up  1.00000 1.00000
 7   hdd  3.64000         osd.7      up  1.00000 1.00000
16   hdd  3.64000         osd.16     up  1.00000 1.00000
17   hdd  3.64000         osd.17     up  1.00000 1.00000
18   hdd  3.64000         osd.18     up  1.00000 1.00000
19   hdd  3.64000         osd.19     up  1.00000 1.00000
-4       29.11688     host pm3
 8   hdd  3.64000         osd.8      up  1.00000 1.00000
 9   hdd  3.64000         osd.9      up  1.00000 1.00000
10   hdd  3.64000         osd.10     up  1.00000 1.00000
11   hdd  3.64000         osd.11     up  1.00000 1.00000
20   hdd  3.64000         osd.20     up  1.00000 1.00000
21   hdd  3.64000         osd.21     up  1.00000 1.00000
22   hdd  3.64000         osd.22     up  1.00000 1.00000
23   hdd  3.63689         osd.23     up  1.00000 1.00000
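
(In case it helps anyone reproduce this: the weights can also be pulled out programmatically. A sketch, assuming jq is available and that ceph osd tree -f json exposes crush_weight/reweight fields as it does on our version:)

ceph osd tree -f json | jq -r '.nodes[] | select(.type=="osd") | [.name, .crush_weight, .reweight] | @tsv'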

Disk usage also looks ok:
root@pm2:~# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 0   hdd 3.64000  1.00000 3.64TiB 2.01TiB 1.62TiB 55.34 0.98 137
 1   hdd 3.64000  1.00000 3.64TiB 2.09TiB 1.54TiB 57.56 1.02 141
 2   hdd 3.63689  1.00000 3.64TiB 1.92TiB 1.72TiB 52.79 0.94 128
 3   hdd 3.64000  1.00000 3.64TiB 2.07TiB 1.57TiB 56.90 1.01 143
12   hdd 3.64000  1.00000 3.64TiB 2.15TiB 1.48TiB 59.18 1.05 138
13   hdd 3.64000  1.00000 3.64TiB 1.99TiB 1.64TiB 54.80 0.97 131
14   hdd 3.64000  1.00000 3.64TiB 1.93TiB 1.70TiB 53.13 0.94 127
15   hdd 3.64000  1.00000 3.64TiB 2.19TiB 1.45TiB 60.10 1.07 143
 4   hdd 3.64000  1.00000 3.64TiB 2.11TiB 1.53TiB 57.97 1.03 142
 5   hdd 3.64000  1.00000 3.64TiB 1.97TiB 1.67TiB 54.11 0.96 134
 6   hdd 3.64000  1.00000 3.64TiB 2.12TiB 1.51TiB 58.40 1.04 142
 7   hdd 3.64000  1.00000 3.64TiB 1.97TiB 1.66TiB 54.28 0.97 128
16   hdd 3.64000  1.00000 3.64TiB 2.00TiB 1.64TiB 54.90 0.98 133
17   hdd 3.64000  1.00000 3.64TiB 2.33TiB 1.30TiB 64.14 1.14 153
18   hdd 3.64000  1.00000 3.64TiB 1.97TiB 1.67TiB 54.07 0.96 132
19   hdd 3.64000  1.00000 3.64TiB 1.89TiB 1.75TiB 51.94 0.92 124
 8   hdd 3.64000  1.00000 3.64TiB 1.79TiB 1.85TiB 49.24 0.88 123
 9   hdd 3.64000  1.00000 3.64TiB 2.17TiB 1.46TiB 59.72 1.06 144
10   hdd 3.64000  1.00000 3.64TiB 2.40TiB 1.24TiB 65.88 1.17 157
11   hdd 3.64000  1.00000 3.64TiB 2.06TiB 1.58TiB 56.64 1.01 133
20   hdd 3.64000  1.00000 3.64TiB 2.19TiB 1.45TiB 60.23 1.07 148
21   hdd 3.64000  1.00000 3.64TiB 1.74TiB 1.90TiB 47.80 0.85 115
22   hdd 3.64000  1.00000 3.64TiB 2.05TiB 1.59TiB 56.27 1.00 138
23   hdd 3.63689  1.00000 3.64TiB 1.96TiB 1.67TiB 54.01 0.96 130
                    TOTAL 87.3TiB 49.1TiB 38.2TiB 56.23
MIN/MAX VAR: 0.85/1.17  STDDEV: 4.08

The cluster is HEALTH_OK and seems to be working fine.

When comparing "iostat -x 1" between node2 and the other two nodes, we see similar %util for the OSD disks on all nodes.
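
As a next step we could also compare actual write volume straight from the kernel's block-layer counters instead of SMART. A sketch (bash; device names are examples, and it assumes the usual /sys/block/<dev>/stat layout where field 7 is 512-byte sectors written):

devs="sdb sdc sdd sde"
declare -A before
for d in $devs; do
    before[$d]=$(awk '{print $7}' /sys/block/$d/stat)   # sectors written so far
done
sleep 600
for d in $devs; do
    after=$(awk '{print $7}' /sys/block/$d/stat)
    echo "$d: $(( (after - before[$d]) * 512 / 1048576 )) MiB written in 10 min"
done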

How can the reported disk stats for node2 be SO different from those of the other two nodes, while everything else seems to be running as it should?

Or are we missing something?

Thanks!

MJ
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


