osd daemons still reading disks at full speed while there is no pool activity

Hello fellow ceph users,

I'm trying to catch a ghost here. On one of our clusters (6 nodes, 14.2.15,
EC pool 4+2, 6 x 32 SATA BlueStore OSDs) we got into a very strange state.

The cluster is clean (except for a "pgs not deep-scrubbed in time" warning,
since we've disabled scrubbing while investigating) and there is absolutely
no activity on the EC pool, yet according to atop all OSDs are still reading
furiously, without any apparent reason. Even with increased OSD log levels
I don't see anything interesting, except for the occasional heartbeat line like
2021-11-03 12:04:52.664 7fb8652e3700  5 osd.0 9347 heartbeat osd_stat(store_statfs(0xb80056c0000/0x26b570000/0xe8d7fc00000, data 0x2f0ddd813e8/0x30b0ee60000, compress 0x0/0x0/0x0, omap 0x98b706, meta 0x26abe48fa), peers [1,26,27,34,36,40,44,49,52,55,57,65,69,75,76,78,82,83,87,93,96,97,104,105,107,108,111,112,114,120,121,122,123,135,136,137,143,147,154,156,157,169,171,187,192,196,200,204,208,212,217,218,220,222,224,226,227] op hist [])
and compaction stats.
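
(for reference, a sketch of how the log level can be bumped and the per-OSD
bluestore/rocksdb counters dumped; osd.0 below is just an example, not the
exact commands I ran:)

  ceph tell osd.0 config set debug_osd 10        # raise OSD log verbosity
  ceph tell osd.0 config set debug_bluestore 10
  # on the node hosting osd.0, via the admin socket:
  ceph daemon osd.0 perf dump bluestore          # bluestore read/write counters
  ceph daemon osd.0 perf dump rocksdb            # rocksdb / compaction counters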

Trying to sequentially read data from the pool gives very poor performance (i.e. ~8 MB/s).
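
(for illustration, one naive way to do such an object-level read; the pool name
is just a placeholder, and this is a sketch rather than my exact test:)

  rados -p <ecpool> ls | head -n 100 > /tmp/objs
  while read obj; do rados -p <ecpool> get "$obj" /dev/null; done < /tmp/objs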

We've had a very similar problem on a different cluster (replicated, no EC) when
osdmaps were not being pruned correctly, but I checked and those seem to be OK here;
the OSDs are simply reading something all the time and I'm unable to find out what.
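
(for completeness, a quick way to compare the OSDs' oldest/newest maps with what
the monitors have committed; osd.0 is just an example:)

  ceph daemon osd.0 status        # includes oldest_map / newest_map
  ceph report 2>/dev/null | grep -E '"osdmap_(first|last)_committed"'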

here's the 'ceph osd df tree' output for one node, the others are pretty similar:

 ID  CLASS    WEIGHT  REWEIGHT    SIZE  RAW USE     DATA    OMAP     META   AVAIL  %USE  VAR PGS  STATUS TYPE NAME
 -1       2803.19824        - 2.7 PiB  609 TiB  607 TiB 1.9 GiB  1.9 TiB 2.1 PiB 21.78 1.01   -        root sata
 -2        467.19971        - 466 TiB  102 TiB  101 TiB 320 MiB  328 GiB 364 TiB 21.83 1.01   -            host spbstdv1a-sata       
  0   hdd   14.59999  1.00000  15 TiB  3.1 TiB  3.0 TiB 9.5 MiB  9.7 GiB  12 TiB 20.98 0.97  51     up         osd.0                 
  1   hdd   14.59999  1.00000  15 TiB  2.4 TiB  2.4 TiB 7.4 MiB  7.7 GiB  12 TiB 16.34 0.76  50     up         osd.1                 
  2   hdd   14.59999  1.00000  15 TiB  3.5 TiB  3.5 TiB  11 MiB   11 GiB  11 TiB 24.33 1.13  51     up         osd.2                 
  3   hdd   14.59999  1.00000  15 TiB  2.9 TiB  2.8 TiB 9.3 MiB  9.1 GiB  12 TiB 19.58 0.91  48     up         osd.3                 
  4   hdd   14.59999  1.00000  15 TiB  3.3 TiB  3.3 TiB  11 MiB   11 GiB  11 TiB 22.94 1.06  51     up         osd.4                 
  5   hdd   14.59999  1.00000  15 TiB  3.5 TiB  3.5 TiB  12 MiB   12 GiB  11 TiB 23.94 1.11  50     up         osd.5                 
  6   hdd   14.59999  1.00000  15 TiB  2.8 TiB  2.8 TiB 9.6 MiB  9.6 GiB  12 TiB 19.11 0.89  49     up         osd.6                 
  7   hdd   14.59999  1.00000  15 TiB  3.4 TiB  3.4 TiB 4.9 MiB   11 GiB  11 TiB 23.68 1.10  50     up         osd.7                 
  8   hdd   14.59998  1.00000  15 TiB  3.2 TiB  3.2 TiB  10 MiB   10 GiB  11 TiB 22.18 1.03  51     up         osd.8                 
  9   hdd   14.59999  1.00000  15 TiB  3.4 TiB  3.4 TiB 4.9 MiB   11 GiB  11 TiB 23.52 1.09  50     up         osd.9                 
 10   hdd   14.59999  1.00000  15 TiB  2.7 TiB  2.6 TiB 8.5 MiB  8.5 GiB  12 TiB 18.25 0.85  50     up         osd.10                
 11   hdd   14.59999  1.00000  15 TiB  3.4 TiB  3.3 TiB  10 MiB   11 GiB  11 TiB 23.02 1.07  51     up         osd.11                
 12   hdd   14.59999  1.00000  15 TiB  2.8 TiB  2.8 TiB  10 MiB  9.7 GiB  12 TiB 19.53 0.91  49     up         osd.12                
 13   hdd   14.59999  1.00000  15 TiB  3.7 TiB  3.7 TiB  11 MiB   12 GiB  11 TiB 25.62 1.19  49     up         osd.13                
 14   hdd   14.59999  1.00000  15 TiB  2.6 TiB  2.6 TiB 8.2 MiB  8.3 GiB  12 TiB 17.65 0.82  53     up         osd.14                
 15   hdd   14.59999  1.00000  15 TiB  2.5 TiB  2.5 TiB 7.6 MiB  7.8 GiB  12 TiB 17.42 0.81  50     up         osd.15                
 16   hdd   14.59999  1.00000  15 TiB  3.5 TiB  3.5 TiB  11 MiB   11 GiB  11 TiB 24.37 1.13  50     up         osd.16                
 17   hdd   14.59999  1.00000  15 TiB  3.5 TiB  3.5 TiB  12 MiB   12 GiB  11 TiB 24.09 1.12  52     up         osd.17                
 18   hdd   14.59999  1.00000  15 TiB  2.4 TiB  2.4 TiB 6.9 MiB  7.5 GiB  12 TiB 16.79 0.78  49     up         osd.18                
 19   hdd   14.59999  1.00000  15 TiB  3.3 TiB  3.3 TiB 9.9 MiB   10 GiB  11 TiB 22.91 1.06  50     up         osd.19                
 20   hdd   14.59999  1.00000  15 TiB  3.6 TiB  3.6 TiB  12 MiB   12 GiB  11 TiB 25.02 1.16  49     up         osd.20                
 21   hdd   14.59999  1.00000  15 TiB  3.4 TiB  3.4 TiB  14 MiB   12 GiB  11 TiB 23.45 1.09  51     up         osd.21                
 22   hdd   14.59999  1.00000  15 TiB  3.3 TiB  3.3 TiB  12 MiB   11 GiB  11 TiB 22.64 1.05  51     up         osd.22                
 23   hdd   14.59999  1.00000  15 TiB  2.9 TiB  2.8 TiB 9.2 MiB  9.3 GiB  12 TiB 19.59 0.91  51     up         osd.23                
 24   hdd   14.59999  1.00000  15 TiB  3.4 TiB  3.3 TiB  12 MiB   11 GiB  11 TiB 23.04 1.07  50     up         osd.24                
 25   hdd   14.59999  1.00000  15 TiB  3.1 TiB  3.1 TiB  10 MiB  9.9 GiB  11 TiB 21.61 1.00  50     up         osd.25                
162   hdd   14.59999  1.00000  15 TiB  3.2 TiB  3.2 TiB  10 MiB   10 GiB  11 TiB 21.76 1.01  50     up         osd.162               
163   hdd   14.59999  1.00000  15 TiB  3.4 TiB  3.4 TiB  11 MiB   11 GiB  11 TiB 23.60 1.09  50     up         osd.163               
164   hdd   14.59999  1.00000  15 TiB  3.5 TiB  3.5 TiB  12 MiB   11 GiB  11 TiB 24.38 1.13  51     up         osd.164               
165   hdd   14.59999  1.00000  15 TiB  2.9 TiB  2.9 TiB 9.1 MiB  9.5 GiB  12 TiB 20.18 0.94  50     up         osd.165               
166   hdd   14.59999  1.00000  15 TiB  3.3 TiB  3.3 TiB  11 MiB   11 GiB  11 TiB 22.62 1.05  50     up         osd.166               
167   hdd   14.59999  1.00000  15 TiB  3.5 TiB  3.5 TiB  12 MiB   12 GiB  11 TiB 24.36 1.13  52     up         osd.167               

Most of the OSD settings are at their defaults: cache autotune enabled, osd_memory_target 4 GB, etc.
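
(these can be double-checked with something along these lines; osd.0 is just an example:)

  ceph config get osd.0 osd_memory_target
  ceph daemon osd.0 config show | grep -E 'osd_memory_target|bluestore_cache_autotune'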

There is absolutely no activity on this (or any related) pool; only one replicated
pool, on different drives, sees about 30 MB/s of writes. All boxes are almost idle
and have enough RAM. Unfortunately the OSDs do not use any fast storage for the WAL or DB.
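
(for what it's worth, the pool stats can confirm this; the pool name below is a
placeholder:)

  ceph osd pool stats <ecpool>    # no client io on the EC pool
  ceph osd pool stats             # only the replicated pool shows ~30 MB/s wr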

Has anyone run into a similar problem? Or does somebody have a hint on how to debug what the OSDs are reading all the time?

I'd be very grateful

with best regards

nikola ciprich


-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------


