Re: BlueFS spillover detected - 14.2.1

Hi Brett,

this issue existed long before the upgrade to 14.2.1. The upgrade just made it visible by introducing the corresponding alert.

You can turn the alert off by setting bluestore_warn_on_bluefs_spillover=false.
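
As a sketch of how that could be applied, assuming the cluster uses the centralized config database available in Nautilus (the `osd` target and the `osd.123` ID are illustrative choices, not something taken from your cluster):

```shell
# Silence the BLUEFS_SPILLOVER health warning for all OSDs.
# Assumption: settings are managed through the centralized config store.
ceph config set osd bluestore_warn_on_bluefs_spillover false

# Or silence it for a single OSD only (osd.123 is a placeholder ID).
ceph config set osd.123 bluestore_warn_on_bluefs_spillover false
```

Note that this only hides the warning; it does not move any data back to the fast device.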

But in general this warning indicates an inefficient DB data layout: some DB data is kept on the slow device, which might have a negative performance impact.

Unfortunately, that's a known issue with the current RocksDB/BlueStore interaction: spillover to the slow device can take place even when there is plenty of free space on the fast one.


Thanks,

Igor



On 6/18/2019 8:46 PM, Brett Chancellor wrote:
Does anybody have a fix for BlueFS spillover detected? This started happening 2 days after an upgrade to 14.2.1 and has increased from 3 OSDs to 118 in the last 4 days.  I read you could fix it by rebuilding the OSDs, but rebuilding the 264 OSDs on this cluster will take months of rebalancing.

$ sudo ceph health detail
HEALTH_WARN BlueFS spillover detected on 118 OSD(s)
BLUEFS_SPILLOVER BlueFS spillover detected on 118 OSD(s)
     osd.0 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.1 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.5 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.6 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.11 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.13 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.21 spilled over 102 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.22 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.23 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.24 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.25 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.26 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.27 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.30 spilled over 65 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.32 spilled over 21 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.34 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.42 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.45 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.46 spilled over 24 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.47 spilled over 63 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.48 spilled over 63 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.49 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.50 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.52 spilled over 140 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.53 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.54 spilled over 59 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.55 spilled over 134 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.56 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.57 spilled over 61 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.58 spilled over 66 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.59 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.61 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.62 spilled over 59 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.65 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.67 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.69 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.71 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.73 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.74 spilled over 17 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.75 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.76 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.78 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.80 spilled over 100 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.81 spilled over 63 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.82 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.83 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.84 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.85 spilled over 19 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.87 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.89 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.93 spilled over 102 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.95 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.98 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.101 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.103 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.108 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.110 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.112 spilled over 65 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.113 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.115 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.117 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.118 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.119 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.120 spilled over 65 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.121 spilled over 101 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.122 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.126 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.127 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.128 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.129 spilled over 67 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.132 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.133 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.137 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.138 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.139 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.142 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.143 spilled over 27 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.144 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.147 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.148 spilled over 96 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.157 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.158 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.160 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.161 spilled over 61 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.167 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.177 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.180 spilled over 140 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.185 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.189 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.190 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.192 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.193 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.202 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.207 spilled over 27 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.216 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.219 spilled over 59 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.220 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.221 spilled over 176 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.223 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.225 spilled over 22 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.226 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.228 spilled over 59 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.236 spilled over 65 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.237 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.238 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.239 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.240 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.241 spilled over 26 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.242 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.243 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.244 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.245 spilled over 144 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.246 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.247 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.251 spilled over 106 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.252 spilled over 105 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.261 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.262 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
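
A listing this long is easier to read as a total. The following is a minimal sketch, assuming every per-OSD line has exactly the format shown above and reports the spilled amount in GiB; `summarize_spillover` is a hypothetical helper name, not a Ceph command.

```shell
# Count affected OSDs and sum the spilled metadata from `ceph health detail`.
# Assumption: lines look like "osd.N spilled over X GiB metadata ...";
# lines that report another unit are skipped.
summarize_spillover() {
  awk '/spilled over/ && $5 == "GiB" { total += $4; n++ }
       END { printf "%d OSDs, %d GiB spilled\n", n, total }'
}
```

Usage: `sudo ceph health detail | summarize_spillover`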


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com