Re: bluefs enospc

Hi Derek,

First of all, a short BlueStore design overview to make sure we're on the same page.

BlueFS doesn't hold all of the BlueStore data, just the RocksDB part of it. In your case BlueFS shares the same device with BlueStore user data.

A space rebalance procedure runs periodically to make sure BlueFS has enough space to hold the DB data.

Hence there is a primary BlueStore space allocator, which tracks the space of the whole volume, and a separate BlueFS allocator, which is gifted some space by the primary allocator depending on BlueFS's needs.
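
To make the relationship between the two allocators concrete, here is a toy sketch in Python. It is not Ceph code: the class names, the `write` method and the sizes are all made up for illustration, and the real rebalance logic is considerably more involved. It only shows the idea that BlueFS can grow solely by taking space from the primary allocator, so once user data has consumed the volume, BlueFS hits ENOSPC as well.

```python
import errno


class PrimaryAllocator:
    """Toy stand-in for the primary BlueStore allocator (hypothetical)."""

    def __init__(self, device_size):
        self.free = device_size

    def allocate(self, want):
        # Grant the request only if the volume still has that much free space.
        if want > self.free:
            return 0
        self.free -= want
        return want


class BlueFSAllocator:
    """Toy stand-in for the BlueFS allocator (hypothetical)."""

    def __init__(self, primary):
        self.primary = primary
        self.owned = 0  # space gifted by the primary allocator so far
        self.used = 0   # space actually occupied by DB files

    def write(self, length):
        shortfall = self.used + length - self.owned
        if shortfall > 0:
            # Not enough owned space: ask the primary allocator for more.
            got = self.primary.allocate(shortfall)
            if got == 0:
                raise OSError(errno.ENOSPC, "No space left on device")
            self.owned += got
        self.used += length


if __name__ == "__main__":
    primary = PrimaryAllocator(device_size=100)
    primary.allocate(90)            # user data takes most of the volume
    bluefs = BlueFSAllocator(primary)
    bluefs.write(8)                 # fits: the primary gifts 8 units
    try:
        bluefs.write(5)             # only 2 units are left on the volume
    except OSError as exc:
        print("BlueFS write failed:", exc)
```

In this model both failure modes described in the observations come from the same condition: the primary allocator has nothing left to hand out.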


Some observations on your case:

1) bluefs-bdev-sizes reports the total device size and the usage of the BlueFS (!) part of it, i.e. the 22 GiB are for BlueFS only. It provides no insight into overall space usage.
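
As a quick sanity check of those numbers, the hex values from the bluefs-bdev-sizes output quoted further down can be converted to GiB. The three constants are copied from that output; the variable names and the `gib` helper are just illustrative.

```python
GIB = 1 << 30

# Values copied from the bluefs-bdev-sizes output for osd.681.
device_size = 0x1a80000000  # "device size": the whole block device
bluefs_own = 0x582550000    # "own": space currently owned by BlueFS
bluefs_used = 0x56d090000   # "using": space BlueFS actually occupies


def gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


print(f"device size: {gib(device_size):.2f} GiB")
print(f"BlueFS own:  {gib(bluefs_own):.2f} GiB")
print(f"BlueFS used: {gib(bluefs_used):.2f} GiB")
```

This gives 106.00 GiB for the device, 22.04 GiB owned by BlueFS and 21.70 GiB in use, so the "22 GiB" figure describes only the BlueFS share of the 106 GiB volume.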

2) It looks like the BlueStore allocator is complaining about a lack of free space, which means BlueFS plus user data have taken all the space. See:

-14> 2020-03-15 14:43:47.572 7f32925dd700 -1 bluestore(/var/lib/ceph/osd/ceph-681) _do_alloc_write failed to allocate 0x400000 allocated 0x 3ac000 min_alloc_size 0x4000 available 0x 0
-13> 2020-03-15 14:43:47.572 7f32925dd700 -1 bluestore(/var/lib/ceph/osd/ceph-681) _do_write _do_alloc_write failed with (28) No space left on device
-12> 2020-03-15 14:43:47.572 7f32925dd700 -1 bluestore(/var/lib/ceph/osd/ceph-681) _txc_add_transaction error (28) No space left on device not handled on operation 10 (op 4, counting from 0)
-11> 2020-03-15 14:43:47.572 7f32925dd700 -1 bluestore(/var/lib/ceph/osd/ceph-681) ENOSPC from bluestore, misconfigured cluster
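
The hex fields in the first of those lines can be decoded with a few lines of Python. This is only a reading aid: the regular expression is an ad hoc assumption about this one message layout, not a general Ceph log parser, and treating "allocated" as the amount obtained before the allocator ran dry is an interpretation of the message.

```python
import re

# Message text copied from the "-14>" log line for osd.681.
line = ("_do_alloc_write failed to allocate 0x400000 allocated 0x 3ac000 "
        "min_alloc_size 0x4000 available 0x 0")

# The log pads some hex values with a space after "0x", so allow whitespace.
pattern = re.compile(
    r"(failed to allocate|allocated|min_alloc_size|available) 0x\s*([0-9a-f]+)"
)
fields = {name: int(value, 16) for name, value in pattern.findall(line)}

wanted = fields["failed to allocate"]
got = fields["allocated"]

for name, value in fields.items():
    print(f"{name:>18}: {value:#x} = {value / 1024:g} KiB")
print(f"{'shortfall':>18}: {wanted - got:#x} = {(wanted - got) / 1024:g} KiB")
```

Read this way, the write wanted 4 MiB (0x400000), only 0x3ac000 could be found, and the allocator reports nothing available afterwards.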

3) repair suffers from a lack of space in both allocators. The BlueFS allocator tries to acquire additional space from the primary allocator, which fails to provide it:

2020-03-15 23:55:14.816 7f0d3fac2c00 -1 bluestore(/var/lib/ceph/osd/ceph-709) allocate_bluefs_freespace failed to allocate on 0xb000000 min_size 0xb000000 > allocated total 0x80000 bluefs_shared_alloc_size 0x10000 allocated 0x80000 available 0x 8000
2020-03-15 23:55:14.816 7f0d3fac2c00 -1 bluefs _allocate failed to expand slow device to fit +0xaffa895
2020-03-15 23:55:14.816 7f0d3fac2c00 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0xaffa895

OSD.709 has already been expanded, right?

What's the error reported by fsck?
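
The numbers in the OSD.709 repair log under 3) fit together arithmetically, which the short sketch below checks. The values are copied from the three log lines; that the 0xb000000 request is the flush length rounded up to bluefs_shared_alloc_size is an inference from the arithmetic, not something the log states.

```python
MIB = 1 << 20
KIB = 1 << 10


def round_up(value, unit):
    """Round value up to the next multiple of unit."""
    return (value + unit - 1) // unit * unit


flush_length = 0xaffa895  # "length" in the _flush_range line
alloc_unit = 0x10000      # bluefs_shared_alloc_size
requested = 0xb000000     # min_size in the allocate_bluefs_freespace line
available = 0x8000        # "available" in the same line

print("rounded flush length:", hex(round_up(flush_length, alloc_unit)))
print(f"requested: {requested / MIB:g} MiB")
print(f"available: {available / KIB:g} KiB")
```

So BlueFS needed about 176 MiB in one go to flush its data, while the primary allocator reported only 32 KiB available.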


4) OSD.681 has a number of checksum verification errors when reading DB data:

2020-03-15 14:03:52.890 7f6311ffa700  3 rocksdb: [table/block_based_table_reader.cc:1117] Encountered error while reading data from compression dictionary block Corruption: block checksum mismatch: expected 0, got 2324967102  in db/012948.sst offset 18446744073709551615 size 18446744073709551615

I can't say whether this is related to the space shortage or not. Have other OSDs reported (or are they still reporting) something similar?
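
As a side note on the RocksDB message under 4): the offset and size it prints are both exactly the largest unsigned 64-bit value, which the check below confirms. Such a value often shows up when a field was never filled in or holds a -1 sentinel, but whether that is what happened here is only a guess.

```python
UINT64_MAX = 2**64 - 1

# Values copied from the "block checksum mismatch" message for db/012948.sst.
offset = 18446744073709551615
size = 18446744073709551615

print("offset is UINT64_MAX:", offset == UINT64_MAX)
print("size is UINT64_MAX:  ", size == UINT64_MAX)
print("as hex:", hex(offset))
```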


Thanks,

Igor


On 3/16/2020 7:15 AM, Derek Yarnell wrote:
Hi,

We have a production cluster that just suffered an issue with multiple
of our NVMe OSDs.  More than 12 of them, spread over 4 nodes, died with
an 'ENOSPC from bluestore, misconfigured cluster' error, i.e. they no
longer had space.  These are all simple one-device bluestore OSDs.

ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus
(stable)

This is an example[0] of one of the logs.  In this case each of the 8
NVMe OSDs on a node has 106GB of space allocated to it.  The
ceph-bluestore-tool bluefs-bdev-sizes output only lists 22GiB for
osd 681.  I extended the space of bluestore on a few of the OSDs via LVM
and then the bluefs-bdev-expand command.  This worked for a few and not
for others.

Some of the ones that it did work for recovered for a bit and then
re-entered the error state.  Trying to extend the allocation didn't work
after that.  When they failed again I ran fsck, which reported that it
found 1 error, and then on running repair I got a rather long stack
trace[1].

# ceph-bluestore-tool --log-level 30 --command bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-681
inferring bluefs devices from bluestore path
  slot 1 /var/lib/ceph/osd/ceph-681/block -> /dev/dm-33
1 : device size 0x1a80000000 : own
0x[2480000~10000,24a0000~10000,2520000~60000,25f0000~c0000,2720000~50000,28a0000~110000,2a20000~230000,2cc0000~260000,2f30000~220000,31c0000~6b0000,38a0000~10000,3990000~3e0000,3d80000~530000,42d0000~590000,48d0000~400000,4d00000~7d0000,54f0000~c50000,6150000~10000,6190000~150000,6350000~c0000,6480000~160000,6640000~1e0000,6870000~c0000,6a00000~30000,6a40000~240000,6dd0000~310000,7210000~b0000,73a0000~b0000,76a0000~180000,7830000~80000,78e0000~240000,7b70000~90000,7c50000~b0000,7ef0000~140000,8040000~30000,8180000~250000,8440000~50000,84b0000~110000,8610000~c0000,9e20000~20000,9e50000~b0000,9f10000~60000,9f80000~30000,dd80000~180000,df70000~6a0000,e620000~5ae0000,15200000~3510000,187f0000~bf0000,19490000~1070000,1ab70000~4c0000,1b400000~7d0000,1bbe0000~c20000,1cd10000~340000,1d3a0000~860000,1dd00000~2e00000,20c00000~3f00000,24d00000~700000,25600000~700000,26100000~200000,26400000~300000,26b00000~600000,27400000~400000,27ba0000~6e0000,28500000~1d00000,2a400000~700000,2ac00000~100000,2b100000~300000,2b470000~120000,2b700000~500000,2c000000~200000,2c400000~400000,2ca00000~100000,2cf00000~300000,2d340000~39b0000,30d00000~1f00000,32e00000~4bf0000,380a0000~3c0000,38500000~c0000,38bd0000~400000,390b0000~340000,39400000~100000,39900000~1000000,3ac00000~5d00000,40b90000~400000,41280000~db50000,4ee00000~700000,4f900000~4500000,54390000~100000,54e00000~18400000,6d800000~20d0000,6f8f0000~1a10000,71400000~4500000,76100000~300000,766e0000~6860000,7dd00000~c00000,7eac0000~a0000,7ef90000~f190000,8e1f0000~80000,8e410000~60000,8e480000~20000,8e4b0000~20000,8e5c0000~50000,8e7e0000~50000,8f160000~60000,8f240000~a0000,90000000~15e90000,a6200000~c3a0000,b25d0000~630000,b3000000~c00000,b3ee0000~90000,b4200000~d00000,b5a70000~160000,b63f0000~2a0000,b6720000~2820000,bab00000~400000,bbf60000~10ad0000,ccb90000~2300000,cf000000~2b00000,d1ca0000~10000,d1e00000~1400000,d3230000~1df0000,d5200000~1a00000,d6d00000~800000,d75e0000~6f0000,d7f00000~d00000,d9100000~400000,d9900000~d00000,da800000~600000,daf10000~400000,db700000~1600000,dd280000~20000,dd670000~390000,dda30000~400000,de190000~70000,de2a0000~370000,de660000~20000,de700000~14770000,f3600000~700000,f3db0000~960000,f49e0000~5b00000,fa600000~c00000,fb300000~510000,fbb00000~100000,fbeb0000~450000,fc400000~2b0000,fd400000~400000,fde00000~c00000,ff0b0000~50000,ff200000~800000,ffd60000~10000,fff00000~a0000,100200000~300000,101600000~100000,101750000~300000,102120000~1e0000,1027f0000~a00000,103600000~330000,103b00000~200000,103e60000~4a0000,104310000~c00000,105030000~1200000,106800000~100000,106b20000~400000,107000000~300000,1073e0000~400000,107950000~86b0000,110140000~d0000,110350000~2e0000,110e20000~20000,110eb0000~a0000,110f60000~60000,110fd0000~1f0000,1112a0000~f0000,111420000~30000,1115b0000~30000,111620000~150000,111790000~40000,112560000~180000,112730000~180000,1129b0000~50000,112f90000~4a0000,113840000~c0000,113ea0000~40000,113fb0000~130000,114100000~310000,114470000~10000,114620000~120000,114810000~120000,114a00000~20000,114a90000~f0000,114c60000~e0000,114e80000~20000,114f70000~140000,1150c0000~50000,1151f0000~320000,1155f0000~10000,115670000~226f0000,137e30000~800000,138b50000~400000,139400000~1500000,13ae00000~4500000,13f480000~400000,13f950000~1b6e0000,15b700000~400000,15c000000~300000,15c600000~700000,15d9e0000~1820000,15f400000~c00000,160400000~d00000,1613a0000~630000,1619e0000~d20000,162800000~1b00000,164600000~7550000,170d00000~1800000,172580000~100000,172d70000~1c190000,18f100000~40e0000,193700000~400000,193c90000~6970000,19aa90000~188d0000,1b3770000~c220000,1bff00000~1200000,1c12f0000~400000,1c2c00000~400000,1c3f80000~400000,1c5300000~4200000,1c95c0000~b9a40000,283800000~1800000,289000000~3800000,28d000000~6000000,293800000~2000000,296000000~e000000,2a4800000~465d0000,4b6100000~4c560000,502800000~32500000,cb8500000~10f600000,1139800000~7a000000,1668d30000~3d290000,1794000000~3e800000,191c000000~3ee00000,1a49800000~4110000,1a4f800000~40f0000,1a70800000~4100000,1a76000000~3c10000]
= 0x582550000 : using 0x56d090000(22 GiB)

Any help here would be appreciated.  I have stopped our CephFS file
system, but our radosgw is also impacted.

[0] - ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.681.log
[1] - ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.709.repair

Thanks,
derek
