Hi Eneko,
I don't think this is a memory H/W issue. This reminds me of the
following thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/DEOBAUXQBUFL6HNBBNJ3LMQUCQC76HLY/
There was apparently data corruption in RocksDB which popped up only
during DB compaction. There were no issues during regular access, but
sometimes the internal auto-compaction procedure triggered the crash.
In the end they disabled auto-compaction, took the data out of that
OSD, and redeployed it. You might want to try the same approach if you
need the data from this OSD, or simply redeploy it.
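Something along these lines might work (just a sketch, not commands to
copy verbatim: the value you set for bluestore_rocksdb_options must
keep the full existing option string, and /dev/sdX is a placeholder
for the real device):

# disable RocksDB auto-compaction for osd.15 (append to the existing
# options, don't replace them), then restart the OSD:
ceph config get osd.15 bluestore_rocksdb_options
ceph config set osd.15 bluestore_rocksdb_options "<existing options>,disable_auto_compactions=true"
systemctl restart ceph-osd@15

# drain the OSD, then redeploy it once it is safe to remove:
ceph osd out 15
while ! ceph osd safe-to-destroy osd.15; do sleep 60; done
systemctl stop ceph-osd@15
ceph osd purge 15 --yes-i-really-mean-it
ceph-volume lvm zap /dev/sdX --destroy
ceph-volume lvm create --data /dev/sdX   # or use the pveceph equivalent on Proxmox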
Thanks,
Igor
On 12/2/2021 2:21 PM, Eneko Lacunza wrote:
Hi all,
Since we upgraded our tiny 4-node, 15-OSD cluster from Nautilus to
Pacific, we have been seeing issues with osd.15, which periodically
crashes with:
-10> 2021-12-02T11:52:50.716+0100 7f27071bc700 10 monclient:
_check_auth_rotating have uptodate secrets (they expire after
2021-12-02T11:52:20.721345+0100)
-9> 2021-12-02T11:52:51.548+0100 7f2708efd700 5 prioritycache
tune_memory target: 4294967296 mapped: 4041244672 unmapped: 479731712
heap: 4520976384 old mem: 2845415818 new mem: 2845415818
-8> 2021-12-02T11:52:51.696+0100 7f270b702700 3 rocksdb:
[db_impl/db_impl_compaction_flush.cc:2807] Compaction error:
Corruption: block checksum mismatch: expected 3428654824, got
1987789945 in db/511261.sst offset 7219720 size 4044
-7> 2021-12-02T11:52:51.696+0100 7f270b702700 4 rocksdb:
(Original Log Time 2021/12/02-11:52:51.701026)
[compaction/compaction_job.cc:743] [default] compacted to: files[4 1
21 0 0 0 0] max score 0.46, MB/sec: 83.7 rd, 0.0 wr, level 1, files
in(4, 1) out(1) MB in(44.2, 35.2) out(0.0), read-write-amplify(1.8)
write-amplify(0.0) Corruption: block checksum mismatch: expected
3428654824, got 1987789945 in db/511261.sst offset 7219720 size 4044,
records in: 613843, records dropped: 288377 output_compression:
NoCompression
-6> 2021-12-02T11:52:51.696+0100 7f270b702700 4 rocksdb:
(Original Log Time 2021/12/02-11:52:51.701047) EVENT_LOG_v1
{"time_micros": 1638442371701036, "job": 1640, "event":
"compaction_finished", "compactio
n_time_micros": 995261, "compaction_time_cpu_micros": 899466,
"output_level": 1, "num_output_files": 1, "total_output_size":
40875027, "num_input_records": 613843, "num_output_records": 325466,
"num_subcompactio
ns": 1, "output_compression": "NoCompression",
"num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0,
"lsm_state": [4, 1, 21, 0, 0, 0, 0]}
-5> 2021-12-02T11:52:51.696+0100 7f270b702700 2 rocksdb:
[db_impl/db_impl_compaction_flush.cc:2341] Waiting after background
compaction error: Corruption: block checksum mismatch: expected
3428654824, got 1987789945 in db/511261.sst offset 7219720 size 4044,
Accumulated
background error counts: 1
-4> 2021-12-02T11:52:51.716+0100 7f27071bc700 10 monclient: tick
-3> 2021-12-02T11:52:51.716+0100 7f27071bc700 10 monclient:
_check_auth_rotating have uptodate secrets (they expire after
2021-12-02T11:52:21.721429+0100)
-2> 2021-12-02T11:52:51.788+0100 7f27046f4700 -1 rocksdb:
submit_common error: Corruption: block checksum mismatch: expected
3428654824, got 1987789945 in db/511261.sst offset 7219720 size 4044
code = ^B Rocksdb transaction:
PutCF( prefix = m key =
0x000000000000000700000000000008'^.0000042922.00000000000048814458'
value size = 236)
PutCF( prefix = m key = 0x000000000000000700000000000008'^._fastinfo'
value size = 186)
PutCF( prefix = O key =
0x7F80000000000000074161CBB7'!rbd_data.1cf93c843df86a.000000000000021d!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F0026000078
value size = 535)
PutCF( prefix = O key =
0x7F80000000000000074161CBB7'!rbd_data.1cf93c843df86a.000000000000021d!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F
value size = 420)
PutCF( prefix = L key = 0x0000000000C33283 value size = 4135)
-1> 2021-12-02T11:52:51.800+0100 7f27046f4700 -1
./src/os/bluestore/BlueStore.cc: In function 'void
BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread
7f27046f4700 time 2021-12-02T11:52:51.793784+0100
./src/os/bluestore/BlueStore.cc: 11650: FAILED ceph_assert(r == 0)
ceph version 16.2.6 (1a6b9a05546f335eeeddb460fdc89caadf80ac7a)
pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x124) [0x55fe1a8e992e]
2: /usr/bin/ceph-osd(+0xabaab9) [0x55fe1a8e9ab9]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x5ff)
[0x55fe1aefd50f]
4: (BlueStore::_kv_sync_thread()+0x1a23) [0x55fe1af3b3d3]
5: (BlueStore::KVSyncThread::entry()+0xd) [0x55fe1af6492d]
6: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7f2716046ea7]
7: clone()
ceph version 16.2.6 (1a6b9a05546f335eeeddb460fdc89caadf80ac7a) pacific
(stable)
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f2716052140]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x16e) [0x55fe1a8e9978]
5: /usr/bin/ceph-osd(+0xabaab9) [0x55fe1a8e9ab9]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x5ff)
[0x55fe1aefd50f]
7: (BlueStore::_kv_sync_thread()+0x1a23) [0x55fe1af3b3d3]
8: (BlueStore::KVSyncThread::entry()+0xd) [0x55fe1af6492d]
9: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7f2716046ea7]
10: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
This is a Proxmox VE HC cluster.
The node has 3 other OSDs, all filestore on HDD. osd.15 is bluestore
on SSD. All nodes have one SSD/bluestore OSD and 2-3 HDD OSDs (some
filestore and some bluestore).
osd.15 restarts gracefully after the crash and continues working OK
for days or even 1-2 weeks.
We suspect some kind of (memory?) corruption or SSD malfunction on
the node; maybe other data is being corrupted too and we just don't
notice because the other OSDs are filestore.
That the problem started after the upgrade is suspicious, but it could
be a coincidence...
Is there any way I could run some kind of "fsck" on osd.15, so that I
can verify it is healthy at a given moment? Any other suggestions for
troubleshooting the issue? (Otherwise we'll be swapping RAM modules to
see if that helps...)
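(I see that ceph-bluestore-tool has an fsck command; would something
like this, run with the OSD stopped and assuming the default data
path, be the right way to check it?)

systemctl stop ceph-osd@15
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-15 --deep true   # --deep also reads and verifies object data
systemctl start ceph-osd@15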
Thanks a lot
Eneko Lacunza
Technical Director
Binovo IT Human Project
Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/
--
Igor Fedotov
Ceph Lead Developer
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx