Hi,
after upgrading the cluster from 16.2.5 to 16.2.6, several OSDs crashed
and now refuse to start due to RocksDB corruption, e.g.:
--------
2021-09-19T15:47:10.611+0200 7f8bc1f0e700 4 rocksdb:
[compaction/compaction_job.cc:1680] [default] Compaction start summary:
Base version 6 Base level 0, inputs: [251944(53MB) 251942(42MB)
251940(33MB)], [251935(66MB) 251936(66MB) 251937(4464KB) 251938(8217KB)]
2021-09-19T15:47:10.611+0200 7f8bc1f0e700 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1632059230612093, "job": 13, "event":
"compaction_started", "compaction_reason": "LevelL0FilesNum",
"files_L0": [251944, 251942, 251940], "files_L1": [251935, 251936,
251937, 251938], "score": 1.27373, "input_data_size": 287841071}
2021-09-19T15:47:13.610+0200 7f8bc1f0e700 3 rocksdb:
[db_impl/db_impl_compaction_flush.cc:2808] Compaction error: Corruption:
block checksum mismatch: expected 2427092066, got 4051549320 in
db/251935.sst offset 18414386 size 4032
2021-09-19T15:47:13.610+0200 7f8bc1f0e700 4 rocksdb: (Original Log Time
2021/09/19-15:47:13.611350) [compaction/compaction_job.cc:760] [default]
compacted to: files[3 4 31 138 0 0 0] max score 0.97, MB/sec: 96.0 rd,
0.0 wr, level 1, files in(3, 4) out(1) MB in(130.0, 144.5) out(0.0),
read-write-amplify(2.1) write-amplify(0.0) Corruption: block checksum
mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset
18414386 size 4032, records in: 1654508, records dropped: 1554257
output_compression: NoCompression
2021-09-19T15:47:13.610+0200 7f8bc1f0e700 4 rocksdb: (Original Log Time
2021/09/19-15:47:13.611381) EVENT_LOG_v1 {"time_micros":
1632059233611365, "job": 13, "event": "compaction_finished",
"compaction_time_micros": 2999230, "compaction_time_cpu_micros": 87965,
"output_level": 1, "num_output_files": 1, "total_output_size": 25072635,
"num_input_records": 1654508, "num_output_records": 100251,
"num_subcompactions": 1, "output_compression": "NoCompression",
"num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0,
"lsm_state": [3, 4, 31, 138, 0, 0, 0]}
2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb:
[db_impl/db_impl_compaction_flush.cc:2344] Waiting after background
compaction error: Corruption: block checksum mismatch: expected
2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032,
Accumulated background error counts: 1
2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common
error: Corruption: block checksum mismatch: expected 2427092066, got
4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 Rocksdb
transaction:
PutCF( prefix = O key =
0x8C7FFFFFFFFFFFFFF2EB980CC3'!temp_recovering_12.9d7s12_71578''447639_77916_head!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F000C000078
value size = 905)
PutCF( prefix = O key =
0x8C7FFFFFFFFFFFFFF2EB980CC3'!temp_recovering_12.9d7s12_71578''447639_77916_head!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F
value size = 432)
MergeCF( prefix = b key = 0x00000024A4000000 value size = 16)
MergeCF( prefix = T key = 0x000000000000000C value size = 40)
2021-09-19T15:47:13.638+0200 7f8bbacf1700 -1
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.6/rpm/el8/BUILD/ceph-16.2.6/src/os/bluestore/BlueStore.cc:
In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*,
bool)' thread 7f8bbacf1700 time 2021-09-19T15:47:13.637926+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.6/rpm/el8/BUILD/ceph-16.2.6/src/os/bluestore/BlueStore.cc:
11650: FAILED ceph_assert(r == 0)
ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific
(stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x158) [0x56045e16a54c]
2: /usr/bin/ceph-osd(+0x56a766) [0x56045e16a766]
3: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x45f)
[0x56045e79639f]
4: (BlueStore::_kv_sync_thread()+0x16dc) [0x56045e7cfa0c]
5: (BlueStore::KVSyncThread::entry()+0x11) [0x56045e7f82d1]
6: /lib64/libpthread.so.0(+0x814a) [0x7f8bd06f414a]
7: clone()
2021-09-19T15:47:13.640+0200 7f8bbacf1700 -1 *** Caught signal (Aborted) **
in thread 7f8bbacf1700 thread_name:bstore_kv_sync
ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific
(stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7f8bd06feb20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x1a9) [0x56045e16a59d]
5: /usr/bin/ceph-osd(+0x56a766) [0x56045e16a766]
6: (BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x45f)
[0x56045e79639f]
7: (BlueStore::_kv_sync_thread()+0x16dc) [0x56045e7cfa0c]
8: (BlueStore::KVSyncThread::entry()+0x11) [0x56045e7f82d1]
9: /lib64/libpthread.so.0(+0x814a) [0x7f8bd06f414a]
10: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
---------
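To compare the affected SST files and offsets across several OSD logs, the checksum-mismatch messages can be pulled out with a short script. This is only a minimal sketch, assuming the RocksDB message format seen in the log above; the function name find_checksum_mismatches and the SAMPLE_LOG text are illustrative, not part of any Ceph tool.

```python
import re
from collections import Counter

# Pattern for the RocksDB message seen in the log above. \s+ is used between
# tokens so that the match survives the line wrapping added by mail clients.
MISMATCH_RE = re.compile(
    r"block\s+checksum\s+mismatch:\s+expected\s+(\d+),\s+got\s+(\d+)\s+in\s+"
    r"(\S+\.sst)\s+offset\s+(\d+)\s+size\s+(\d+)"
)


def find_checksum_mismatches(log_text):
    """Count how often each (sst, offset, size, expected, got) tuple is reported."""
    hits = Counter()
    for match in MISMATCH_RE.finditer(log_text):
        expected, got, sst, offset, size = match.groups()
        hits[(sst, int(offset), int(size), int(expected), int(got))] += 1
    return hits


# Two excerpts copied from the log above, wrapped the same way.
SAMPLE_LOG = """\
[db_impl/db_impl_compaction_flush.cc:2808] Compaction error: Corruption:
block checksum mismatch: expected 2427092066, got 4051549320 in
db/251935.sst offset 18414386 size 4032
compaction error: Corruption: block checksum mismatch: expected
2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032,
Accumulated background error counts: 1
"""

if __name__ == "__main__":
    for key, count in find_checksum_mismatches(SAMPLE_LOG).items():
        sst, offset, size, expected, got = key
        print(f"{sst} offset={offset} size={size} "
              f"expected={expected} got={got} seen={count}x")
```

Running this over each crashed OSD's log shows whether the corruption is confined to a single block per OSD or spread over several files.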
I have attached part of the OSD log.
7 out of ~1600 OSDs have this issue. BlueStore fsck or repair does not
help; it actually crashes on most of them. The cluster had been stable for
6 weeks before the upgrade, with no daemon crashes.
Any hints? I previously upgraded a smaller cluster with 350 OSDs and none
had issues.
Here is the config, just in case. I disabled bluefs_buffered_io after a few
crashes appeared.
# ceph config dump
WHO     MASK  LEVEL     OPTION                                 VALUE                                           RO
global        advanced  objecter_inflight_op_bytes             1073741824
global        advanced  osd_pool_default_pg_autoscale_mode     off
mon           advanced  auth_allow_insecure_global_id_reclaim  false
mon           advanced  mon_allow_pool_delete                  true
mon           advanced  mon_max_pg_per_osd                     1000
mgr           advanced  mgr/prometheus/rbd_stats_pools         rbd,rbd_data,proxmox,rbd_fastdata,proxmox_fast  *
mgr           advanced  osd_deep_scrub_interval                1209600.000000
mgr           basic     target_max_misplaced_ratio             0.800000
osd           advanced  bluefs_buffered_io                     false
osd           advanced  objecter_inflight_ops                  10240
osd           advanced  osd_deep_scrub_interval                1209600.000000
osd           advanced  osd_max_backfills                      32
osd           advanced  osd_max_pg_per_osd_hard_ratio          20.000000
osd           advanced  osd_max_scrubs                         8
osd           advanced  osd_op_num_threads_per_shard_ssd       4                                               *
osd           advanced  osd_op_thread_timeout                  90
osd           advanced  osd_recovery_max_active                10
osd           advanced  osd_recovery_op_priority               63
osd           advanced  osd_recovery_sleep_hdd                 0.000000
osd           advanced  osd_scrub_auto_repair                  true
mds           basic     mds_cache_memory_limit                 17179869184
mds           advanced  mds_cache_trim_threshold               262144
mds           advanced  mds_recall_global_max_decay_threshold  131072
mds           advanced  mds_recall_max_caps                    30000
mds           advanced  mds_recall_max_decay_rate              1.500000
mds           advanced  mds_recall_max_decay_threshold         131072
mds           advanced  mds_recall_warning_threshold           262144
client        advanced  client_force_lazyio                    true
Best regards,
Andrej
--
_____________________________________________________________
prof. dr. Andrej Filipcic, E-mail: Andrej.Filipcic@xxxxxx
Department of Experimental High Energy Physics - F9
Jozef Stefan Institute, Jamova 39, P.o.Box 3000
SI-1001 Ljubljana, Slovenia
Tel.: +386-1-477-3674 Fax: +386-1-425-7074
-------------------------------------------------------------