Re: How to recover from corrupted RocksDb

On 11/29/18 10:28 AM, Mario Giammarco wrote:
> Hello,
> I have a Ceph installation in a Proxmox cluster.
> Due to a temporary hardware glitch, I now get this error on OSD startup:
> 
>     -6> 2018-11-26 18:02:33.179327 7fa1d784be00  0 osd.0 1033 crush map
>     has features 1009089991638532096, adjusting msgr requires for osds 
>        -5> 2018-11-26 18:02:34.143084 7fa1c33f9700  3 rocksdb:
>     [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1591]
>     Compaction error: Corruption: block checksum mismatch 
>     -4> 2018-11-26 18:02:34.143123 7fa1c33f9700 4 rocksdb: (Original Log
>     Time 2018/11/26-18:02:34.143021)
>     [/build/ceph-12.2.9/src/rocksdb/db/compaction_job.cc:621] [default]
>     compacted to: base level 1 max bytes base268435456 files[17$ 
>                         
>     -3> 2018-11-26 18:02:34.143126 7fa1c33f9700 4 rocksdb: (Original Log
>     Time 2018/11/26-18:02:34.143068) EVENT_LOG_v1 {"time_micros":
>     1543251754143044, "job": 3, "event": "compaction_finished",
>     "compaction_time_micros": 1997048, "out$ 
>        -2> 2018-11-26 18:02:34.143152 7fa1c33f9700  2 rocksdb:
>     [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1275]
>     Waiting after background compaction error: Corruption: block
>     checksum mismatch, Accumulated background err$ 
>        -1> 2018-11-26 18:02:34.674171 7fa1c4bfc700 -1 rocksdb:
>     submit_transaction error: Corruption: block checksum mismatch code =
>     2 Rocksdb transaction: 
>     Delete( Prefix = O key =
>     0x7f7ffffffffffffffb64000000217363'rub_3.26!='0xfffffffffffffffeffffffffffffffff'o') 
>     Put( Prefix = S key = 'nid_max' Value size = 8) 
>     Put( Prefix = S key = 'blobid_max' Value size = 8) 
>         0> 2018-11-26 18:02:34.675641 7fa1c4bfc700 -1
>     /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function 'void
>     BlueStore::_kv_sync_thread()' thread 7fa1c4bfc700 time 2018-11-26
>     18:02:34.674193 
>     /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 8717: FAILED
>     assert(r == 0) 
>                       
>     ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217)
>     luminous (stable) 
>     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>     const*)+0x102) [0x55ec83876092] 
>     2: (BlueStore::_kv_sync_thread()+0x24b5) [0x55ec836ffb55] 
>     3: (BlueStore::KVSyncThread::entry()+0xd) [0x55ec8374040d] 
>     4: (()+0x7494) [0x7fa1d5027494] 
>     5: (clone()+0x3f) [0x7fa1d4098acf]
> 
> 
> I have tried to recover it using ceph-bluestore-tool fsck and a deep
> repair, but it says everything is OK.
> I see that the RocksDB ldb tool needs .db files to work on, not a
> partition, so I cannot use it.
> I do not understand why I cannot start the OSD if ceph-bluestore-tool
> tells me I have lost no data.
> Can you help me?

Why would you try to recover an individual OSD? If all your Placement
Groups are active(+clean), just wipe the OSD and re-deploy it.
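A rough sketch of that wipe/re-deploy, assuming the broken OSD is osd.0
backed by /dev/sdb (both are assumptions, adjust to your host; Proxmox
also ships its own wrappers around these steps):

    systemctl stop ceph-osd@0                  # stop the broken OSD
    ceph osd out 0                             # mark it out of the cluster
    ceph osd purge 0 --yes-i-really-mean-it    # remove it from the CRUSH map, auth and osdmap
    ceph-volume lvm zap /dev/sdb               # wipe the old BlueStore data
    ceph-volume lvm create --data /dev/sdb     # re-create the OSD from scratch

Only do this once you have verified the data is fully available on the
other OSDs.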

What's the status of your PGs?
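For example, any of these will show it:

    ceph -s              # overall cluster health and PG summary
    ceph pg stat         # one-line PG state summary
    ceph health detail   # details on degraded/inactive PGs, if any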

The log shows a RocksDB block checksum error (probably caused by the
hardware glitch), so the OSD refuses to start.
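For reference, the fsck/repair you describe is run against the stopped
OSD's data directory, something like the following (the osd.0 path is an
assumption; the deep variant additionally reads all object data and
verifies its checksums):

    systemctl stop ceph-osd@0
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0     # add the deep option for a full data scan
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0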

Don't try to outsmart Ceph; let backfill/recovery handle this. Trying to
fix this manually will only make things worse.

Wido

> Thanks,
> Mario
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



