Re: Rocksdb: Corruption: missing start of fragmented record(1)

Hi Frank,

That's true from a performance perspective; however, it is not unsafe
to leave the cache enabled -- Ceph uses fsync appropriately to make
the writes durable.
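
As a minimal illustration (not Ceph code, and /mnt/test/file is just a
placeholder path): a write followed by fsync is durable even with the
volatile write cache enabled, because the kernel issues a cache flush to
the device as part of fsync, e.g.

    # dd calls fsync on the output file before exiting
    dd if=/dev/zero of=/mnt/test/file bs=4k count=1 conv=fsync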

This issue instead looks to be related to a concurrent hardware failure.

Cheers, Dan

On Mon, Nov 29, 2021 at 9:21 AM Frank Schilder <frans@xxxxxx> wrote:
>
> This may sound counter-intuitive, but you need to disable the write cache in order to use the PLP-protected cache only. SSDs with PLP usually have two types of cache, volatile and non-volatile. The volatile cache will lose data on power loss, and it is the volatile cache that gets disabled when you issue the hdparm/sdparm/smartctl command to switch it off. In many cases this can increase the non-volatile cache and also improve performance.
>
> It is the non-volatile cache you want your writes to go to directly.
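>
> For example (a sketch only; the exact syntax depends on the tool version and on whether the drive is SATA or SAS, and /dev/sdX is a placeholder):
>
>     # SATA: disable the volatile write cache with hdparm
>     hdparm -W 0 /dev/sdX
>     # SAS/SCSI: clear the WCE (write cache enable) mode page bit
>     sdparm --clear WCE --save /dev/sdX
>     # recent smartmontools can do the same for either drive type
>     smartctl -s wcache,off /dev/sdX
>
> Note that on some drives the setting may not survive a power cycle, so it is usually applied at boot or via a udev rule.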
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx>
> Sent: 26 November 2021 22:41:10
> To: YiteGu; ceph-users
> Subject:  Re: Rocksdb: Corruption: missing start of fragmented record(1)
>
> wal/db are on Intel S4610 960GB SSDs, with PLP, and the write-back cache is on
>
>
>
> huxiaoyu@xxxxxxxxxxxx
>
> From: YiteGu
> Date: 2021-11-26 11:32
> To: huxiaoyu@xxxxxxxxxxxx; ceph-users
> Subject: Re: Rocksdb: Corruption: missing start of fragmented record(1)
> It looks like your wal/db device lost data.
> Please check whether your wal/db device has a write-back cache; in that case a power loss can cause data loss, and the log replay fails when RocksDB restarts.
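>
> For example, to check the cache setting (with /dev/sdX as a placeholder device):
>
>     # ATA/SATA: reports write-caching = 1 when the volatile cache is on
>     hdparm -W /dev/sdX
>     # SAS/SCSI: WCE = 1 means the write-back cache is enabled
>     sdparm --get WCE /dev/sdX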
>
>
>
> YiteGu
> ess_gyt@xxxxxx
>
>
>
> ------------------ Original ------------------
> From: "huxiaoyu@xxxxxxxxxxxx" <huxiaoyu@xxxxxxxxxxxx>;
> Date: Fri, Nov 26, 2021 06:02 PM
> To: "ceph-users"<ceph-users@xxxxxxx>;
> Subject:  Rocksdb: Corruption: missing start of fragmented record(1)
>
> Dear Cephers,
>
> I just had one Ceph OSD node (Luminous 12.2.13) lose power unexpectedly, and after restarting that node, two OSDs out of 10 cannot be started and issue the following errors (excerpt below); in particular, I see
>
> Rocksdb: Corruption: missing start of fragmented record(1)
> Bluestore(/var/lib/ceph/osd/osd-21) _open_db erroring opening db:
> ...
> **ERROR: OSD init failed: (5)  Input/output error
>
> I checked the db/wal SSDs, and they are working fine. So I am wondering the following:
> 1) Is there a method to restore the OSDs?
> 2) What could be the potential causes of the corrupted db/wal? The db/wal SSDs have PLP and were not damaged during the power loss.
>
> Your help would be highly appreciated.
>
> best regards,
>
> samuel
>
>
>
>
> huxiaoyu@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



