This may sound counter-intuitive, but you need to disable the write cache so that only the PLP cache is used. SSDs with PLP usually have two types of cache, volatile and non-volatile. The volatile cache loses its data on power loss. It is the volatile cache that gets disabled when you issue the hdparm/sdparm/smartctl command to switch the write cache off. In many cases this can increase the non-volatile cache and also performance. It is the non-volatile cache you want your writes to go to directly.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx>
Sent: 26 November 2021 22:41:10
To: YiteGu; ceph-users
Subject: Re: Rocksdb: Corruption: missing start of fragmented record(1)

wal/db are on Intel S4610 960GB SSDs, with PLP and write back on.

huxiaoyu@xxxxxxxxxxxx

From: YiteGu
Date: 2021-11-26 11:32
To: huxiaoyu@xxxxxxxxxxxx; ceph-users
Subject: Re: Rocksdb: Corruption: missing start of fragmented record(1)

It looks like your wal/db device lost data. Please check whether your wal/db device has a writeback cache enabled. If it does, the power loss could have caused data loss, and the log replay then fails when RocksDB restarts.

YiteGu
ess_gyt@xxxxxx

------------------ Original ------------------
From: "huxiaoyu@xxxxxxxxxxxx" <huxiaoyu@xxxxxxxxxxxx>;
Date: Fri, Nov 26, 2021 06:02 PM
To: "ceph-users"<ceph-users@xxxxxxx>;
Subject: Rocksdb: Corruption: missing start of fragmented record(1)

Dear Cephers,

One of my Ceph OSD nodes (Luminous 12.2.13) just lost power unexpectedly. After restarting that node, two OSDs out of 10 cannot be started and issue the following errors (see image below). In particular, I see:

Rocksdb: Corruption: missing start of fragmented record(1)
Bluestore(/var/lib/ceph/osd/osd-21) _open_db erroring opening db:
...
**ERROR: OSD init failed: (5) Input/output error

I checked the db/wal SSDs, and they are working fine. So I am wondering the following:

1) Is there a method to restore the OSDs?
2) What could be the potential causes of the corrupted db/wal? The db/wal SSDs have PLP and were not damaged during the power loss.

Your help would be highly appreciated.

best regards,

samuel

huxiaoyu@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
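
For readers following the advice at the top of this thread (switch off the volatile write cache with hdparm, sdparm or smartctl), here is a minimal Python sketch of what that could look like. It is an illustration under assumptions, not something taken from the thread. The helper names `cache_mode` and `disable_commands` and the device name `sdX` are hypothetical. The flags shown are the ones documented in the tools' man pages as I understand them, and the sysfs attribute `queue/write_cache` reports the Linux kernel's view of the cache, which may lag behind the drive's actual setting. The script only prints the commands and does not run them.

```python
from pathlib import Path


def cache_mode(device, sysfs_root="/sys/block"):
    """Return the kernel's view of the device cache.

    On Linux this attribute reads 'write back' or 'write through'.
    `device` is a bare block device name such as 'sda' (hypothetical here).
    """
    path = Path(sysfs_root, device, "queue", "write_cache")
    return path.read_text().strip()


def disable_commands(device):
    """Build (but do not run) commands that switch the volatile write cache off.

    Assumption: flags as documented in the hdparm, sdparm and smartctl man pages.
    Pick the one tool that suits the drive (ATA vs. SCSI/SAS).
    """
    dev = f"/dev/{device}"
    return {
        "hdparm": ["hdparm", "-W", "0", dev],
        "sdparm": ["sdparm", "--clear=WCE", dev],
        "smartctl": ["smartctl", "--set=wcache,off", dev],
    }


if __name__ == "__main__":
    # 'sdX' is a placeholder; substitute the real wal/db device name.
    for tool, argv in disable_commands("sdX").items():
        print(f"{tool}: {' '.join(argv)}")
```

After running one of the printed commands as root, `cache_mode("sdX")` (or the tool's own query option) can be used to check whether the setting took effect. Whether the setting survives a power cycle depends on the drive, so it is worth re-checking after a reboot.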