Re: 3 OSDs can not be started after a server reboot - rocksdb Corruption

Hi Igor,

today (21-02-2022) at 13:49:28.452+0100 I crashed OSD 7 again, and this time I have logs with “debug bluefs = 20” and “debug bdev = 20” for every OSD in the cluster! It was the OSD with ID 7 again, so this HDD has now failed for the third time. Coincidence? Probably not…
The important point seems to be that the entire cluster is shut down, not merely restarted: this time the OSD failed after just 4 shutdowns of all cluster nodes within 70 minutes.
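
For reference, these debug levels can be raised for all OSDs either in ceph.conf on each node, e.g.

    [osd]
        debug bluefs = 20
        debug bdev = 20

or at runtime via the monitors, e.g.

    ceph config set osd debug_bluefs 20
    ceph config set osd debug_bdev 20

(This is only a rough sketch of the usual ways to set these options; the exact commands I used may have differed. 20 is the maximum verbosity for these subsystems.)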

I had redeployed OSD.7 after the crash from two days ago, and I started this new shutdown-and-boot series earlier today, shortly after Ceph had finished writing everything back to OSD.7.

The corrupted RocksDB file from the crash is again only 2 KB in size.
You can download the RocksDB file with the bad table magic number, together with the log of OSD.7, via this link: https://we.tl/t-e0NqjpSmaQ
What else do you need?

