Re: OSDs unable to mount BlueFS after reboot

On Thu, Sep 16, 2021 at 08:17:48AM +0200, Stefan Kooman wrote:
> On 9/16/21 00:09, Davíð Steinn Geirsson wrote:
> 
> > > You might get more information with increasing debug for rocksdb / bluefs /
> > > bluestore
> > > 
> > > ceph config set osd.0 debug_rocksdb 20/20
> > > ceph config set osd.0 debug_bluefs 20/20
> > > ceph config set osd.0 debug_bluestore 20/20
> > 
> > These debug tunables give a lot more output and strongly suggest a
> > corrupt RocksDB. The whole OSD log is quite large, so I pasted only the
> > last part, from the BlueFS journal replay onwards:
> > https://paste.debian.net/1211916/
> > 
> > I am concerned by this incident, as I know I brought the machine down
> > cleanly, and the logs suggest all OSDs were gracefully terminated. Also,
> > none of the drives are reporting any uncorrectable sectors (though I
> > know not to put too much stock in drive error reporting). In this case
> > there is sufficient redundancy to recover everything, but if the same
> > had happened on other hosts as well, that would not be the case. I'll
> > put the affected drives under a microscope, keep the OSD around for
> > research just in case, and keep digging in the hope of finding an
> > explanation.
> 
> Yeah, these kinds of incidents suck. You think your data is safe, and then
> all of a sudden it does not seem to be. Pretty weird. What kind of drives do
> you use? I wonder if it could be a cache somewhere that didn't get flushed
> (in time).

The 4 affected drives are of 3 different types from 2 different vendors:
ST16000NM001G-2KK103
ST12000VN0007-2GS116
WD60EFRX-68MYMN1

They are all connected through an LSI2308 SAS controller in IT mode. Other
drives that did not fail are also connected to the same controller.

There are no expanders in this particular machine, only a direct-attach
SAS backplane.
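On the cache theory: if a drive's volatile write cache was enabled and not
flushed before power-off, even a clean OS shutdown doesn't guarantee the last
writes hit the platters. A quick way to check (device names below are
placeholders, substitute your own; sdparm applies to the SAS drives, hdparm
to SATA drives behind the HBA):

```shell
# Query the volatile write cache state (smartctl handles both SAS and SATA)
smartctl -g wcache /dev/sda

# SAS drives: WCE bit in the Caching mode page (1 = write cache enabled)
sdparm --get=WCE /dev/sda

# SATA drives: current write-caching setting
hdparm -W /dev/sda

# If you suspect unflushed-cache corruption, disable the cache:
sdparm --set=WCE=0 --save /dev/sda   # SAS; --save persists across power cycles
hdparm -W 0 /dev/sda                 # SATA; not persistent on all drives
```

Disabling the write cache costs write latency, but BlueStore issues its own
flushes anyway, so on a well-behaved drive it shouldn't be necessary; it's
mainly useful to rule the cache in or out as the culprit.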

> 
> > 
> > Thank you very much for your assistance Stefan, it's helped me a lot
> > getting to know the debugging features of ceph better.
> 
> YW.
> 
> Gr. Stefan

Regards,
Davíð
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
