Re: BlueStore not surviving power outage


On 28/04/2021 09:13, Janne Johansson wrote:
> On Wed, 28 Apr 2021 at 04:25, Xuehan Xu <xxhdx1985126@xxxxxxxxx> wrote:
>> There is a RAID HBA in each of the machines in our clusters, to which
>> all SATA disks are attached. We configured the RAID HBA cache mode to
>> "write through", but, as I checked yesterday, the BBUs of the RAID HBAs
>> are not charged. I'm not sure whether the BBU has anything to do with
>> the data loss. As far as I know, in "write through" mode all data
>> should be persisted to the underlying disk before the write is
>> acknowledged to the upper layers. Am I missing anything? Thanks :-)
>
> If the RAID card were good, it would change its caching strategy when/if
> the BBU has no power left. But if it didn't, and instead went by "it was
> good when we last booted up", then it may have 'promised' that writing
> to BBU-backed RAM was enough to ack writes that were not yet on disk.
> When the BBU failed (for whatever reason), that promise was not honored
> and lots of writes were lost.


In addition: in 2020 I saw two cases (as a Ceph consultant) of severe data corruption with BlueStore after a power failure.

In both cases this happened on systems where an HBA was involved. In the end we blamed the HBAs, which were in RAID mode.

Afterwards I did extensive power failure testing on NVMe-only systems and on systems with HBAs in JBOD mode, and I was never able to reproduce the data corruption after a power failure.

My suspicion is still that the HBAs were caching some data which had not been written to the medium before the power failed, although BlueStore was told it had been.
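
For anyone who wants to check their own hardware for this kind of lost-ack behaviour, here is a minimal sketch of a write-and-verify test. It is not the tooling used in the tests described above, and the names (write_records, verify_records, the record layout) are made up for illustration. The idea: append small checksummed records, fsync after each one, and treat a record as acknowledged only once fsync has returned. After pulling the power and rebooting, every acknowledged record must still be present and intact. In a real test the last acknowledged sequence number has to be reported to another machine (for example printed over SSH), since anything kept locally dies with the power.

```python
import os
import struct
import zlib

# Hypothetical record layout: 64-bit sequence number + CRC32 of that number.
RECORD = struct.Struct("<QI")


def write_records(path, count):
    """Append `count` records, fsync after each; return the last acked sequence number."""
    last_acked = -1
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for seq in range(count):
            payload = struct.pack("<Q", seq)
            os.write(fd, RECORD.pack(seq, zlib.crc32(payload)))
            # Once fsync returns, the storage stack has claimed the data is durable.
            os.fsync(fd)
            last_acked = seq
    finally:
        os.close(fd)
    return last_acked


def verify_records(path, last_acked):
    """Return the acked sequence numbers that are missing or corrupt in the file."""
    with open(path, "rb") as f:
        data = f.read()
    bad = []
    for seq in range(last_acked + 1):
        chunk = data[seq * RECORD.size:(seq + 1) * RECORD.size]
        if len(chunk) < RECORD.size:
            bad.append(seq)  # record was acked but never reached the medium
            continue
        got_seq, crc = RECORD.unpack(chunk)
        if got_seq != seq or crc != zlib.crc32(struct.pack("<Q", seq)):
            bad.append(seq)  # record is present but damaged
    return bad
```

If verify_records returns a non-empty list after a power cut, something below the filesystem acknowledged a flush it did not actually complete. An empty list proves nothing on its own; the test needs many repetitions under load.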

My bet: this is the HBA's fault, not BlueStore's.

Wido