Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

Reed Dier <reed.dier@xxxxxxxxxxx> · Wed, 24 Jun 2020 11:06:09 -0500

Just throwing my hat in here with a small bit of anecdotal experience.

In the early days of experimenting with ceph, I had 24x 8TB disks, all behind RAID controllers as R0 VDs with no BBU (so the controller cache was WT, the default), and pdcache (disk write cache) enabled (also the default).

We had a lightning strike at our previous data center that killed power, and we ended up losing the entire ceph pool (not prod), due mostly to the pdcache setting.

We then did exhaustive failure testing following that, which further isolated pdcache as the culprit, and not the controller's write cache. The controllers now have BBUs to further prevent issues, but WB cache with the BBU did not cause any problems in testing; only pdcache did.

So, all of this to say, in my experience, the on-disk write cache was a huge liability for losing writes.
This was also in the filestore days, and most of our issues were with XFS, but the point remains.

Write cache can be a consistency killer, and I recommend disabling it where possible.
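
For anyone wanting to try this on directly attached SATA drives, here is a minimal sketch in Python that builds (and optionally runs) `hdparm -W 0` for a list of devices. The function names and the /dev/sdX placeholders are made up for illustration. It assumes hdparm is installed and that the drives are not hidden behind a RAID controller; in that case pdcache has to be changed with the controller's own tool instead. The setting may also not survive a power cycle on some drives, so it is worth re-checking after a reboot.

```python
import shutil
import subprocess
from typing import List


def write_cache_off_cmd(device: str) -> List[str]:
    """Build the hdparm command that turns off a drive's volatile write cache."""
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a /dev/... path, got {device!r}")
    # `hdparm -W 0 <device>` disables the drive's write-caching feature.
    return ["hdparm", "-W", "0", device]


def disable_write_cache(devices: List[str], dry_run: bool = True) -> List[List[str]]:
    """Return one command per device; execute them only when dry_run is False."""
    cmds = [write_cache_off_cmd(d) for d in devices]
    if not dry_run:
        if shutil.which("hdparm") is None:
            raise RuntimeError("hdparm not found in PATH")
        for cmd in cmds:
            # Needs root; check=True raises if hdparm reports an error.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # Placeholder device names; replace with the real OSD data disks.
    for cmd in disable_write_cache(["/dev/sdX", "/dev/sdY"]):
        print(" ".join(cmd))
```

By default this is a dry run that only prints the commands, so they can be reviewed before calling disable_write_cache(..., dry_run=False) as root.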

Reed

> On Jun 24, 2020, at 10:30 AM, Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> 
> Has anyone ever encountered a drive with a write cache that actually
> *helped*?
> I haven't.
> 
> As in: would it be a good idea for the OSD to just disable the write cache
> on startup? Worst case it doesn't do anything, best case it improves
> latency.
> 
> Paul
> 
> -- 
> Paul Emmerich
> 
> Looking for help with your Ceph cluster? Contact us at https://croit.io
> 
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
> 
> 
> On Wed, Jun 24, 2020 at 3:49 PM Frank R <frankaritchie@xxxxxxxxx> wrote:
> 
>> fyi, there is an interesting note on disabling the write cache here:
>> 
>> 
>> https://yourcmc.ru/wiki/index.php?title=Ceph_performance&mobileaction=toggle_view_desktop#Drive_cache_is_slowing_you_down
>> 
>> On Wed, Jun 24, 2020 at 9:45 AM Benoît Knecht <bknecht@xxxxxxxxxxxxx>
>> wrote:
>>> 
>>> Hi Igor,
>>> 
>>> Igor Fedotov wrote:
>>>> for the sake of completeness one more experiment please if possible:
>>>> 
>>>> turn off write cache for HGST drives and measure commit latency once again.
>>> 
>>> I just did the same experiment with HGST drives, and disabling the write cache
>>> on those drives brought the latency down from about 7.5ms to about 4ms.
>>> 
>>> So it seems disabling the write cache across the board would be advisable in
>>> our case. Is it recommended in general, or specifically when the DB+WAL is on
>>> the same hard drive?
>>> 
>>> Stefan, Mark, are you disabling the write cache on your HDDs by default?
>>> 
>>> Cheers,
>>> 
>>> --
>>> Ben
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx