Re: bluestore_cache_size_ssd and bluestore_cache_size_hdd default values

On 03/16/2018 05:08 PM, Sage Weil wrote:
> On Fri, 16 Mar 2018, Wido den Hollander wrote:
>> Hi,
>>
>> The config values bluestore_cache_size_ssd and bluestore_cache_size_hdd
>> determine how much memory an OSD running with BlueStore will use for caching.
>>
>> By default the values are:
>>
>> bluestore_cache_size_ssd = 3GB
>> bluestore_cache_size_hdd = 1GB
>>
>> I've seen some cases recently where users migrated from FileStore to
>> BlueStore and had the OOM-killer come along during backfill/recovery
>> situations. These are the situations where OSDs require more memory.
>>
>> It's not uncommon to find servers with:
>>
>> - 8 SSDs and 32GB RAM
>> - 16 SSDs and 64GB RAM
>>
>> With FileStore it was sufficient since the page cache did all the work,
>> but with BlueStore each OSD has its own cache, which isn't shared.
>>
>> In addition there is the regular memory consumption and the overhead of
>> the cache.
>>
>> I also don't understand the idea behind the values. As HDDs are slower,
>> they usually require more cache than SSDs, so I'd expect the values to be
>> flipped.
>>
>> My recommendation would be to lower the value to 1GB to prevent users
>> from having a bad experience when going from FileStore to BlueStore.
>>
>> I have created a pull request for this:
>> https://github.com/ceph/ceph/pull/20940
>>
>> Opinions, experiences, feedback?
> 
> The thinking was that bluestore requires some deliberate thought 
> and tuning of the cache size, so we may as well pick defaults that make 
> sense.  Since the admin is doing the filestore -> bluestore conversion, 
> that is the point where they consider the memory requirement and adjust 
> the config as necessary.
> 

I understand the thinking, but I think it doesn't apply.
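
To make it concrete for the 8-SSD/32GB machines mentioned above, a
back-of-the-envelope (the ~1GB per-OSD baseline outside the cache is my
assumption; actual usage varies):

  8 OSDs x 3GB bluestore_cache_size_ssd   = 24GB of cache
  8 OSDs x ~1GB baseline + cache overhead = ~8GB
  ------------------------------------------------------
  total                                   = ~32GB on a 32GB machine

That leaves no headroom at all for recovery, the kernel or anything else
on the box.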

> As for why the defaults are different, the SSDs need a larger cache to 
> capture the SSD performance, and the nodes that have them are likely to be 
> "higher end" and have more memory.  The idea is the minimize the number 
> of people that will need to adjust their config.
> 

Is 3GB really required for an OSD? Or might 2GB also work?

> Perhaps the missing piece here is that the filestore->bluestore conversion 
> doc should have a section about memory requirements and tuning 
> bluestore_cache_size accordingly?  If we just reduce the default to 
> satisfy the lowest common denominator we'll kill performance for the 
> majority that has more memory.
> 

Are 8 OSDs on 32GB really that low-end? If we look at the docs:
http://docs.ceph.com/docs/master/start/hardware-recommendations/#ram

"OSDs do not require as much RAM for regular operations (e.g., 500MB of
RAM per daemon instance); however, during recovery they need
significantly more RAM (e.g., ~1GB per 1TB of storage per daemon).
Generally, more RAM is better."
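
Going by that rule of thumb, 8 OSDs would need around 8 x 500MB = 4GB
for regular operations and (assuming 1TB SSDs, just as an example)
roughly 8 x 1GB = 8GB during recovery. With the current BlueStore
defaults the cache alone already takes 8 x 3GB = 24GB, three times what
the documented guidance suggests provisioning for recovery.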

So somebody whose machines were running just fine with FileStore for
the last 2 years doesn't expect that, when switching to BlueStore, they
have to look into this.

They are, however, faced with the OOM-killer and with frustrated people
in their organization, who at that point blame BlueStore.

I've been called in for these kinds of situations a few times in the
last months, and in all cases I had to lower the BlueStore cache size.
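
In those cases the fix was along these lines (a sketch; the value is in
bytes, 1073741824 = 1GB, and I wouldn't count on it applying without an
OSD restart, so persist the same value in ceph.conf as well):

  # lower the per-OSD BlueStore cache on the running OSDs
  ceph tell osd.* injectargs '--bluestore_cache_size_ssd 1073741824'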

Yes, I do agree that people should read the Release Notes, but is that
really sufficient? Not everybody will do that.

I'd say:

- Lower cache size for SSD to 1GB or 2GB
- Update the docs to tell people to increase the cache to improve
performance (see the ceph.conf sketch below)
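
For the docs part, the example I have in mind would be something like
this in ceph.conf (values in bytes and purely illustrative):

  [osd]
  # per-OSD BlueStore cache; raise it on nodes with memory to spare
  bluestore_cache_size_ssd = 2147483648  # 2GB
  bluestore_cache_size_hdd = 1073741824  # 1GB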

The OOM-killer can set off a true snowball effect in a cluster, which
is a serious issue for people. I'd rather have slightly lower
performance than daemons going down.

Wido

> sage
> 