Re: Ceph Bluestore tweaks for Bcache

Hi Frank,
Yes, I think you have got to the crux of the issue.
> - some_config_value_hdd is used for "rotational=1" devices and
> - osd/class:hdd values are used for "device_class=hdd" OSDs,

The class is user-defined, and you can create your own class names.
By default the class is set to ssd for rotational=0 devices and hdd
for rotational=1 devices. I override this so that my OSDs end up in
the right pools, as my pools are class-based. I also have another
class called nvme for all NVMe storage.
So rotational=0 and class=ssd are actually disconnected and serve two
different purposes.
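
A minimal sketch of the two mechanisms, assuming a Linux host and
using OSD id 12 purely as a placeholder. The helper function names are
hypothetical; the "ceph osd crush" subcommands they print are the
standard ones for changing a device class.

```shell
#!/usr/bin/env bash
# Default device class derived from the kernel's rotational flag,
# mirroring the default mapping described above. (hypothetical helper)
default_class_for() {
    case "$1" in
        0) echo ssd ;;
        1) echo hdd ;;
        *) echo unknown ;;
    esac
}

# Print the commands that move an OSD into a different device class.
# An existing class has to be removed before a new one can be set.
reclass_cmds() {
    local osd_id="$1" new_class="$2"
    echo "ceph osd crush rm-device-class osd.${osd_id}"
    echo "ceph osd crush set-device-class ${new_class} osd.${osd_id}"
}

# Usage on a cluster node (device name and OSD id are placeholders):
#   default_class_for "$(cat /sys/block/bcache0/queue/rotational)"
#   reclass_cmds 12 hdd | sh    # run the printed commands for real
```

Changing the class this way only affects CRUSH placement; it does not
change what bluestore detects about the underlying device.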

> Or are you observing that an HDD+bcache OSD comes up in device class hdd *but* bluestore thinks it is an ssd and applies SSD defaults (some_config_value_ssd) *unless* you explicitly set the config option for device class hdd?

Yes, this is what I am observing, because I am manually changing the
device class to hdd.
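
One way to see the mismatch on a given OSD is to compare what the OSD
reports about its media with the class shown in the CRUSH tree. This
is a sketch only: the helper name is hypothetical, 12 is a placeholder
OSD id, and the field names are the ones I would expect in the
pretty-printed "ceph osd metadata" output, so check them against your
release.

```shell
#!/usr/bin/env bash
# Keep only the metadata fields that show how the OSD classified its
# disk. Reads `ceph osd metadata <id>` JSON on stdin, one field per
# line. (hypothetical helper)
media_fields() {
    grep -E '"(rotational|bluestore_bdev_rotational|bluestore_bdev_type|default_device_class)"'
}

# Usage on a cluster node:
#   ceph osd metadata 12 | media_fields
#   ceph osd tree        # the CLASS column shows the CRUSH device class
```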

> - OSD is prepared on HDD and put into device class hdd (with correct persistent prepare-time options)
> - bcache is added *after* OSD creation (???)
> - after this, on (re-)start the OSD comes up in device class hdd but bluestore thinks now its an SSD and uses some incorrect run-time config option defaults
> - to fix the incorrect run-time options, you explicitly copy some hdd-defaults to the config data base with filter "osd/class:hdd"

In my process, bcache is added before OSD creation, because bcache
creates a block device (/dev/bcache0, for example) and that device is
used as the data disk. As you have surmised, bluestore thinks my disks
are SSDs and applies settings accordingly. I set the class to hdd and
then correct the run-time settings based on the class.
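
The last step can be sketched as below. The helper name is
hypothetical, and bluestore_prefer_deferred_size is only an
illustrative option, not necessarily one of the settings I changed.
Note that "ceph config set" takes the option name and the value as
separate arguments. The sketch assumes that a non-zero generic option
takes precedence over its _hdd/_ssd variants, which is my
understanding of how bluestore picks these values.

```shell
#!/usr/bin/env bash
# Print a command that pins a run-time option for every OSD in a
# given device class, using the config database's class mask.
# (hypothetical helper)
class_config_cmd() {
    local class="$1" option="$2" value="$3"
    echo "ceph config set osd/class:${class} ${option} ${value}"
}

# Usage on a cluster node (option name is illustrative only):
#   val="$(ceph config get osd bluestore_prefer_deferred_size_hdd)"
#   class_config_cmd hdd bluestore_prefer_deferred_size "$val" | sh
#   ceph config dump | grep 'class:hdd'    # confirm what is stored
```

Reading the value from the _hdd variant rather than hard-coding it
keeps the sketch independent of per-release defaults.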

> There is actually an interesting follow up on this. With bcache/dm_cache large enough it should make sense to use SSD rocks-DB settings, because the data base will fit into the cache. Are there any recommendations for tweaking the prepare-time config options, in particular, the rocks-db options for such hybrid drives?

In my case this doesn't apply, as I have used volumes on the SSD
specifically for the DB, so I know the DB will always be on the fast
storage.
But yes, a larger cache may change the performance and bring it closer
to what Ceph expects from an SSD. In my experience, on bcache the SSD
settings made performance considerably worse than the HDD settings
(3x average latency).

Regards,
Rich

On Fri, 8 Apr 2022 at 02:03, Frank Schilder <frans@xxxxxx> wrote:
>
> Hi Richard,
>
> so you are tweaking run-time config values, not OSD prepare-time config values. There is something I don't understand here:
>
> > What I do for my settings is to set them for the hdd class (ceph config set osd/class:hdd bluestore_setting_blah=blahblah.
> > I think that's the correct syntax, but I'm not currently at a computer) in the config database.
>
> If the OSD comes up as class=hdd, then the hdd defaults should be applied any way and there is no point setting these values explicitly to their defaults. How do you make the OSD come up in class hdd, wasn't it your original problem that the OSDs came up in class ssd? Or are you observing that an HDD+bcache OSD comes up in device class hdd *but* bluestore thinks it is an ssd and applies SSD defaults (some_config_value_ssd) *unless* you explicitly set the config option for device class hdd?
>
> I think I am confused about the OSD device class, the drive type detected by bluestore and what options are used if there is a mis-match - if there is any. If I understand you correctly, it seems you observe that:
>
> - OSD is prepared on HDD and put into device class hdd (with correct persistent prepare-time options)
> - bcache is added *after* OSD creation (???)
> - after this, on (re-)start the OSD comes up in device class hdd but bluestore thinks now its an SSD and uses some incorrect run-time config option defaults
> - to fix the incorrect run-time options, you explicitly copy some hdd-defaults to the config data base with filter "osd/class:hdd"
>
> If this is correct, then I believe the underlying issue is that:
>
> - some_config_value_hdd is used for "rotational=1" devices and
> - osd/class:hdd values are used for "device_class=hdd" OSDs,
>
> which is not the same despite the string "hdd" indicating that it is.
>
> There is actually an interesting follow up on this. With bcache/dm_cache large enough it should make sense to use SSD rocks-DB settings, because the data base will fit into the cache. Are there any recommendations for tweaking the prepare-time config options, in particular, the rocks-db options for such hybrid drives?
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



