Re: has anyone enabled bdev_enable_discard?

We've had one specific set of drives for which we've had to enable bdev_enable_discard and bdev_async_discard in order to maintain acceptable performance on block clusters. I wrote the patch that Igor mentioned to try to send more discards to the devices in parallel, but these particular drives appear to process them serially (based on the observed discard counts and latency at the device), which is unfortunate. We're also testing new firmware that should alleviate some of the initial concerns about discards not keeping up, which prompted the patch in the first place.

Most of our drives do not need discards enabled (and definitely not synchronous ones) to maintain performance, unless we're running something like a full-disk fio test to find a drive's cliff profile. We've used OSD device classes to target these options at specific OSDs via the centralized config, so that when we add new hosts with different drives the options aren't applied globally; a sketch of that is below.
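For illustration, a minimal sketch of that per-class targeting, assuming the centralized config's CRUSH device-class mask and a hypothetical device class named "nvme":

    # Apply the discard options only to OSDs whose CRUSH device class is "nvme"
    ceph config set osd/class:nvme bdev_enable_discard true
    ceph config set osd/class:nvme bdev_async_discard true

OSDs in other device classes keep their defaults, so newly added hosts with different drives are unaffected.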

Based on our experience, I wouldn't enable it unless you're seeing cliff-like behaviour as your OSDs run low on free space, or when they are heavily fragmented. I would also consider bdev_async_discard = 1 a requirement, so that the discards don't block user IO. Keep an eye on the discards being sent to your devices and on the discard latency as well (via node_exporter, for example); one way to do that is sketched below.
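For example (a sketch, not prescriptive; the device name and port are placeholders, and the iostat discard columns need a reasonably recent sysstat and kernel):

    # Watch discards per second (d/s) and discard latency (d_await) live:
    iostat -dx 5 /dev/nvme0n1
    # Or scrape node_exporter's diskstats metrics for discard counts/time:
    curl -s localhost:9100/metrics | grep node_disk_discard

Rising discard latency is the signal that the drive isn't keeping up with the discards being sent to it.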

Matt


On 2024-03-02 06:18, David C. wrote:
I came across an enterprise NVMe drive used for the BlueFS DB whose
performance dropped sharply a few months after delivery (I won't name the
brand here, but it was not one of these three: Intel, Samsung, Micron).
Enabling bdev_enable_discard clearly impacted performance, but the option
also saved the platform after a few days of discarding.

IMHO the most important thing is to validate the behavior once the entire
flash media has been written to.
But this option has the merit of existing.

It seems to me that ideally there would not be several bdev_*discard
options; the task should be asynchronous, issuing the (D)iscard
instructions during a quieter period of activity (I see no impact if the
instructions are lost during an OSD reboot).


On Fri, Mar 1, 2024 at 19:17, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

I played with this feature a while ago and recall it had a visible
negative impact on user operations due to the need to submit tons of
discard operations - effectively, each data overwrite triggers the
submission of one or more discard operations to disk.

And I doubt this has been widely used, if at all.

Nevertheless, we recently received a PR that reworks some aspects of
thread management for this feature; see https://github.com/ceph/ceph/pull/55469

The author said they needed this feature for their cluster, so you might
want to ask them about their experience.


W.r.t. documentation - there are actually just two options:

- bdev_enable_discard - enables issuing discards to the disk

- bdev_async_discard - controls whether discard requests are issued
synchronously (along with the release of disk extents) or asynchronously
(using a background thread).
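To check what a given OSD is actually using, a quick sketch (osd.0 here
is just a placeholder):

    # Value stored in the centralized config database:
    ceph config get osd.0 bdev_enable_discard
    # Effective runtime values on the daemon, including overrides:
    ceph config show osd.0 | grep discard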

Thanks,

Igor

On 01/03/2024 13:06, jsterr@xxxxxxxxxxxx wrote:
> Is there any update on this? Has anyone tested the option and gathered
> performance numbers from before and after?
> Is there any good documentation regarding this option?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



