Re: has anyone enabled bdev_enable_discard?

Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> · Sat, 2 Mar 2024 07:47:49 -0700

Periodic discard was actually attempted in the past:
https://github.com/ceph/ceph/pull/20723

A proper implementation would probably need appropriate
scheduling/throttling that can be tuned so as to balance against
client I/O impact.
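
As a rough illustration of what "tunable throttling" could mean here, below is a minimal token-bucket sketch. It is not Ceph code and not what the linked PR does; the class name `DiscardThrottle` and its parameters are hypothetical. The idea is that a background trimmer would only issue a discard when the bucket holds enough byte credit, and an operator would tune the rate and burst against client I/O.

```python
import time


class DiscardThrottle:
    """Token bucket capping discard throughput (hypothetical sketch, not Ceph code)."""

    def __init__(self, rate_bytes_per_sec, burst_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_sec)   # sustained discard budget
        self.capacity = float(burst_bytes)      # largest burst allowed
        self.tokens = float(burst_bytes)        # start with a full bucket
        self.clock = clock                      # injectable for testing
        self.last = clock()

    def _refill(self):
        # Add credit for the time elapsed since the last call, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, nbytes):
        """Return True and consume credit if nbytes may be discarded now."""
        self._refill()
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A caller that gets `False` would defer the extent and retry later, so discards back off on their own whenever the budget is exhausted.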

Josh

On Sat, Mar 2, 2024 at 6:20 AM David C. <david.casier@xxxxxxxx> wrote:
>
> Could we not consider setting up a “bluefstrim” which could be
> orchestrated?
>
> This would avoid having a continuous stream of (D)iscard instructions on
> the disks during activity.
>
> A weekly (probably monthly) bluefstrim could probably be enough for
> platforms that really need it.
>
>
> On Sat, Mar 2, 2024 at 12:58, Matt Vandermeulen <storage@xxxxxxxxxxxx>
> wrote:
>
> > We have a specific set of drives for which we've had to enable
> > bdev_enable_discard and bdev_async_discard in order to maintain
> > acceptable performance on block clusters. I wrote the patch that Igor
> > mentioned to try to send more discards to the devices in parallel,
> > but these drives in particular seem to process them serially
> > (based on observed discard counts and latency going to the device),
> > which is unfortunate. We're also testing new firmware that is supposed
> > to help alleviate some of the initial concerns we had about discards
> > not keeping up, which prompted the patch in the first place.
> >
> > Most of our drives do not need discards enabled (and definitely not
> > without async) in order to maintain performance, unless we're doing a
> > full-disk fio test or something similar to find the drive's
> > cliff profile. We've used OSD classes to target the options at
> > specific OSDs via centralized conf, so that when we add new hosts
> > that may have different drives the options aren't applied globally.
> >
> > Based on our experience, I wouldn't enable it unless you're seeing some
> > sort of cliff-like behaviour as your OSDs run low on free space or become
> > heavily fragmented. I would also deem bdev_async_discard = 1 to be a
> > requirement so that it doesn't block user IO. Also keep an eye on the
> > discards being sent to devices and on the discard latency (via
> > node_exporter, for example).
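
For hosts without node_exporter, the same signal can be approximated straight from the kernel counters. The sketch below assumes a Linux 4.18 or later `/proc/diskstats` layout, where the discard counters follow the eleven classic I/O fields; the function names are illustrative only. It derives the average per-discard latency between two samples of the same device line.

```python
def parse_discard_stats(line):
    """Extract the discard counters from one /proc/diskstats line.

    Assumes the Linux >= 4.18 layout: after major, minor and device name
    come 11 classic fields, then discards completed, discards merged,
    sectors discarded and milliseconds spent discarding.
    """
    fields = line.split()
    if len(fields) < 18:
        raise ValueError("no discard counters in this line (kernel older than 4.18?)")
    return {
        "device": fields[2],
        "discards": int(fields[14]),
        "sectors": int(fields[16]),
        "ms": int(fields[17]),
    }


def avg_discard_latency_ms(before, after):
    """Average latency per discard between two samples, or None if idle."""
    completed = after["discards"] - before["discards"]
    if completed <= 0:
        return None
    return (after["ms"] - before["ms"]) / completed
```

Sampling the line for an OSD's device a few seconds apart and feeding both samples to `avg_discard_latency_ms` gives a number that can be watched for the kind of climb described above.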
> >
> > Matt
> >
> >
> > On 2024-03-02 06:18, David C. wrote:
> > > I came across an enterprise NVMe used for BlueFS DB whose performance
> > > dropped sharply a few months after delivery (I won't mention the brand
> > > here, but it was not one of these three: Intel, Samsung, Micron).
> > > It is clear that enabling bdev_enable_discard impacted performance, but
> > > this option also saved the platform after a few days of discard.
> > >
> > > IMHO the most important thing is to validate the behavior once the
> > > entire flash media has been written.
> > > But this option has the merit of existing.
> > >
> > > It seems to me that the ideal would be not to have several
> > > bdev_*discard options, and for this task to be asynchronous, with the
> > > (D)iscard instructions issued during a calmer period of activity (I do
> > > not see any impact if the instructions are lost during an OSD reboot).
> > >
> > >
> > > On Fri, Mar 1, 2024 at 19:17, Igor Fedotov <igor.fedotov@xxxxxxxx>
> > > wrote:
> > >
> > >> I played with this feature a while ago and recall it had a visible
> > >> negative impact on user operations due to the need to submit tons of
> > >> discard operations - effectively, each data overwrite operation
> > >> triggers the submission of one or more discard operations to disk.
> > >>
> > >> And I doubt this has been widely used, if at all.
> > >>
> > >> Nevertheless recently we've got a PR to rework some aspects of thread
> > >> management for this stuff, see https://github.com/ceph/ceph/pull/55469
> > >>
> > >> The author said they needed this feature for their cluster, so you
> > >> might want to ask them about their user experience.
> > >>
> > >>
> > >> W.r.t. documentation - actually there are just two options:
> > >>
> > >> - bdev_enable_discard - enables issuing discards to disk
> > >>
> > >> - bdev_async_discard - controls whether discard requests are issued
> > >> synchronously (along with disk extents release) or asynchronously
> > >> (using a background thread).
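
To make the difference between the two modes concrete, here is a toy model of the control flow only. It is not BlueStore code; `ExtentReleaser`, `discard_fn` and `drain` are hypothetical names. In synchronous mode the discard runs in the caller's path when an extent is released; in asynchronous mode the extent is queued and a background thread issues the discard.

```python
import queue
import threading


class ExtentReleaser:
    """Toy model of synchronous vs. asynchronous discard (not BlueStore code)."""

    def __init__(self, discard_fn, async_discard):
        self.discard_fn = discard_fn          # callable(offset, length)
        self.async_discard = async_discard
        self.pending = queue.Queue()
        if async_discard:
            # Background thread drains the queue so releases never wait on the device.
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            offset, length = self.pending.get()
            try:
                self.discard_fn(offset, length)
            finally:
                self.pending.task_done()

    def release(self, offset, length):
        if self.async_discard:
            self.pending.put((offset, length))   # returns immediately
        else:
            self.discard_fn(offset, length)      # caller waits for the discard

    def drain(self):
        """Block until every queued discard has been issued."""
        self.pending.join()
```

In this model a slow `discard_fn` stalls `release` only in synchronous mode, which is the behaviour that makes the asynchronous option attractive for client latency.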
> > >>
> > >> Thanks,
> > >>
> > >> Igor
> > >>
> > >> On 01/03/2024 13:06, jsterr@xxxxxxxxxxxx wrote:
> > >> > Is there any update on this? Has anyone tested the option and
> > >> > collected performance numbers from before and after?
> > >> > Is there any good documentation regarding this option?
> > >> > _______________________________________________
> > >> > ceph-users mailing list -- ceph-users@xxxxxxx
> > >> > To unsubscribe send an email to ceph-users-leave@xxxxxxx