Re: [PATCH] dax: remove the pmem_dax_ops->flush abstraction

Dan Williams <dan.j.williams@xxxxxxxxx> · Wed, 6 Sep 2017 08:46:55 -0700

[ resend in plain text ]

On Tue, Sep 5, 2017 at 11:29 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
>
> On Thu, 31 Aug 2017, Dan Williams wrote:
>
>> On Thu, Aug 31, 2017 at 8:04 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>> >
>> >
>> > On Thu, 31 Aug 2017, Dan Williams wrote:
>> >
>> >> On Thu, Aug 31, 2017 at 6:47 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>> >> > The patch abebfbe2f731 ("dm: add ->flush() dax operation support") is
>> >> > buggy. A dm device may be composed of multiple underlying devices and all
>> >> > of them need to be flushed. The patch just routes the flush request to the
>> >> > first device and ignores the other devices.
>> >> >
>> >> > It could be fixed by adding more complex logic to the device mapper. But
>> >> > there is only one implementation of the method pmem_dax_ops->flush - that
>> >> > is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we
>> >> > don't need the pmem_dax_ops->flush abstraction at all, we can call
>> >> > arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush
>> >> > couldn't ever reach anything different from arch_wb_cache_pmem().
>> >>
>> >> Unfortunately, this is not true, see usage of DAXDEV_WRITE_CACHE. This
>> >> is for platforms that arrange for cpu caches to be flushed on
>> >> power-fail, like standalone storage appliances, where it would be a
>> >
>> > BTW. if the system is capable of flushing caches on power failure, it is
>> > system-wide feature, it is not per-device feature. The CPU caches may
>> > store data for any persistent memory device.
>> >
>> > You either have enough remaining power to flush the CPU caches or not.
>>
>> Whether you need to flush caches or not can be a per-address range
>> attribute and can change over time.
>
> There is only one way to flush the cache (the clwb instruction), you don't
> need a method that abstracts over this instruction. A simple flag or
> static key that skips clwb() is sufficient.
>
> Linux kernel API is changeable, so you don't need to overdesign the API
> for future. If it ever happens that we have a system with two types of
> persistent memory that need different instructions to flush them, we can
> design a proper abstraction. But there is no reason to design the
> abstraction in advance for a situation that will perhaps never happen.

True, it is over designed and we can cross that complexity bridge in
the future when / if it arises.

Since dm is already checking dax_write_cache_enabled() to set its own
flush enabled bit I think your patch looks good. You can add:

Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel