Re: [PATCH 0/2][RFC] block: default to deadline for SMR devices

On 5/31/18 00:45, Jeff Moyer wrote:
> Jens Axboe <axboe@xxxxxxxxx> writes:
>> But what's the regression? 4.15 had no zone write locking at all.
> 
> The zone write locking was done in the sd driver prior to 4.16.  See
> commit 39051dd85f287 ("scsi: sd: Remove zone write locking") for where
> it was removed.  That means these devices "just worked" with all I/O
> schedulers.

Yes they did "just work", but that was not an ideal solution either
because of the performance implications: sequential writes to a single
zone were stalling the dispatch queue while waiting for the dispatched
write to the locked zone to complete. That was not optimal at all (sure,
drive-side write caching was hiding this a bit, but still).

>>> Moving on, assuming your mind is made up...
>>>
>>> I'm not sure how much logic should go into the udev rule.  As mentioned,
>>> this limitation was introduced in 4.16, and Damien has plans to lift the
>>> restriction in future kernels.  Because distributions tend to cherry
>>> pick changes, making decisions on whether a feature exists based solely
>>> on kernel version is usually not a great thing.  My inclination would be
>>> to just always force deadline for host-managed SMR drives.  These drives
>>> aren't that popular, after all.  Any opinions on this?
>>
>> The problem is that it's tied to an IO scheduler, which ends up causing
>> issues like this, since users are free to select a different scheduler.
>> Then things break. Granted, in this case, some extraordinarily shitty
>> hardware even broke. That is on the hardware, not the kernel, that
>> kind of breakage should not occur.
> 
> If the firmware problem was widespread, I think we'd try to avoid it.  I
> have no reason to believe that is the case, though.

This is the first time in my career that I have heard of a disk breaking
a system BIOS. I will notify our system test lab to investigate this.
Jeff, let's take this discussion off-list since it is not kernel related.

> Damien made the argument that the user should be able to select an I/O
> scheduler that doesn't perform the write locking, because a well-behaved
> application could theoretically make use of it.  I think this is a weak
> argument, given that dm-zoned doesn't even support such a mode.

Yes, a little weak. That is definitely not the main use case I am seeing
with customers. That said, these drives are starting to be used with
other features enabled, such as I/O priorities. Considering their size,
this is a very interesting feature for controlling access latency.
Deadline and mq-deadline will not act on I/O priorities, but cfq (and
bfq ?) will. Potentially better results can be achieved, but since these
schedulers do not support zone write locking, the application needs to
be careful with its per-zone write queue depth to avoid tripping over
unintended kernel-level command reordering. I see this as a valid enough
use case to not lock down the scheduler to deadline only, and to allow
other schedulers too. Yet, deadline should be the default until an
application asks for something else.
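To illustrate the point about per-zone write queue depth, here is a
minimal sketch (hypothetical, not code from the patch series) of the
kind of application-level guard needed when running without zone write
locking: keep at most one write in flight per zone, so writes to a
host-managed zone cannot be reordered below the scheduler.

```python
import threading
from collections import defaultdict

class ZoneWriteLimiter:
    """Serialize writes per zone: enforce a per-zone write queue depth of 1."""

    def __init__(self):
        # One lock per zone, created lazily on first use.
        self._locks = defaultdict(threading.Lock)

    def write(self, zone, do_write):
        # Holding the zone lock guarantees a single outstanding write per
        # zone, which is what avoids unintended command reordering when
        # the scheduler does not do zone write locking for us.
        with self._locks[zone]:
            return do_write()

limiter = ZoneWriteLimiter()
log = []
limiter.write(3, lambda: log.append(("zone3", 0)))
limiter.write(3, lambda: log.append(("zone3", 1)))
```

A real application would issue the writes from multiple threads and hold
the zone lock until the write completes, but the invariant is the same.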

dm-zoned or f2fs (and btrfs in the lab too) "assume" that the underlying
stack does the right thing. That is of course true (for now) only if the
deadline scheduler is selected. A sane default set at device
initialization would be nice to have and would avoid potential headaches
with rule ordering with regard to component initialization (not to
mention that it would make booting from these disks possible).
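As a sketch of what such a check looks like from userspace: the active
scheduler is the bracketed entry in the sysfs scheduler attribute, so a
script handing a disk to dm-zoned or f2fs could verify it first (the
helper name here is hypothetical).

```shell
# Extract the active scheduler from a sysfs queue/scheduler string;
# the active entry is the one shown in brackets.
active_sched() {
    # e.g. "noop [deadline] cfq" -> "deadline"
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

active_sched "noop [deadline] cfq"   # -> deadline
```

In practice the input would come from
`cat /sys/block/sdX/queue/scheduler`, and anything other than deadline
(or mq-deadline) on a host-managed drive would be cause to bail out.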

> I definitely see this udev rule as a temporary workaround.

I agree. In fact I see the deadline based zone write locking itself as a
temporary workaround. For now, I do not see any other clean method that
covers both the mq and legacy paths. Considering only mq, we discussed
interesting possibilities at LSFMM using dedicated write queues. That
could be handled generically and remove the dependency on the scheduler,
while also extending the support to open-channel SSDs.

My guess is that no major change for this write locking will happen on
the legacy path, which hopefully will go away soon (?). But there are
options forward with blk-mq.

>> So now we're stuck with this temporary situation which needs a work-around.
>> I don't think it's a terrible idea to have a rule that just sets
>> deadline/mq-deadline for an SMR device regardless of what kernel it is
>> running on. It'll probably never be a bad default.

I agree. But since there are other kernel components (dm-zoned, FSes and
the entire fs/block-dev.c direct I/O write path) that depend on the
scheduler being set to something sane, setting it early in device
initialization, before the disk is grabbed by an FS or a device mapper,
would definitely be nice to have.

> 
> OK.  Barring future input to the contrary, I'll work to get updates into
> fedora, at least.  I've CC'd Colin and Hannes.  I'm not sure who else to
> include.
> 
> FYI, below is the udev rule Damien had provided to Bryan.  I'm not sure
> about the KERNEL=="sd[a-z]" bit, that may need modification.  Note: I'm
> no udev expert.

It probably needs to be something like KERNEL=="sd*" to allow more than
26 drives.
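With that change applied, the rule might look like this (a sketch only;
the match keys otherwise follow the rule quoted below, and as Jeff notes
it may still need adjustment):

```
ACTION=="add|change", KERNEL=="sd*", \
  ATTRS{queue/zoned}=="host-managed", ATTR{queue/scheduler}="deadline"
```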

Best regards.

> 
> Cheers,
> Jeff
> 
> ACTION=="add|change", KERNEL=="sd[a-z]",
> ATTRS{queue/zoned}=="host-managed", ATTR{queue/scheduler}="deadline"
> 

-- 
Damien Le Moal,
Western Digital



