Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

Actually-- I give up.

You've initially bounced every single one of my patches, even the ones
that fix crash & data corruption bugs.

I spend 10x as much time fighting with you as writing stuff for
bcache, and basically every time it's turned out that you're wrong.

I will go do something else where it is just not so much work to get
something done.

Mike

On Sat, Sep 30, 2017 at 12:13 AM, Michael Lyle <mlyle@xxxxxxxx> wrote:
> Coly--
>
>
> On Fri, Sep 29, 2017 at 11:58 PM, Coly Li <i@xxxxxxx> wrote:
>> On 2017/9/30 at 11:17 AM, Michael Lyle wrote:
>> [snip]
>>
>> If writeback_rate is not at its minimum value, it means front-end
>> write requests exist.
>
> This is wrong.  Otherwise we'd never make it back to the target.
>
>> In this case, backend writeback I/O should yield I/O throughput to
>> front-end I/O. Otherwise, applications will observe increased I/O
>> latency, especially when the dirty percentage is not very high. For
>> enterprise workloads, this change hurts performance.
>
> No, utilizing less of the disk's throughput for a given writeback
> rate improves workload latency.  :P  The maximum that can be
> aggregated in this way is bounded more tightly than in the old code...
>
>> A desired behavior for a low-latency enterprise workload is: when the
>> dirty percentage is low, once there is front-end I/O, backend
>> writeback should drop to the minimum rate. This patch will introduce
>> unstable and unpredictable I/O latency.
>
> Nope.  It lowers disk utilization overall, and the amount of disk
> bandwidth any individual request chunk can use is explicitly bounded
> (unlike before, where it was not).
>
>> Unless writeback seeking is a performance bottleneck, enterprise
>> users at least will focus more on front-end I/O latency ...
>
> The less writeback seeks the disk, the more the real workload can
> seek the disk!  And at high writeback rates, the disk spends the vast
> majority of its time seeking!
>
>> This method is helpful only when writeback I/Os are not issued
>> continuously; otherwise, if they are issued within slice_idle, the
>> underlying elevator will reorder or merge the I/Os into larger
>> requests.
>
> We have a subsystem in place to rate-limit and make sure that they are
> not issued continuously!  If you want to preserve good latency for
> userspace, you want to keep the system in the regime where that
> control system is effective!
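>
> (For illustration only -- a generic token-bucket-style sketch of
> spacing out writeback I/Os, not bcache's actual rate-control code;
> all names here are hypothetical:)
>
>     #include <linux/jiffies.h>
>     #include <linux/kernel.h>
>     #include <linux/sched.h>
>
>     struct wb_rate {
>         unsigned long next_io;     /* jiffies when next I/O may issue */
>         unsigned int ios_per_sec;  /* target writeback rate */
>     };
>
>     /* Sleep until the next writeback I/O is due, then schedule the
>      * one after it -- I/Os get spaced out at the target rate instead
>      * of being issued back to back. */
>     static void wb_rate_wait(struct wb_rate *r)
>     {
>         while (time_before(jiffies, r->next_io))
>             schedule_timeout_interruptible(r->next_io - jiffies);
>         r->next_io = jiffies + max(1u, HZ / r->ios_per_sec);
>     }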
>
>> Hmm, if you move the dirty I/O from the btree into a dirty_io list
>> and then perform the I/O, there is a risk that dirty data is lost if
>> the machine powers down during writeback. If you continuously issue
>> dirty I/O and remove it from the btree at the same time, that means
>> you will introduce more latency to front-end I/O...
>
> No... this doesn't change the consistency properties.  I am just
> saying-- if we have up to 5 contiguous things that we're going to
> write back, wait till they're all read; plug; dispatch the writes;
> unplug.
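>
> (A minimal sketch of that pattern, for illustration -- the helper
> name and the bios[] array are hypothetical, not the actual patch,
> which works on bcache's keybuf entries:)
>
>     #include <linux/bio.h>
>     #include <linux/blkdev.h>
>
>     static void dispatch_contiguous_writes(struct bio *bios[],
>                                            unsigned int nr)
>     {
>         struct blk_plug plug;
>         unsigned int i;
>
>         blk_start_plug(&plug);      /* queue I/Os on a per-task list */
>         for (i = 0; i < nr; i++)    /* nr is bounded (<= 5 here) */
>             generic_make_request(bios[i]);
>         blk_finish_plug(&plug);     /* submit the whole batch at once */
>     }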
>
>> And the plug list will be unplugged automatically by default when a
>> context switch happens. If you perform read I/Os on the btrees, a
>> context switch is likely to happen, so you won't keep a large bio
>> list ...
>
> Thankfully we'll have 5 things to fire immediately after each other,
> so we don't need to worry about automatic unplug.
>
>> IMHO, when the writeback rate is low, and especially when the backing
>> hard disk is not the bottleneck, grouping contiguous I/Os in the
>> bcache code does not help writeback performance much. The only
>> benefit is less I/O issued when front-end I/O is low or idle, and
>> most users do not care about that, especially enterprise users.
>
> I disagree!
>
>>> I believe patch 4 is useful on its own, but I have this and other
>>> pieces of development that depend upon it.
>>
>> The current bcache code works well under most writeback loads; I just
>> worry that implementing an elevator in the bcache writeback logic is
>> a big investment with little return.
>
> Bcache already implements a (one-way) elevator!  Bcache already
> invests **significant effort** in doing writebacks in LBA order (to
> effectively short-stroke the disk), but because of the different
> arrival times of the read requests, the writes can end up reordered.
> This is bad.  This is a bug.
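>
> (Again purely illustrative -- a toy sketch of the invariant being
> argued for, writes going out in LBA order even when the reads
> completed out of order; struct wb_io and its fields are
> hypothetical:)
>
>     #include <linux/bio.h>
>     #include <linux/blkdev.h>
>     #include <linux/sort.h>
>
>     struct wb_io {
>         sector_t sector;    /* backing-device LBA */
>         struct bio *bio;    /* write bio, data already read back */
>     };
>
>     static int wb_io_cmp(const void *a, const void *b)
>     {
>         const struct wb_io *x = a, *y = b;
>
>         return x->sector < y->sector ? -1 : x->sector > y->sector;
>     }
>
>     /* Sort by LBA before issuing, so write order no longer depends
>      * on the order in which the reads happened to complete. */
>     static void submit_in_lba_order(struct wb_io *ios, size_t nr)
>     {
>         size_t i;
>
>         sort(ios, nr, sizeof(*ios), wb_io_cmp, NULL);
>         for (i = 0; i < nr; i++)
>             generic_make_request(ios[i].bio);
>     }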
>
> Mike
>
>>
>> --
>> Coly Li



