Hi Mike,

On 2017/10/7 1:36 AM, Michael Lyle wrote:
> It's 240GB (half of a Samsung 850) in front of a 1TB 5400RPM disk.
>
Copied.

> The size isn't critical. 1/8 is chosen to exceed 10% (default
> writeback dirty data thresh), it might need to be 1/6 on really big
> environments. It needs to be big enough that it takes more than 100
> seconds to write back, but less than 40% of the cache[1].
>
OK, I will do a similar thing.

> I am also running this on "mountain", which is my virtualization
> server. It has 2x Samsung 950 PRO 512GB NVMe disks (MD raid1) in
> front of 2x Seagate Ironwolf 6TB (MD raid1) drives. I can't run
> benchmarks on it, but it runs nightly builds and is otherwise lightly
> loaded in the middle of the night. It does writeback for about 75
> minutes on average after a set of nightly builds of my repositories
> (last 5 days 75, 79, 77, 72, 76); without this patch set it was more
> like 80 (77, 80, 74, 85, 84, 76, 82, 84). It has 128GB of RAM and
> doesn't do

Is that the time from when writeback starts until the dirty data reaches
the dirty target, or until the dirty data reaches 0?

> many reads-- most of its hottest working set hits the page cache--
> bcache serves to accelerate the random writes its workloads generate.
> This is the workload that I sought to improve with the writeback
> changes.
>
> When you ran the tests, did you compare 1-5 to 1-3? Because

Yes, we did the same thing. I only add/remove the last 2 patches about
bio reordering; the first 3 patches are always applied in my testing
kernels.

> otherwise, another uncontrolled factor is the writeback tuning is
> drastically different with patch 2. (On most systems patch 2 makes
> writeback less aggressive overall by default).

I observed this behavior :-) I always use the new PI controller in my
testing, and it works fine.
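(Editorial aside for readers following the thread: the "PI controller" mentioned above is a proportional-integral feedback loop that adjusts the writeback rate based on how far the current amount of dirty data sits from the dirty target. The sketch below is illustrative only; the constants, names, and units are hypothetical, not taken from the bcache source.)

```python
# Hypothetical sketch of a PI writeback-rate controller.
# All names and constants here are illustrative, not bcache's actual code.

def make_pi_controller(target, kp=0.5, ki=0.1, min_rate=1, max_rate=10**6):
    """Return an update() closure that maps dirty-data amount -> writeback rate."""
    integral = 0

    def update(dirty):
        nonlocal integral
        error = dirty - target          # positive error: too much dirty data
        integral += error               # accumulated (integral) error term
        rate = kp * error + ki * integral
        # Clamp to a sane range (e.g. sectors per second).
        return max(min_rate, min(max_rate, int(rate)))

    return update

update = make_pi_controller(target=1000)
# While dirty data stays above the target, the integral term keeps
# ramping the rate up on each call; once dirty data drops below the
# target, the rate winds back down toward the minimum.
```

The integral term is what distinguishes this from the older proportional-only behavior: a persistent backlog keeps increasing the rate until the backlog actually shrinks, rather than settling at a fixed offset.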
> [1]: If you don't think this is fair--- this would match how I've been
> suggesting enterprise users set up bcache for various workloads-- a
> writeback cache isn't going to be very effective at lowering write
> workload if the size of the most active write working set doesn't fit
> nicely in the cache. So e.g. if you deploy under a SQL database, the
> cache should be double the size of the indices to get a real write
> aggregation benefit; if you deploy under an environment doing builds,
> the cache should be double the size of the amount written during
> builds; if you deploy under an environment doing rapid VM provisioning,
> the cache should be double the size of the images blitted per
> provisioning cycle, etc.
>
If I use a 1.8TB hard disk as the cached device and a 1TB SSD as the
cache device, and set fio to write 500GB of dirty data in total, is
this configuration close to the working set and cache size you suggest?

[snip]

Thanks.

--
Coly Li
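(Editorial aside: the sizing rule of thumb quoted above, a cache roughly double the active write working set, is easy to check numerically against Coly's proposed configuration. The helper names and figures below are illustrative, not from any bcache tooling.)

```python
# Illustrative check of the "cache ~= 2x active write working set"
# rule of thumb from the quoted text. Names and numbers are hypothetical.

def recommended_cache_gb(write_working_set_gb):
    # Double the most active write working set, per the suggestion above.
    return 2 * write_working_set_gb

def working_set_fits(cache_gb, write_working_set_gb):
    return cache_gb >= recommended_cache_gb(write_working_set_gb)

# Coly's proposed test: 1TB (~1000GB) cache in front of a 1.8TB disk,
# with fio writing 500GB of dirty data in total.
print(working_set_fits(1000, 500))
# The 1TB cache is exactly 2x the 500GB working set, so by this rule
# of thumb the configuration is at the edge of the suggested sizing.
```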