On Wed 27-04-16 14:59:15, Jens Axboe wrote: > On Wed, Apr 27 2016, Jens Axboe wrote: > > On Wed, Apr 27 2016, Jens Axboe wrote: > > > On 04/27/2016 12:01 PM, Jan Kara wrote: > > > >Hi, > > > > > > > >On Tue 26-04-16 09:55:23, Jens Axboe wrote: > > > >>Since the dawn of time, our background buffered writeback has sucked. > > > >>When we do background buffered writeback, it should have little impact > > > >>on foreground activity. That's the definition of background activity... > > > >>But for as long as I can remember, heavy buffered writers have not > > > >>behaved like that. For instance, if I do something like this: > > > >> > > > >>$ dd if=/dev/zero of=foo bs=1M count=10k > > > >> > > > >>on my laptop, and then try and start chrome, it basically won't start > > > >>before the buffered writeback is done. Or, for server oriented > > > >>workloads, where installation of a big RPM (or similar) adversely > > > >>impacts database reads or sync writes. When that happens, I get people > > > >>yelling at me. > > > >> > > > >>I have posted plenty of results previously, I'll keep it shorter > > > >>this time. Here's a run on my laptop, using read-to-pipe-async for > > > >>reading a 5g file, and rewriting it. You can find this test program > > > >>in the fio git repo. > > > > > > > >I have tested your patchset on my test system. Generally I have observed > > > >noticeable drop in average throughput for heavy background writes without > > > >any other disk activity and also somewhat increased variance in the > > > >runtimes. It is most visible on this simple testcases: > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 > > > > > > > >and > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 conv=fsync > > > > > > > >The machine has 4GB of ram, /mnt is an ext3 filesystem that is freshly > > > >created before each dd run on a dedicated disk. > > > > > > > >Without your patches I get pretty stable dd runtimes for both cases: > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 > > > >Runtimes: 87.9611 87.3279 87.2554 > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 conv=fsync > > > >Runtimes: 93.3502 93.2086 93.541 > > > > > > > >With your patches the numbers look like: > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 > > > >Runtimes: 108.183, 97.184, 99.9587 > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 conv=fsync > > > >Runtimes: 104.9, 102.775, 102.892 > > > > > > > >I have checked whether the variance is due to some interaction with CFQ > > > >which is used for the disk. When I switched the disk to deadline, I still > > > >get some variance although, the throughput is still ~10% lower: > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 > > > >Runtimes: 100.417 100.643 100.866 > > > > > > > >dd if=/dev/zero of=/mnt/file bs=1M count=10000 conv=fsync > > > >Runtimes: 104.208 106.341 105.483 > > > > > > > >The disk is rotational SATA drive with writeback cache, queue depth of the > > > >disk reported in /sys/block/sdb/device/queue_depth is 1. > > > > > > > >So I think we still need some tweaking on the low end of the storage > > > >spectrum so that we don't lose 10% of throughput for simple cases like > > > >this. > > > > > > Thanks for testing, Jan! I haven't tried old QD=1 SATA. I wonder if > > > you are seeing smaller requests, and that is why it both varies and > > > you get lower throughput? I'll try and setup a test here similar to > > > yours. > > > > Jan, care to try the below patch? I can't fully reproduce your issue on > > a SCSI disk limited to QD=1, but I have a feeling this might help. It's > > a bit of a hack, but the general idea is to allow one more request to > > build up for QD=1 devices. That eliminates wait time between one request > > finishing, and the next being submitted. > > That accidentally added a potentially stall, this one is both cleaner > and should have that fixed. > .. > - rwb->wb_max = 1 + ((depth - 1) >> min(31U, rwb->scale_step)); > - rwb->wb_normal = (rwb->wb_max + 1) / 2; > - rwb->wb_background = (rwb->wb_max + 3) / 4; > + if (rwb->queue_depth == 1) { > + rwb->wb_max = rwb->wb_normal = 2; > + rwb->wb_background = 1; This breaks the detection of too big scale_step in scale_up() where we key of wb_max == 1 value. However even with that fixed no luck :(: dd if=/dev/zero of=/mnt/file bs=1M count=10000 conv=fsync Runtime: 105.126 107.125 105.641 So about the same as before. I'll try to debug this later today... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html