Re: [PATCH] block: BFQ default for single queue devices

Paolo Valente <paolo.valente@xxxxxxxxxx> · Fri, 5 Oct 2018 11:28:07 +0200

> On 5 Oct 2018, at 00:42, Bart Van Assche <bvanassche@xxxxxxx> wrote:
> 
> On Thu, 2018-10-04 at 22:39 +0200, Paolo Valente wrote:
>> No, kernel build is, for evident reasons, one of the workloads I cared
>> most about.  Actually, I tried to focus on all my main
>> kernel-development tasks, such as also git checkout, git merge, git
>> grep, ...
>> 
>> According to my test results, with BFQ these tasks are at least as
>> fast as, or, in most system configurations, much faster than with the
>> other schedulers.  Of course, at the same time the system also remains
>> responsive with BFQ.
>> 
>> You can repeat these tests using one of my first scripts in the S
>> suite: kern_dev_tasks_vs_rw.sh (usually, the older the tests, the more
>> hypertrophied the names I gave :) ).
>> 
>> I stopped sharing also my kernel-build results years ago, because I
>> went on obtaining the same, identical good results for years, and I'm
>> aware that I tend to show and say too much stuff.
> 
> On my test setup building the kernel is slightly slower when using the BFQ
> scheduler compared to using scheduler "none" (kernel 4.18.12, NVMe SSD,
> single CPU with 6 cores, hyperthreading disabled). I am aware that the
> proposal at the start of this thread was to make BFQ the default for devices
> with a single hardware queue and not for devices like NVMe SSDs that support
> multiple hardware queues.
> 

I don't see your point: as you yourself note, the proposal is limited
to single-queue devices, precisely because BFQ is not yet ready for
multi-queue devices.

> What I think is missing is measurement results for BFQ on a system with
> multiple CPU sockets and against a fast storage medium.

It is not missing.  As I have reported in previous threads, we wrote
a script to measure that too [1], using fio and null_blk.

I have reported the results we obtained, for three classes of
processors, in the in-kernel BFQ documentation [2].

In particular, BFQ reached 400 KIOPS with the fastest CPU mentioned in
that document (Intel i7-4850HQ).

So, since the speed of that single-socket commodity CPU is most likely
lower than the total speed of a multi-socket system, on such a system
BFQ should conservatively be fine with single-queue devices in the
300-500 KIOPS range.
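
To put these rates in perspective, here is a back-of-the-envelope
sketch (not part of the measurements above) that converts a rate in
KIOPS into the average time available per request.  The function name
is hypothetical, and the resulting figure covers the whole I/O path of
the test, not only the scheduler.

```python
def per_request_budget_us(kiops: float) -> float:
    """Average time per request, in microseconds, at a rate given in KIOPS."""
    if kiops <= 0:
        raise ValueError("kiops must be positive")
    # 1 s = 1e6 us and 1 KIOPS = 1e3 requests/s, so the ratio is 1e3 / kiops.
    return 1_000.0 / kiops


for rate in (300, 400, 500):
    print(f"{rate} KIOPS -> {per_request_budget_us(rate):.2f} us per request")
```

At 400 KIOPS this gives 2.5 us per request on average, which is the
total budget for everything done per I/O in that test, scheduler
included.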

[1] https://github.com/Algodev-github/IOSpeed
[2] https://www.kernel.org/doc/Documentation/block/bfq-iosched.txt

> Eliminating
> the host lock from the SCSI core yielded a significant performance
> improvement for such storage devices. Since the BFQ scheduler locks and
> unlocks bfqd->lock for every dispatch operation it is very likely that BFQ
> will slow down I/O for fast storage devices, even if their driver only
> creates a single hardware queue.
> 

One of the main motivations behind NVMe, and blk-mq itself, is that it
is hard to reach the above IOPS, let alone more, with a single I/O
queue as the bottleneck.

So, I wouldn't expect that systems
- equipped with single-queue drives reaching more than 500 KIOPS
- using SATA or some other non-NVMe protocol
- fast enough to push these drives to their maximum speed
constitute more than a negligible percentage of systems.

So, by sticking to mq-deadline, we would sacrifice 99% of systems just
to make sure that those very few systems on steroids reach maximum
throughput with random I/O (while still suffering from responsiveness
problems).  I think it makes much more sense to have as default what
is best for 99% of the single-queue systems, and to let the users of
those super systems reconfigure them properly.  After all, other
defaults have to be changed too to get the most out of those systems.
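
As for what "reconfigured by their users" amounts to, here is a
minimal sketch, assuming the usual sysfs interface where
/sys/block/<dev>/queue/scheduler lists the available schedulers with
the active one in square brackets.  The helper names are hypothetical,
and writing to the file requires root.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def parse_schedulers(text: str) -> Tuple[Optional[str], List[str]]:
    """Parse the content of a queue/scheduler file into (active, available)."""
    active = None
    available = []
    for token in text.split():
        # The active scheduler is the one enclosed in square brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def set_scheduler(device: str, name: str) -> None:
    """Select scheduler `name` for block device `device` (e.g. "sda")."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    _, available = parse_schedulers(path.read_text())
    if name not in available:
        raise ValueError(f"{name!r} not available for {device}: {available}")
    path.write_text(name)


# Parsing a sample line only; no device is touched here.
print(parse_schedulers("mq-deadline kyber [bfq] none"))
# -> ('bfq', ['mq-deadline', 'kyber', 'bfq', 'none'])
```

With something like this, the owner of one of those very fast systems
could call set_scheduler("sda", "mq-deadline") (device name chosen for
illustration) regardless of the default.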

Thanks,
Paolo

> Bart.