Re: high overhead of functions blkg_*stats_* in bfq

> On 6 Nov 2017, at 11:48, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> 
> On 6 November 2017 at 10:49, Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
>> 
>>> On 6 Nov 2017, at 10:22, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>>> 
>>> On 6 November 2017 at 03:21, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>> On 11/05/2017 01:39 AM, Paolo Valente wrote:
>>>>> 
>>>>>> On 18 Oct 2017, at 15:19, Tejun Heo <tj@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>> Hello, Paolo.
>>>>>> 
>>>>>> On Tue, Oct 17, 2017 at 12:11:01PM +0200, Paolo Valente wrote:
>>>>>> ...
>>>>>>> protected by a per-device scheduler lock.  To give you an idea, on an
>>>>>>> Intel i7-4850HQ, and with 8 threads doing random I/O in parallel on
>>>>>>> null_blk (configured with 0 latency), if the update of groups stats is
>>>>>>> removed, then the throughput grows from 260 to 404 KIOPS.  This and
>>>>>>> all the other results we might share in this thread can be reproduced
>>>>>>> very easily with a (useful) script made by Luca Miccio [1].
>>>>>> 
>>>>>> I don't think the old request_queue is ever built for multiple CPUs
>>>>>> hitting on a mem-backed device.
>>>>>> 
>>>>> 
>>>>> Hi,
>>>>> from our measurements, the code and the comments received so far in
>>>>> this thread, I guess that reducing the execution time of blkg_*stats_*
>>>>> functions is not an easy task, and is unlikely to be accomplished in
>>>>> the short term.  In this respect, we have unfortunately found out that
>>>>> executing these functions causes a very high reduction of the
>>>>> sustainable throughput on some CPUs.  For example, -70% on an ARM
>>>>> Cortex™-A53 Octa-core.
>>>>> 
>>>>> Thus, to deal with such a considerable slowdown, until the overhead of
>>>>> these functions gets reduced, it may make more sense to switch the
>>>>> update of these statistics off, in all cases where these statistics
>>>>> are not used, while higher performance (or lower power consumption) is
>>>>> welcome/needed.
>>>>> 
>>>>> We wondered, however, how hazardous it might be to switch the update
>>>>> of these statistics off.  To answer this question, we investigated the
>>>>> extent to which these statistics are used by applications and
>>>>> services.  Mainly, we tried to survey relevant people or
>>>>> forums/mailing lists for involved communities: Linux distributions,
>>>>> systemd, containers and other minor communities.  Nobody reported any
>>>>> application or service using these statistics (either the variant
>>>>> updated by bfq, or that updated by cfq).
>>>>> 
>>>>> So, one of the patches we are working on gives the user the
>>>>> possibility to disable the update of these statistics online.
>>>> 
>>>> If you want help with this, provide an easy way to reproduce this,
>>>> and/or some decent profiling output. There was one flamegraph posted,
>>>> but that was basically useless. Just do:
>>>> 
>>>> perf record -g -- whatever test
>>>> perf report -g --no-children
>>>> 
>>>> and post the top 10 entries from the perf report.
>>>> 
>>>> It's pointless to give up on this so soon, when no effort has apparently
>>>> been dedicated to figuring out what the actual issue is yet. So no, no
>>>> patch that will just disable the stats is going to be accepted.
>>>> 
>>>> That said, I have no idea who uses these stats. Surely someone can
>>>> answer that question. Tejun?
>>> 
>>> Jens, Tejun, apologize for side-tracking the discussion.
>>> 
>>> It sounds to me that these stats should have been put into debugfs,
>>> rather than sysfs from the beginning.
>>> 
>> 
>> Ulf,
>> let me just add a bit of info, if useful: four of those stat files are
>> explicitly meant for debugging (as per the documentation), and created
>> if CONFIG_DEBUG_BLK_CGROUP=y.
>> 
>> Paolo
> 
> Right, so it's a mixture of debugfs/sysfs then.
> 
> In the BFQ case, it seems like CONFIG_DEBUG_BLK_CGROUP isn't checked.
> I assume that should be changed,

Yes, it's in the patch series that we would like to propose.
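Concretely, guarding the updates would look roughly like the following. This is a hypothetical userspace sketch, not the actual patch: the struct and function names are illustrative, and the real series would put the guard around the existing bfqg stats helpers in the kernel.

```c
#include <assert.h>

/* Illustrative stand-in for bfq's per-group stats; not the real layout. */
struct bfqg_stats {
	unsigned long serviced;       /* requests dispatched */
	unsigned long service_bytes;  /* bytes dispatched */
};

static inline void bfqg_stats_update_dispatch(struct bfqg_stats *stats,
					      unsigned long bytes)
{
#ifdef CONFIG_DEBUG_BLK_CGROUP
	/* Debug-only accounting: this is the per-request work that the
	 * measurements in this thread show costing up to 70% of
	 * sustainable throughput on some CPUs. */
	stats->serviced++;
	stats->service_bytes += bytes;
#else
	/* Stats compiled out: no per-request accounting overhead at all. */
	(void)stats;
	(void)bytes;
#endif
}
```

With CONFIG_DEBUG_BLK_CGROUP unset, the compiler drops the accounting entirely, which is what removes the overhead Ulf mentions above.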

> which would remove at least some of
> the computation overhead when this Kconfig is unset.
> 

Yes. My concern is the following.

If, as our one-month survey seems to confirm, no code uses these bfq
stats, in particular the DEBUG_BLK_CGROUP ones, then it seems unfair
to make a bfq user suffer a loss of up to 70% of sustainable
throughput just because the 'wrong' options are set for his/her
system, possibly for reasons the user is not even aware of (for
example, because the DEBUG_BLK_CGROUP stats happen, or happened, to be
used for cfq on that system, and in legacy blk one cannot even choose
between switching the update of those stats on or off).
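The online switch mentioned earlier in the thread could, in principle, look like the sketch below. Again this is a hypothetical userspace approximation, not the actual patch: in the kernel one would more plausibly use a static key (jump label), so that the disabled path costs only a patched-out branch rather than a runtime test; all names here are illustrative.

```c
#include <stdbool.h>

/* Illustrative stand-in for bfq's per-group stats; not the real layout. */
struct bfqg_stats {
	unsigned long serviced;
};

/* Online switch: in the kernel this would be a static key toggled
 * via a sysfs attribute; a plain flag stands in for it here. */
static bool bfq_stats_enabled = true;

static inline void bfq_stats_set_enabled(bool on)
{
	bfq_stats_enabled = on;
}

static inline void bfqg_stats_update(struct bfqg_stats *stats)
{
	if (!bfq_stats_enabled)
		return;  /* skip the costly accounting entirely */
	stats->serviced++;
}
```

Unlike the compile-time guard, this lets a user who does need the stats switch them back on without rebuilding the kernel.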

Of course, this is just a point of view, and holds only until those
update functions possibly get optimized.

Thanks,
Paolo

> Perhaps one may even consider moving all stats for BFQ within that
> Kconfig (and for other mq-schedulers, if those ever intend to
> implement support for the stats).
> 
> Kind regards
> Uffe




