Re: [PATCH RFC 10/22] block, bfq: add full hierarchical scheduling and cgroups support

On 25 Apr 2016, at 22:30, Paolo <paolo.valente@xxxxxxxxxx> wrote:

> On 25/04/2016 21:24, Tejun Heo wrote:
>> Hello, Paolo.
>> 
> 
> Hi
> 
>> On Sat, Apr 23, 2016 at 09:07:47AM +0200, Paolo Valente wrote:
>>> There is certainly something I don’t know here, because I don’t
>>> understand why there is also a workqueue containing root-group I/O
>>> all the time, if the only process doing I/O belongs to a different
>>> (sub)group.
>> 
>> Hmmm... maybe metadata updates?
>> 
> 
> That's what I thought at first. But one third to one half of the
> IOs seemed like too much for metadata (the percentage varies over
> time during the test). And the root-group IOs are apparently large.
> Here is an excerpt from the output of
> 
> grep -B 1 insert_request trace
> 
>    kworker/u8:4-116   [002] d...   124.349971:   8,0    I   W 3903488 + 1024 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.349978:   8,0    m   N cfq409A  / insert_request
> --
>    kworker/u8:4-116   [002] d...   124.350770:   8,0    I   W 3904512 + 1200 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.350780:   8,0    m   N cfq96A /seq_write insert_request
> --
>    kworker/u8:4-116   [002] d...   124.363911:   8,0    I   W 3905712 + 1888 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.363916:   8,0    m   N cfq409A  / insert_request
> --
>    kworker/u8:4-116   [002] d...   124.364467:   8,0    I   W 3907600 + 352 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.364474:   8,0    m   N cfq96A /seq_write insert_request
> --
>    kworker/u8:4-116   [002] d...   124.369435:   8,0    I   W 3907952 + 1680 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.369439:   8,0    m   N cfq96A /seq_write insert_request
> --
>    kworker/u8:4-116   [002] d...   124.369441:   8,0    I   W 3909632 + 560 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.369442:   8,0    m   N cfq96A /seq_write insert_request
> --
>    kworker/u8:4-116   [002] d...   124.373299:   8,0    I   W 3910192 + 1760 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.373301:   8,0    m   N cfq409A  / insert_request
> --
>    kworker/u8:4-116   [002] d...   124.373519:   8,0    I   W 3911952 + 480 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.373522:   8,0    m   N cfq96A /seq_write insert_request
> --
>    kworker/u8:4-116   [002] d...   124.381936:   8,0    I   W 3912432 + 1728 [kworker/u8:4]
>    kworker/u8:4-116   [002] d...   124.381937:   8,0    m   N cfq409A / insert_request
> 
> 
>>> Anyway, if this is expected, then there is no reason to bother you
>>> further on it. In contrast, the actual problem I see is the
>>> following. If one third or half of the bios belong to a different
>>> group than the writer that one wants to isolate, then, whatever
>>> weight is assigned to the writer group, we will never be able to let
>>> the writer get the desired share of the time (or of the bandwidth
>>> with bfq and all quasi-sequential workloads). For instance, in the
>>> scenario that you told me to try, the writer will never get 50% of
>>> the time, with any scheduler. Am I missing something also on this?
>> 
>> While a worker may jump across different cgroups, the IOs are still
>> coming from somewhere and if the only IO generator on the machine is
>> the test dd, the bios from that cgroup should dominate the IOs.  I
>> think it'd be helpful to investigate who's issuing the root cgroup
>> IOs.
>> 
> 

I can now confirm that, because of a little bug, a fraction ranging
from one third to half of the writeback bios for the writer is wrongly
associated with the root group. I'm sending a bugfix.
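
The fraction can be estimated from the grep output quoted above without instrumenting the kernel. Below is a minimal sketch in Python, assuming the trace keeps the layout of that excerpt (an "I W <sector> + <nsectors>" line followed by the "cfq... <group> insert_request" message line). The function name and the regular expressions are hypothetical and not part of any blktrace tooling.

```python
import re
from collections import defaultdict

# "I W <start sector> + <number of sectors>" on the request-insert line.
ISSUE_RE = re.compile(r"\sI\s+W\s+(\d+)\s+\+\s+(\d+)\s")
# "m N cfq<id> <group path> insert_request" on the scheduler message line.
GROUP_RE = re.compile(r"\sm\s+N\s+cfq\S+\s+(\S+)\s+insert_request")


def sectors_per_group(lines):
    """Sum the sectors of each inserted write under the group named on
    the message line that follows it. Unmatched lines (such as the "--"
    separators printed by grep) are ignored."""
    totals = defaultdict(int)
    pending = None  # sector count of the last insert not yet attributed
    for line in lines:
        issue = ISSUE_RE.search(line)
        if issue:
            pending = int(issue.group(2))
            continue
        group = GROUP_RE.search(line)
        if group and pending is not None:
            totals[group.group(1)] += pending
            pending = None
    return dict(totals)


def root_share(totals):
    """Fraction of all counted sectors attributed to the root group "/"."""
    total = sum(totals.values())
    return totals.get("/", 0) / total if total else 0.0
```

Feeding it the lines of the excerpt (with or without the "> " quoting prefix) gives per-group sector totals, and root_share() turns them into the fraction of write volume charged to the root group.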

I'm retesting BFQ with this block-layer fix applied. If I understand
correctly, you now agree that BFQ is well suited for cgroups too, at
least in principle. So I will apply all your suggestions and
corrections, and submit a fresh patchset.

Thanks,
Paolo

> Ok (if there is some quick way to get this information without
> instrumenting the code, then any suggestion or pointer is welcome).
> 
> Thanks,
> Paolo
> 
>> Thanks.
