Re: [GIT PULL] Block fixes for 4.17-rc2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> Il giorno 27 apr 2018, alle ore 16:29, Jens Axboe <axboe@xxxxxxxxx> ha scritto:
> 
> On 4/27/18 1:57 AM, Paolo Valente wrote:
>> Should Jens' commit for bfq be still blocked by my concern, this is
> 
> The patch is in.
> 
>> just to inform you that I think I found a way to improve bfq, so that
>> it can handle also requests that are prepared, but then disappear
>> without any communication to bfq.  In particular, my change would
> 
> This is going about it in totally the wrong way. If things magically
> disappear "without any communication to bfq", then it's either a bug in
> the scheduling framework OR a bug in bfq. Those are the only two
> options. Request don't magically appear or disappear, they follow a
> specific set of rules.
> 

I meant "disappear for bfq", sorry if this created confusion.  To
clear further confusion, let me restate the consequences of what you
told me that happens, reusing your own words.

(1) "For the flush case, you start out with a regular request, and
assign the pointers." -> bfq assigns the pointers, for that regular
request, because bfq associates that request with an internal queue
(as it has to do).  In particular bfq increments reference counters
and other counters in that queue (as it has to do).

(2) "Then when we transform that into a flush, it's bypass inserted,"
-> The same request is then transformed into a request with no icq,
i.e., a request not associated with any queue. bfq has no way to know that this
bypass-inserted request is a transformed version of the same request
prepared one moment before, because no bfq hook is invoked in the
middle ( "...  the request has an end_io hook, we go through there and
don't finish the request").

So, the incrementing of counters done for of event (1) will *never* be
balanced, pushing bfq into an inconsistent state.  Unless, of course,
I misinterpreted your sentences above.


>> include Jens's change, so I will simply base my change on Jens' one,
>> and I remove my disagreement.  Of course, Jens' change alone would
>> just move the system from an immediate crash, to an unpredictable
>> behavior, which may include later crashes.
> 
> Can we please stop with this nonsense. A (->priv[0]) and B (->priv[1])
> are valid, IFF C (->elv.icq) is valid. The patch ensures we clear A and
> B, if C isn't valid. There's nothing unpredictable about the patch.

You got me wrong.  The sense is as follows: because of the above,
the transformation of the request leads bfq into an inconsistent
state, with which your patch has nothing to do.  Your patch opens the
door to the consequences of this inconsistency only because your patch
eliminates an immediate crash, on request dispatch, occurring *before*
those consequences could have effects.  Your patch *is not* the cause
of this inconsistency.

That said, once a complex system is inconsistent, its behavior may be
hard to predict.  For example, one of the problems we fixed some
months ago was the missing of the requeue hook in bfq, which caused a
very similar unbalance of counters, and ultimately a system hang.

I can even paste involved code snippets, if needed, but I guess it
won't be necessary.

Anyway, I'm already testing a patch that should solve the problem I
described again in this email.  I'm basing that patch on yours, if ok
for you.

Thanks,
Paolo

> 
> -- 
> Jens Axboe
> 





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux