Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

Pavel Tikhomirov <ptikhomirov@xxxxxxxxxxxxx> · Wed, 6 Dec 2017 01:49:49 +0300

Hello, Bart!

On 12/06/2017 12:46 AM, Bart Van Assche wrote:
On Wed, 2017-12-06 at 00:17 +0300, ptikhomirov wrote:
I mean threads in scsi_dec_host_busy() the part under rcu_read_lock are
divided into two groups: a) finished before call_rcu, b) beginning rcu
section after call_rcu. So first, in scsi_eh_inc_host_failed() we will
see all changes to host busy from group (a), second, all threads in group
(b) will see our change to host_failed. Either there is nobody in (b) and
we will start EH, or the thread from (b) which entered spin_lock last will
start EH.

In your case tasks from b does not see host_failed was incremented, and
will not start EH.

Hello Pavel,

What does "your case" mean? In my previous e-mail I explained a scenario that
cannot happen so it's not clear to me what "your case" refers to?

By "your case" I meant how your code works, especially that host_failed 
increment is inside scsi_eh_inc_host_failed() in your patch.

Additionally, it seems like you are assuming that RCU guarantees ordering of
RCU read-locked sections against call_rcu()?

Sorry.. Not "call_rcu" itself, In my previous reply I meant the delayed 
callback which actually triggers. (s/call_rcu/call_rcu's callback start/)

That's not how RCU works. RCU
guarantees serialization of read-locked sections against grace periods. The
function scsi_eh_inc_host_failed() is invoked through call_rcu() and hence
will be called during a grace period.

May be I missunderstand something, but I think that callback from 
call_rcu is guaranteed to _start_ after a grace period, but not to end 
before a next grace period. Other threads can enter rcu_read_lock 
protected critical regions while we are still in a callback and will run 
concurrently with callback.

Anyway, the different scenarios I see are as follows:
(a) scsi_dec_host_busy() finishes before scsi_eh_inc_host_failed() starts.
(b) scsi_dec_host_busy() starts after scsi_eh_inc_host_failed() has
     finished.

So I think in (b) scsi_dec_host_busy starts after 
scsi_eh_inc_host_failed has _started_.

In case (a) scsi_eh_inc_host_failed() will wake up the error handler. And in
case (b) scsi_dec_host_busy() will wake up the error handler. So it's not
clear to me why you think that there is a scenario in which the EH won't be
woken up?

So in case (b), in my understanding, scsi_dec_host_busy can skip wakeups 
as it does not see host_failed change yet.

Bart.

To prove my point, some parts of kernel docs:

"This function invokes func(head) after a grace period has elapsed." 
Documentation/RCU/whatisRCU.txt

"
15.     The whole point of call_rcu(), synchronize_rcu(), and friends
        is to wait until all pre-existing readers have finished before
        carrying out some otherwise-destructive operation.
...
        Because these primitives only wait for pre-existing readers, it
        is the caller's responsibility to guarantee that any subsequent
        readers will execute safely.
"
Documentation/RCU/checklist.txt

There is nothing about "callback ends before next grace period".

Sorry, if I'm missing something.

Pavel.