Hello, Bart!
On 12/06/2017 12:46 AM, Bart Van Assche wrote:
On Wed, 2017-12-06 at 00:17 +0300, ptikhomirov wrote:
I mean threads in scsi_dec_host_busy() the part under rcu_read_lock are
divided into two groups: a) finished before call_rcu, b) beginning rcu
section after call_rcu. So first, in scsi_eh_inc_host_failed() we will
see all changes to host busy from group (a), second, all threads in group
(b) will see our change to host_failed. Either there is nobody in (b) and
we will start EH, or the thread from (b) which entered spin_lock last will
start EH.
In your case tasks from b does not see host_failed was incremented, and
will not start EH.
Hello Pavel,
What does "your case" mean? In my previous e-mail I explained a scenario that
cannot happen so it's not clear to me what "your case" refers to?
By "your case" I meant how your code works, especially that host_failed
increment is inside scsi_eh_inc_host_failed() in your patch.
Additionally, it seems like you are assuming that RCU guarantees ordering of
RCU read-locked sections against call_rcu()?
Sorry.. Not "call_rcu" itself, In my previous reply I meant the delayed
callback which actually triggers. (s/call_rcu/call_rcu's callback start/)
That's not how RCU works. RCU
guarantees serialization of read-locked sections against grace periods. The
function scsi_eh_inc_host_failed() is invoked through call_rcu() and hence
will be called during a grace period.
May be I missunderstand something, but I think that callback from
call_rcu is guaranteed to _start_ after a grace period, but not to end
before a next grace period. Other threads can enter rcu_read_lock
protected critical regions while we are still in a callback and will run
concurrently with callback.
Anyway, the different scenarios I see are as follows:
(a) scsi_dec_host_busy() finishes before scsi_eh_inc_host_failed() starts.
(b) scsi_dec_host_busy() starts after scsi_eh_inc_host_failed() has
finished.
So I think in (b) scsi_dec_host_busy starts after
scsi_eh_inc_host_failed has _started_.
In case (a) scsi_eh_inc_host_failed() will wake up the error handler. And in
case (b) scsi_dec_host_busy() will wake up the error handler. So it's not
clear to me why you think that there is a scenario in which the EH won't be
woken up?
So in case (b), in my understanding, scsi_dec_host_busy can skip wakeups
as it does not see host_failed change yet.
Bart.
To prove my point, some parts of kernel docs:
"This function invokes func(head) after a grace period has elapsed."
Documentation/RCU/whatisRCU.txt
"
15. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
carrying out some otherwise-destructive operation.
...
Because these primitives only wait for pre-existing readers, it
is the caller's responsibility to guarantee that any subsequent
readers will execute safely.
"
Documentation/RCU/checklist.txt
There is nothing about "callback ends before next grace period".
Sorry, if I'm missing something.
Pavel.