Hello,

On Wed, Sep 07, 2011 at 01:09:10PM +0100, Bruce Stenning wrote:
> Sorry for sending so many emails yesterday; I blame the dental anaesthetic
> I received in the morning for being so jumpy on the send button ;-)

Oh the fun. :)

> I can certainly try this. Could you confirm whether my thoughts about a race
> between the scsi_eh thread and the wake-up are plausible? I backtracked
> yesterday because I thought the scsi_eh thread would get rescheduled naturally,
> not realising that when the task state is TASK_INTERRUPTIBLE schedule() takes
> the task off the run queue (so it needs to be explicitly woken.)
>
> Here is my thinking again:
>
> shost->host_eh_scheduled is read here in scsi_error_handler:
>
>     set_current_state(TASK_INTERRUPTIBLE);
>     while (!kthread_should_stop()) {
>         if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||
>
> There's no locking in scsi_error_handler (though functions it calls probably
> claim locks.)
>
> When scheduling an EH, scsi_schedule_eh takes the shost->host_lock, increments
> shost->host_eh_scheduled, and then wakes the EH thread. If this happens
> between the scsi_eh thread reading host_eh_scheduled and sending itself back
> to sleep (when the scsi_eh thread's state is TASK_INTERRUPTIBLE) nothing will
> wake up the thread again and host_eh_scheduled will not get inspected.
> host_eh_scheduled is stuck at 1 with the scsi_eh thread asleep, and it won't
> get woken again because the ata port has been frozen and irqs are masked off.

I don't think there's a race condition there. set_current_state()
implies a memory barrier and wake_up_process() implies wmb(). host_eh
either sees the incremented eh_scheduled count or TASK_RUNNING set by
wake_up_process(), so it can't miss an event.

Thanks.

--
tejun
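
The "can't miss an event" argument can be checked with a small user-space toy model. This is only a sketch, not kernel code: it assumes the barriers make the accesses behave as if sequentially consistent, treats wake_up_process() as nothing more than a store of TASK_RUNNING that also unblocks a sleeper, and the helpers lost_wakeup() and count_lost() are hypothetical names invented for illustration. It walks every interleaving of one waker against one pass of the sleeper, for the set-state-then-check order used in scsi_error_handler and, for contrast, a hypothetical check-then-set-state order.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names mirror the thread, this is not kernel code. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

/*
 * Sleeper (models scsi_error_handler), three steps:
 *   op 0: state = TASK_INTERRUPTIBLE       like set_current_state()
 *   op 1: seen = eh_scheduled              the condition check
 *   op 2: if seen == 0, call schedule(), which blocks only while
 *         state is still TASK_INTERRUPTIBLE
 * Waker (models scsi_schedule_eh), two steps in fixed order:
 *   W0: eh_scheduled++
 *   W1: state = TASK_RUNNING               like wake_up_process()
 *
 * p0 and p1 give the number of sleeper steps completed before W0 and W1
 * run (0..3, p0 <= p1). With state_first == false the sleeper checks the
 * condition before setting its state: the hypothetical broken order.
 * Returns true if the sleeper ends up blocked with an event pending.
 */
static bool lost_wakeup(bool state_first, int p0, int p1)
{
    enum task_state state = TASK_RUNNING;
    int eh_scheduled = 0;
    int seen = 0;
    bool asleep = false;

    assert(0 <= p0 && p0 <= p1 && p1 <= 3);

    for (int step = 0; step <= 3; step++) {
        if (p0 == step)
            eh_scheduled++;              /* W0 */
        if (p1 == step) {
            state = TASK_RUNNING;        /* W1 also unblocks a sleeper */
            asleep = false;
        }
        if (step == 3)
            break;                       /* sleeper has no more steps */

        int op = step;
        if (!state_first && step < 2)
            op = 1 - step;               /* swap op 0 and op 1 */

        if (op == 0)
            state = TASK_INTERRUPTIBLE;
        else if (op == 1)
            seen = eh_scheduled;
        else if (seen == 0 && state == TASK_INTERRUPTIBLE)
            asleep = true;               /* schedule() takes us off the run queue */
    }
    return asleep && eh_scheduled > 0;
}

/* Count the interleavings (out of 10) that lose the wake-up. */
static int count_lost(bool state_first)
{
    int lost = 0;

    for (int p0 = 0; p0 <= 3; p0++)
        for (int p1 = p0; p1 <= 3; p1++)
            if (lost_wakeup(state_first, p0, p1))
                lost++;
    return lost;
}
```

In this model count_lost(true) returns 0: the sleeper either reads the incremented count or finds its state already reset to TASK_RUNNING when it reaches schedule(). count_lost(false) returns 1, and the losing case is the one where both waker steps land between the condition check and the state change. The model says nothing about whether the real barriers are sufficient; it only shows why the ordering in the quoted loop matters.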