On Tue, 2005-04-19 at 07:31 +0900, Tejun Heo wrote: > The original code also uses timer pending status as a signal that > command completed normally in scsi_eh_done() function, and the same > race also exists in the original code, no matter what we do, unless we > make timer expiration and removal of the command atomic, there will be > a window in which command completes normally but considered to have > timed out as long as we use timer pending status as tie breaker. True enough; it's a race between the driver calling scsi_done() and the timer expiring. However, that's an acceptable race, since the timer values are usually in the order of a few seconds and the command usually completes in milliseconds. the done function is called in interrupt context after command completion, so it's as close as possible to the actual command completion > The patch moves the test out of scsi_eh_done() into > scsi_send_eh_cmnd() and this does widen the window by delaying removal > of timer until after the original thread gets scheduled, but usually > not by much and that's how timers are done in many cases (through out > the kernel, timer removals are done with intervening scheduling and no > one considers those incorrect). So... The time between a thread being marked ready to run and actually running has been measured in seconds on a heavily loaded system. That makes the race window potentially as wide as the timer, which is unacceptable. James - : send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html