Bart, Thanks for reviewing. On 01/31/2018 05:06 PM, Bart Van Assche wrote:
Sorry but I think this patch introduces new race conditions. Have you
Can you detail the race conditions? As far as I can see, the only race condition would be when an error handler is invoked very close in time to the shutdown/unload handler, and as such miss the 'ioc->remove_host' assignment (thus it continues running anyway, and hit the oops). But this problem has never been reported, so I'd think that case would be rare. Not that it does not need to be perfectly solved, but _if_ that is the only problem with the patch (and _if_ I got that right), then it would still be better than the current code in oops. > considered to modify the mpt3sas driver such that interrupts are only > disabled after all commands have finished? Yes, I considered introducing waits in several points, but since the driver is tearing down, the only point which it seem to make sense to insert is right at the beginning of shutdown/unload -- but it seemed not sufficient: imagine not all commands finish, which would block that path, so we need a time out mechanism, but that does not guarantee the call back won't occur later on (say, a while after we gave up), so we would still need a guard at the callbacks. I also considered checking for the particular pointer which is set to NULL and cause the Oops, but that is a series of 10ish (small) patches which must check 'ioc->scsi_lookup' in a spinlock, so it seemed a bit bigger than what was needed.. and would still be prone to hitting an oops due to another pointer at a later time. So after thinking about those, I went with the simplest approach. I'd be happy to consider the race conditions you thought of, provided more detail. thanks, Mauricio