On Tue, 2013-07-02 at 10:58 -0400, Jörn Engel wrote:
> On Tue, 2 July 2013 06:37:05 +0000, James Bottomley wrote:
> >
> > I don't understand what you're getting at.  In a dual HBA situation,
> > whether the second HBA is implicated or not depends on configuration and
> > what the first HBA is doing.  If it's just passively lost device state,
> > then the second HBA should continue just fine.  If the insane HBA is
>
> If the problem is an insane drive instead of an insane HBA, both HBAs
> will be in roughly the same state at roughly the same time - assuming
> they both send commands to the insane drive.  If they now go into
> error handling and effectively shut off all the sane drives at roughly
> the same time, the user is ****ed.

That's handled in device reset, so I don't understand your point.

James

> And we shouldn't require the user to buy better hardware.  The whole
> point of a redundant setup is that your plane doesn't crash to the
> ground when one of your two engines fails.  If regulations required
> perfect engines, you wouldn't be flying to conferences.  They require
> decent engines and enough redundancy that any one can fail at any
> moment.
>
> Computer systems are no different.  We can construct a robust system
> from individually less robust components.  Requiring perfect
> components would be ludicrous.  Having a system design where one
> faulty component will reliably bring the system down is equally
> ludicrous.  Sadly that is also the state of today's scsi stack.
>
> This is not a theoretical problem, btw.  We currently carry some
> patches to solve it for us.  They are not applicable for mainline in
> their current state - we support a lot less hardware diversity.  But
> trust me, we didn't create them on a whim. ;)