On 9/14/23 1:20 AM, Wenchao Hao wrote:
> On 2023/9/1 17:41, Wenchao Hao wrote:
>> It's unbearable for systems with large scale scsi devices share HBAs to
>> block all devices' IOs when handle error commands, we need a new error
>> handle mechanism to address this issue.
>>
>> I consulted about this issue a year ago, the discuss link can be found in
>> reference. Hannes replied about why we have to block the SCSI host
>> then perform error recovery kindly. I think it's unnecessary to block
>> SCSI host for all drivers and can try a small level recovery (LUN based for
>> example) first to avoid block the SCSI host.
>>
>> The new error handle mechanism introduced in this patchset has been
>> developed and tested with our self developed hardware since one year
>> ago, now we want this mechanism can be used by more drivers.
>>
>> Drivers can decide if using the new error handle mechanism and how to
>> handle error commands when scsi_device are scanned, the new mechanism
>> makes SCSI error handle more flexible.
>>
>> SCSI error recovery strategy after blocking host's IO is mainly
>> following steps:
>>
>> - LUN reset
>> - Target reset
>> - Bus reset
>> - Host reset
>>
>
> Mike gave some suggestions and I found a bug in fallback logic, I would
> address these and resend in next few days.

Please wait to resend. I'm still reviewing the patches. When I commented
last time I just did a quick look over to get an idea for the design and
what your goals were.