On 13-11-12 07:58 PM, Douglas Gilbert wrote:
After feedback on version 2 and a new report of a failure in the vicinity of sg_remove() [remove device] during a shutdown on a large machine, the locking has been revised again.
The shutdown problem in the vicinity of sg_remove() has been traced to the st driver and a patch to fix st has been sent to this list. So there are now no reported problems against this patch.

Doug Gilbert
ChangeLog v3:
  - change Sg_device::exclude and detached (renamed to detaching) to atomic_t
  - introduce atomic_t Sg_device::open_cnt and use for open(O_EXCL) logic. Hence stop using list_empty(sfds) which decouples the open/release logic from sg_remove_device() and other post-release cleanup functions
  - use a mutex to stop races between sg_open() and sg_release() on the same device
  - reduce the use of driver wide sg_index_lock so now it only protects sg_index_idr (the device array)
  - expand cleanups requested by checkpatch.pl to the remaining code in the driver

ChangeLog v2:
  - favour non O_EXCL open()s over open(dev, O_EXCL)s
  - wake all open(dev)s if dev is removed (detached)
  - wake all read(dev_fd)s that are waiting for a response if dev is removed (detached)
  - other cleanups requested by checkpatch.pl

ChangeLog v1:
  - introduce a finer grain (per device) lock to protect access and changes to the file descriptor objects
  - introduce a semaphore for mutual exclusion of co-incident open and release calls to the same device
  - improve the O_EXCL handling of sg_open() when multiple callers are waiting for an O_EXCL condition to clear
  - change some seq_printf()s to seq_puts()s as requested by checkpatch.pl
  - update copyright notice, version number and date

The patch is against lk 3.12.0 (and should work on lk 3.10 and lk 3.11 as the sg driver hasn't changed).

Testing is ongoing (see the v2 post) with focus on host removal and shutdown. The driver survives bombarding 4 LUs with queued requests spread across 6000 scsi_debug LUs. Some log noise is generated, but it is not from the sg driver:

    scsi 9:0:33:3: rejecting I/O to offline device
    scsi 9:0:33:3: [sg1000] killing request
    <multiple times>

This is not seen when there are only 600 LUs.

Signed-off-by: Douglas Gilbert <dgilbert@xxxxxxxxxxxx>
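
For readers who want a feel for the v3 open(O_EXCL) accounting, here is a rough user-space sketch. It is not code from the patch: struct sg_dev_sketch and the sg_sketch_*() functions are made-up names, C11 atomics stand in for atomic_t, and a pthread mutex stands in for the per-device open/release mutex. It also assumes the simplest case, where a conflicting open fails with -EBUSY rather than sleeping until the condition clears.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* O_EXCL */
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-device state named in ChangeLog v3. */
struct sg_dev_sketch {
    pthread_mutex_t open_rel_lock; /* serializes open and release */
    atomic_int open_cnt;           /* number of current opens */
    atomic_int exclude;            /* 1 while an O_EXCL open holds the device */
    atomic_int detaching;          /* 1 once device removal has started */
};

void sg_sketch_init(struct sg_dev_sketch *d)
{
    pthread_mutex_init(&d->open_rel_lock, NULL);
    atomic_init(&d->open_cnt, 0);
    atomic_init(&d->exclude, 0);
    atomic_init(&d->detaching, 0);
}

/* Returns 0 on success or a negative errno value. Never blocks. */
int sg_sketch_open(struct sg_dev_sketch *d, int flags)
{
    int ret = 0;

    pthread_mutex_lock(&d->open_rel_lock);
    if (atomic_load(&d->detaching)) {
        ret = -ENODEV;                  /* device is going away */
    } else if (flags & O_EXCL) {
        if (atomic_load(&d->open_cnt) > 0)
            ret = -EBUSY;               /* someone already has it open */
        else
            atomic_store(&d->exclude, 1);
    } else if (atomic_load(&d->exclude)) {
        ret = -EBUSY;                   /* held exclusively by another open */
    }
    if (ret == 0)
        atomic_fetch_add(&d->open_cnt, 1);
    pthread_mutex_unlock(&d->open_rel_lock);
    return ret;
}

void sg_sketch_release(struct sg_dev_sketch *d)
{
    pthread_mutex_lock(&d->open_rel_lock);
    int prev = atomic_fetch_sub(&d->open_cnt, 1);
    assert(prev > 0);                   /* release without a matching open */
    /* An exclusive holder is always the only opener, so clearing the flag
     * on every release is harmless for non-exclusive opens. */
    atomic_store(&d->exclude, 0);
    pthread_mutex_unlock(&d->open_rel_lock);
}
```

The point of the counter is that the O_EXCL decision no longer depends on walking or testing the list of file descriptor objects, so removal and cleanup code can touch that list without affecting open/release.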