Alan Stern [mailto:stern@xxxxxxxxxxxxxxxxxxx] writes:

> On Wed, 3 Mar 2010, Hugh Daschbach wrote:
>
>> Alan Stern [mailto:stern@xxxxxxxxxxxxxxxxxxx] writes:
>>
>> > On Wed, 3 Mar 2010, Hugh Daschbach wrote:
>> >
>> >> > Can't we just protect the list? What is wanting to write to the list
>> >> > while shutdown is happening?
>> >>
>> >> Indeed, Alan suggested holding the kset spinlock while iterating the
...
>> > What I meant was that you should hold the spinlock while finding and
>> > unlinking the last device on the list. Clearly you shouldn't hold it
>> > while calling the device shutdown routine.
>>
>> I misunderstood. But I believe insertion and deletion is properly
>> serialized. It looks to me like the list structure is intact. It's the
>> iterator that's been driven off into the weeds.
...
>> Just to be clear, the list we're talking about is "list" in "struct
>> kset". And the nodes of the list are chained by "entry" in "struct
>> kobject".
...
>> At a minimum the change looks something like the patch below.
...
> If you really want to do this then you should remove the lock member
> from struct kset. However this seems like an awful lot of work
> compared to my original suggestion -- something like this (untested,
> and you'll want to add comments):
...

I'm not sure I do want to pursue this. It does seem particularly
invasive at a fundamental level of a core data structure.

Apparently I still don't understand your original suggestion. I'd
prefer to, especially if it leads to a simpler fix.

The loop in device_shutdown() looks something like:

	struct device *dev, *devn;

	list_for_each_entry_safe_reverse(dev, devn, &devices_kset->list,
				kobj.entry) {
		if (dev->bus && dev->bus->shutdown) {
			dev->bus->shutdown(dev);
		} else if (dev->driver && dev->driver->shutdown) {
			dev->driver->shutdown(dev);
		}
	}

*dev gets delinked by kobj_kset_leave(), which is called indirectly
from dev->*->shutdown(dev). This is protected by the spinlock. The
secondary thread similarly calls kobj_kset_leave().
But when the secondary thread calls the shutdown routine for the device
that devn points to, the loop hangs.

Is there some way I can detect that devn no longer points to a valid
device upon return from dev->*->shutdown(dev)? Or, where else can I
look to better understand your suggestion?

Thanks,
Hugh
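
For illustration only, here is one way to read the quoted suggestion ("hold the spinlock while finding and unlinking the last device on the list", and not while calling the shutdown routine). It is a userspace model, not kernel code and not the elided patch: every demo_* name is hypothetical, a C11 atomic_flag stands in for the kset spinlock, and prev/next stand in for kobj.entry. Because the loop re-reads the list tail under the lock on every pass, there is no cached devn to go stale when a shutdown routine unlinks some other device.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins only; none of these names exist in the kernel. */
struct demo_kset;

struct demo_dev {
	struct demo_dev *prev, *next;	/* plays the role of kobj.entry */
	struct demo_kset *kset;		/* set to which the device belongs */
	struct demo_dev *peer;		/* device removed by our shutdown, if any */
	void (*shutdown)(struct demo_dev *dev);
	int shut_down;			/* set once shutdown() has run */
};

struct demo_kset {
	struct demo_dev head;		/* sentinel of a circular list */
	atomic_flag lock;		/* stand-in for the kset spinlock */
};

static void demo_lock(struct demo_kset *k)
{
	while (atomic_flag_test_and_set_explicit(&k->lock, memory_order_acquire))
		;			/* spin until the flag is released */
}

static void demo_unlock(struct demo_kset *k)
{
	atomic_flag_clear_explicit(&k->lock, memory_order_release);
}

static void demo_kset_init(struct demo_kset *k)
{
	k->head.prev = k->head.next = &k->head;
	atomic_flag_clear(&k->lock);
}

/* Caller must hold the lock. Unlinks d only if it is still on the list. */
static void demo_unlink_locked(struct demo_dev *d)
{
	if (d->next) {
		d->prev->next = d->next;
		d->next->prev = d->prev;
		d->prev = d->next = NULL;
	}
}

static void demo_add_tail(struct demo_kset *k, struct demo_dev *d)
{
	demo_lock(k);
	d->prev = k->head.prev;
	d->next = &k->head;
	k->head.prev->next = d;
	k->head.prev = d;
	demo_unlock(k);
}

/* Models the delinking step: unlink under the lock. */
static void demo_leave(struct demo_kset *k, struct demo_dev *d)
{
	demo_lock(k);
	demo_unlink_locked(d);
	demo_unlock(k);
}

/* Find and unlink the last device while holding the lock. */
static struct demo_dev *demo_pop_last(struct demo_kset *k)
{
	struct demo_dev *d = NULL;

	demo_lock(k);
	if (k->head.prev != &k->head) {
		d = k->head.prev;
		demo_unlink_locked(d);
	}
	demo_unlock(k);
	return d;
}

/* Shut devices down in reverse order; the lock is NOT held in the callback. */
static int demo_shutdown_all(struct demo_kset *k)
{
	struct demo_dev *d;
	int n = 0;

	while ((d = demo_pop_last(k)) != NULL) {
		if (d->shutdown)
			d->shutdown(d);
		n++;
	}
	return n;
}

/* Example callback: shutting this device down also removes its peer. */
static void demo_shutdown_drop_peer(struct demo_dev *dev)
{
	dev->shut_down = 1;
	if (dev->peer)
		demo_leave(dev->kset, dev->peer);
}
```

To try it, add three devices a, b, c with demo_add_tail(), set c.peer = &b, and call demo_shutdown_all(): c's callback unlinks b, the loop then finds a as the new tail, and it terminates with an empty list. The model ignores object lifetime; a kernel version would also have to keep the device from being freed between dropping the lock and returning from the shutdown call.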