Hey, Neil.

On Tue, Apr 05, 2011 at 01:46:29PM +1000, NeilBrown wrote:
> After mddev_find returns the new mddev, md_open calls flush_workqueue
> and as the work item to complete the delete has definitely been queued, it
> should wait for that work item to complete.
>
> So the next time around the retry loop in __blkdev_get the old gendisk will
> not be found....
>
> Where is my logic wrong??
>
> To put it another way matching your description Tejun, the put path has a
> chance to run firstly while mddev_find is waiting for the spinlock, and then
> while flush_workqueue is waiting for the rest of the put path to complete.

I don't think the logic is wrong per-se.  It's more likely that the
implemented code doesn't really follow the model described by the
logic.  Probably the best way would be reproducing the problem and
throwing in some diagnostic code to tell the sequence of events?  If
work is being queued first but it still ends up busy looping, that
would be a bug in flush_workqueue(), but I think it's more likely that
the restart condition somehow triggers in an unexpected way without
the work item queued as expected.

Thanks.

-- 
tejun
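
The two outcomes discussed above (delete work queued before the flush, versus
the restart condition triggering with no work queued) can be modelled with a
small userspace toy. This is only a sketch and not kernel code: toy_wq,
toy_queue_work, toy_flush, toy_disk, delete_disk and run_scenario are all
made-up names, and the loop is a single-threaded stand-in for the retry in
__blkdev_get, assuming a flush only waits for items that were already queued.
In the real driver the equivalent diagnostics would more likely be printk() or
trace_printk() calls at the queue, flush and restart points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_WORK    8
#define MAX_RETRIES 5

/* Hypothetical toy workqueue: a flush runs only what is already queued. */
struct toy_wq {
    void (*fn[MAX_WORK])(void *);
    void *arg[MAX_WORK];
    int count;
};

static void toy_queue_work(struct toy_wq *wq, void (*fn)(void *), void *arg)
{
    assert(wq->count < MAX_WORK);
    wq->fn[wq->count] = fn;
    wq->arg[wq->count] = arg;
    wq->count++;
}

static void toy_flush(struct toy_wq *wq)
{
    for (int i = 0; i < wq->count; i++)
        wq->fn[i](wq->arg[i]);
    wq->count = 0;
}

/* Hypothetical stand-in for the old gendisk that is being torn down. */
struct toy_disk {
    bool registered;
};

static void delete_disk(void *arg)
{
    struct toy_disk *d = arg;

    d->registered = false;
    printf("  [work] old disk unregistered\n");
}

/*
 * Model of the open/retry loop.  Returns the pass on which the stale disk
 * was no longer found, or -1 if the loop was still restarting after
 * MAX_RETRIES passes (the busy-loop case).
 */
int run_scenario(bool queue_before_flush)
{
    struct toy_wq wq = { .count = 0 };
    struct toy_disk old = { .registered = true };

    for (int pass = 1; pass <= MAX_RETRIES; pass++) {
        printf("pass %d: old disk %s\n", pass,
               old.registered ? "found" : "gone");
        if (!old.registered)
            return pass;        /* lookup misses the stale disk */

        /* Stale disk found: the put path may or may not have queued work. */
        if (queue_before_flush)
            toy_queue_work(&wq, delete_disk, &old);

        toy_flush(&wq);         /* no-op when nothing was queued */
    }
    return -1;
}
```

Calling run_scenario(true) ends on the second pass, which matches the model in
the quoted text. Calling run_scenario(false) never makes progress, because the
flush has nothing to wait for; the printed sequence shows the restart happening
without a "[work]" line in between, which is the pattern to look for in real
diagnostic output.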