On Wed, 2007-10-31 at 11:31 -0500, James Bottomley wrote: > On Wed, 2007-10-31 at 17:24 +0100, Kay Sievers wrote: > > On Wed, 2007-10-31 at 11:13 -0500, James Bottomley wrote: > > > On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote: > > > > On Wed, 31 Oct 2007, James Bottomley wrote: > > > > > > > > > > Yes, the queue is a child of the disk. > > > > > > > > > > Right, so this goes gendisk->queue (-> meaning parent of, or takes > > > > > reference to) > > > > > > > > No, no! The _child_ takes an implicit reference to the _parent_, not > > > > the other way around. > > > > > > > > > > > The scsi_device has a ref to the queue > > > > > > > > > > > > Yeah, while the queue is a grandchild of the scsi_device with the > > > > > > unified sysfs layout. > > > > > > > > > > No, the scsi_device is a direct parent of the queue, so we have > > > > > > > > > > scsi_device->queue > > > > > > > > Wrong -- the gendisk is the direct parent of the queue. The relevant > > > > line is in ll_rw_blk.c:blk_register_queue(): > > > > > > > > q->kobj.parent = kobject_get(&disk->dev.kobj); > > > > > > > > > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at > > > > > > least at one of the devices involved. > > > > > > > > > > But it's broken when the driver is unbound. Diagrammatically it's: > > > > > > > > > > scsi_disk -> scsi_device -> queue > > > > > -> gendisk -> > > > > > > > > > > It's not circular, it's released when scsi_disk is released. It can > > > > > become circular if there's some hidden dependency between any of the > > > > > components ... but I don't think there is. > > > > > > > > Forget about the scsi_disk. It isn't part of the problem. Just > > > > concentrate on the scsi_device, the gendisk, and the queue. We have: > > > > > > > > scsi_device <- gendisk <- queue <- scsi_device, > > > > > > OK, so where does the gendisk get a reference to the scsi device? > > > > In the unified sysfs layout where the silly and conceptual broken idea > > of "class devices" gets removed. > > Everything that has a "device" link today will just live below the > > device the "device" link points to. The whole current kernel is already > > converted to do this, besides the "raw kobject" gendisk's, and the SCSI > > subsystem. The gendisk patch is queued in Greg's tree (see subject of > > this mail), and the conversion from "struct class_device" to "struct > > device" for the whole SCSI directory is coming soon. > > > > With the gendisk pointing to "driverfs_dev" ("device" link) it will > > become a child of the scsi_device. > > OK, light beginning to go on now. > > The problem is that you've fallen into the conceptual trap we tried very > hard to avoid in the initial go around of joining SCSI upper layer > drivers to gendisks. That's why no gendisk references are held by the > mid-layer, only by the entities that represent the objects created by > upper layer drivers. That will not change, only the disk will reference the device which it points to. It's not a problem, we can "orphan" the disk on delete, or we do the "orphaning" for all devices in the core, which is probably the right fix anyway. > Doesn't this circularity now exist for everything? Every device that > creates a queue has a reference to the queue, every queue has a > reference to its attached gendisk and now every gendisk has a reference > to the device creating the queue? This doesn't look to be a SCSI > specific problem. It's only SCSI so far, everything else seems fine. But, the real problem is that the core seems to deadlock if two devices reference each other (or build a larger circle), even when they are deleted, that's the problem we are running in. Thanks, Kay - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html