Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2007-10-31 at 17:42 +0100, Kay Sievers wrote:
> On Wed, 2007-10-31 at 11:31 -0500, James Bottomley wrote:
> > On Wed, 2007-10-31 at 17:24 +0100, Kay Sievers wrote:
> > > On Wed, 2007-10-31 at 11:13 -0500, James Bottomley wrote:
> > > > On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote:
> > > > > On Wed, 31 Oct 2007, James Bottomley wrote:
> > > > > 
> > > > > > > Yes, the queue is a child of the disk.
> > > > > > 
> > > > > > Right, so this goes gendisk->queue (-> meaning parent of, or takes
> > > > > > reference to)
> > > > > 
> > > > > No, no!  The _child_ takes an implicit reference to the _parent_, not 
> > > > > the other way around.
> > > > > 
> > > > > > > > The scsi_device has a ref to the queue
> > > > > > > 
> > > > > > > Yeah, while the queue is a grandchild of the scsi_device with the
> > > > > > > unified sysfs layout.
> > > > > > 
> > > > > > No, the scsi_device is a direct parent of the queue, so we have
> > > > > > 
> > > > > > scsi_device->queue
> > > > > 
> > > > > Wrong -- the gendisk is the direct parent of the queue.  The relevant 
> > > > > line is in ll_rw_blk.c:blk_register_queue():
> > > > > 
> > > > > 	q->kobj.parent = kobject_get(&disk->dev.kobj);
> > > > > 
> > > > > > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > > > > > > least at one of the devices involved.
> > > > > > 
> > > > > > But it's broken when the driver is unbound.  Diagrammatically it's:
> > > > > > 
> > > > > > scsi_disk -> scsi_device -> queue
> > > > > >           -> gendisk     ->
> > > > > > 
> > > > > > It's not circular, it's released when scsi_disk is released.  It can
> > > > > > become circular if there's some hidden dependency between any of the
> > > > > > components ... but I don't think there is.
> > > > > 
> > > > > Forget about the scsi_disk.  It isn't part of the problem.  Just 
> > > > > concentrate on the scsi_device, the gendisk, and the queue.  We have:
> > > > > 
> > > > > 	scsi_device <- gendisk <- queue <- scsi_device,
> > > > 
> > > > OK, so where does the gendisk get a reference to the scsi device?
> > > 
> > > In the unified sysfs layout where the silly and conceptual broken idea
> > > of "class devices" gets removed.
> > > Everything that has a "device" link today will just live below the
> > > device the "device" link points to. The whole current kernel is already
> > > converted to do this, besides the "raw kobject" gendisk's, and the SCSI
> > > subsystem. The gendisk patch is queued in Greg's tree (see subject of
> > > this mail), and the conversion from "struct class_device" to "struct
> > > device" for the whole SCSI directory is coming soon.
> > > 
> > > With the gendisk pointing to "driverfs_dev" ("device" link) it will
> > > become a child of the scsi_device.
> > 
> > OK, light beginning to go on now.
> > 
> > The problem is that you've fallen into the conceptual trap we tried very
> > hard to avoid in the initial go around of joining SCSI upper layer
> > drivers to gendisks.  That's why no gendisk references are held by the
> > mid-layer, only by the entities that represent the objects created by
> > upper layer drivers.
> 
> That will not change, only the disk will reference the device which it
> points to. It's not a problem, we can "orphan" the disk on delete, or we
> do the "orphaning" for all devices in the core, which is probably the
> right fix anyway.

But you're assuming the relationship gendisk->queue exists when we
delete the scsi device.  What should happen is that the gendisk should
be released first in the ULD detach.  Queues are allowed to exist
without gendisks (when a SCSI queue is first created, that's what it
should look like), so I think we're perhaps simply missing an API that
would detach the queue and the gendisk (and get the gendisk to delete
itself and give up its device link) ... doesn't del_gendisk() do this?

> > Doesn't this circularity now exist for everything?  Every device that
> > creates a queue has a reference to the queue, every queue has a
> > reference to its attached gendisk and now every gendisk has a reference
> > to the device creating the queue?  This doesn't look to be a SCSI
> > specific problem.
> 
> It's only SCSI so far, everything else seems fine.

But you didn't respond to the stated problem: there are very few other
non-scsi block devices ... have you tried looking at cciss?

> But, the real problem is that the core seems to deadlock if two devices
> reference each other (or build a larger circle), even when they are
> deleted, that's the problem we are running in.

Yes ... we sort of evaded that one by having class devices hang off the
sides of real devices ... making everything a peer with crosslinks
brings it back with a vengeance.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux