Re: kobj refcounting weirdness

On Mon, Mar 09, 2009 at 12:36:54AM -0600, Alex Chiang wrote:
> Hi Kay, Greg,
> 
> I've been working on this patch series recently that adds
> function and device level hotplug into the PCI core:
> 
> 	http://thread.gmane.org/gmane.linux.kernel.pci/3495
> 
> For the last two weeks, I've been beating my head against a
> refcounting/kobject problem, and was hoping you could give me
> some advice, since I seem to have run into a wall.
> 
> My test case has been removing device 0000:04:00.0, which should
> remove all the devices below it.

You are removing the children before the parent device, right?  If not,
you have to be _very_ careful (personally, I don't think you should be
allowed to do that, but others, like the scsi developers, like doing
things like this...)
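
A minimal userspace sketch of the children-before-parent ordering, assuming
each child pins its parent with one reference (roughly the way a kobject holds
a reference on its parent). Every name here (struct demo_dev,
demo_dev_add_child, demo_dev_remove) is hypothetical and is not PCI core or
kobject API:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4

/* Hypothetical stand-in for a device in a tree; not a kernel type. */
struct demo_dev {
	const char *name;
	struct demo_dev *parent;
	struct demo_dev *children[DEMO_MAX_CHILDREN];
	int nr_children;
	int refcount;		/* 1 initial ref + 1 per attached child */
	int removed_order;	/* 1-based removal position, 0 = still present */
};

static int demo_removal_counter;

static void demo_dev_init(struct demo_dev *dev, const char *name)
{
	*dev = (struct demo_dev){ .name = name, .refcount = 1 };
}

/* Attach child to parent; the child takes a reference on the parent. */
static int demo_dev_add_child(struct demo_dev *parent, struct demo_dev *child)
{
	if (parent->nr_children >= DEMO_MAX_CHILDREN)
		return -1;
	parent->children[parent->nr_children++] = child;
	child->parent = parent;
	parent->refcount++;
	return 0;
}

/* Remove dev and everything below it, deepest devices first. */
static void demo_dev_remove(struct demo_dev *dev)
{
	struct demo_dev *parent = dev->parent;

	/* Each recursive call unlinks that child, so this loop terminates. */
	while (dev->nr_children > 0)
		demo_dev_remove(dev->children[dev->nr_children - 1]);

	/* All children are gone, so only the initial reference is left. */
	assert(dev->refcount == 1);
	dev->refcount = 0;
	dev->removed_order = ++demo_removal_counter;

	if (parent) {
		/* Unlink from the parent's list and drop the ref we held. */
		for (int i = 0; i < parent->nr_children; i++) {
			if (parent->children[i] == dev) {
				parent->children[i] =
					parent->children[--parent->nr_children];
				break;
			}
		}
		parent->refcount--;
		dev->parent = NULL;
	}
}
```

In this sketch, tearing down a parent while a child is still attached would
trip the assert, because the child's reference is still outstanding. That is
the situation where you have to be careful.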

>  +-[0000:03]---00.0-[0000:04-07]----00.0-[0000:05-07]--+-02.0-[0000:06]--+-00.0  Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
>  |                                                     |                 \-00.1  Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
>  |                                                     \-04.0-[0000:07]--+-00.0  Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
>  |                                                                       \-00.1  Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
> 
> I can remove the device and rescan the bus once, and it works
> fine. The second removal works fine, and then, unpredictably,
> later rescan/remove cycles eventually end up producing a warning
> and oops every time. Sometimes I die on the 2nd rescan, sometimes
> not until the 4th or 5th remove/rescan cycle.

What is the warning and oops?

> In this data set, I turned on kobject debugging, and managed to
> capture a trace where we die on the 2nd rescan.
> 
> In this data set, we:
> 
> 	- create a kobject for 0000:04:00.0 (e00000018cac2920)
> 	- remove the device
> 	- observe '0000:04:00.0' (e00000018cac2920): calling ktype release
> 	- rescan the bus
> 	- discover that e00000018cac2920 is still hanging around!

What do you mean by "rescan"?  And sure, if you create a new device, it
could be allocated at the same location, that's what the slab allocators
do, right?
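
To illustrate the address-reuse point in userspace, here is a small sketch
that uses malloc()/free() as a stand-in for the slab allocator. The
demo_obj type and its per-allocation id are assumptions made up for this
example. Nothing like that id exists in struct kobject; it is only there to
show that "same pointer" does not have to mean "same object":

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kobject-like object; not a kernel type. */
struct demo_obj {
	unsigned long id;	/* unique per allocation, never reused */
};

static unsigned long demo_next_id = 1;

static struct demo_obj *demo_obj_create(void)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (obj)
		obj->id = demo_next_id++;
	return obj;
}

/*
 * Create an object, release it, then create another one.
 * Stores the id of each object and returns 1 if the allocator handed
 * back the same address twice, 0 if it did not, -1 if allocation failed.
 */
static int demo_address_reused(unsigned long *first_id,
			       unsigned long *second_id)
{
	struct demo_obj *obj;
	uintptr_t first_addr, second_addr;

	assert(first_id && second_id);

	obj = demo_obj_create();
	if (!obj)
		return -1;
	first_addr = (uintptr_t)obj;	/* save the address before freeing */
	*first_id = obj->id;
	free(obj);

	obj = demo_obj_create();
	if (!obj)
		return -1;
	second_addr = (uintptr_t)obj;
	*second_id = obj->id;
	free(obj);

	return first_addr == second_addr;
}
```

Whether the two addresses match depends on the allocator, but the ids always
differ. So seeing the same pointer value in a debug log after a release does
not by itself prove that the old object is still around.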

Can you provide the full debug log that shows the problem?

thanks,

greg k-h
