On Tuesday, 22 of April 2008, Rafael J. Wysocki wrote: > On Tuesday, 22 of April 2008, Linus Torvalds wrote: > > > > On Tue, 22 Apr 2008, Zdenek Kabelac wrote: > > > > > > Unsure how it is related to my orginal Oops post - but now when I've > > > debug pagealloc enabled this appeared in my log after resume - should > > > I open new bug for this - or could this be part of the problem I've > > > experienced later? > > > > > > (Note - now I'm running commit: 8a81f2738f10ca817c975cec893aa58497e873b2 > > > > > > sd 0:0:0:0: [sda] Starting disk > > > mmc0: new SD card at address 5a61 > > > mmc mmc0:5a61: parent mmc0 is sleeping, will not add > > > ------------[ cut here ]------------ > > > WARNING: at drivers/base/power/main.c:78 device_pm_add+0x6c/0xf0() > > > > This is unrelated to the other issue, I think. > > > > Your warning comes from commit 58aca23226a19983571bd3b65167521fc64f5869, > > which admittedly looks like total crap. > > Well, I'm sorry that you think so. > > > Rafael, what's the point of that commit? > > More or less as stated in the changelog. If we register a child of a sleeping > device, the child ends up on dpm_active before the parent, so the ordering will > be wrong during the next suspend. > > That was discussed on linux-pm, mainly with Alan Stern. > > > I read the commit message, but I can't make myself agree with the commit > > code itself. If it's a "checking that the order is correct" thing, it > > should be a warning, but not change the actual _action_ of the code. > > That is easy to change. Please find appended a patch for that. > > > Because the commit refused to add the device, it is also then the direct > > reason for the oops you get later, as far as I can tell: > > > > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 > > > IP: [klist_del+29/128] klist_del+0x1d/0x80 > > > PGD 0 > > > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC > > > CPU 0 > > > Call Trace: > > > [bus_remove_device+158/208] bus_remove_device+0x9e/0xd0 > > > [device_add+1358/1376] device_add+0x54e/0x560 > > There is a bug in device_add() that IMO can be fixed this way: > > Index: linux-2.6/drivers/base/core.c > =================================================================== > --- linux-2.6.orig/drivers/base/core.c > +++ linux-2.6/drivers/base/core.c > @@ -820,11 +820,11 @@ int device_add(struct device *dev) > error = bus_add_device(dev); > if (error) > goto BusError; > + bus_attach_device(dev); > error = device_pm_add(dev); > if (error) > goto PMError; > kobject_uevent(&dev->kobj, KOBJ_ADD); > - bus_attach_device(dev); > if (parent) > klist_add_tail(&dev->knode_parent, &parent->klist_children); > > The problem is that bus_remove_device() assumes bus_attach_device() to have > run, AFAICS. Hm, actually it's better to do this instead IMHO: --- Prevent bus_remove_device() from crashing if dev->knode_bus has not been initialized before it's called. Signed-off-by: Rafael J. Wysocki <rjw@xxxxxxx> --- drivers/base/bus.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) Index: linux-2.6/drivers/base/bus.c =================================================================== --- linux-2.6.orig/drivers/base/bus.c +++ linux-2.6/drivers/base/bus.c @@ -530,7 +530,8 @@ void bus_remove_device(struct device *de sysfs_remove_link(&dev->bus->p->devices_kset->kobj, dev->bus_id); device_remove_attrs(dev->bus, dev); - klist_del(&dev->knode_bus); + if (klist_node_attached(&dev->knode_bus)) + klist_del(&dev->knode_bus); pr_debug("bus: '%s': remove device %s\n", dev->bus->name, dev->bus_id); _______________________________________________ linux-pm mailing list linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/linux-pm