On Sat, Nov 15, 2008 at 6:38 PM, Dan McGee <dpmcgee@xxxxxxxxx> wrote:
> On Sat, Nov 15, 2008 at 8:11 PM, Dan McGee <dpmcgee@xxxxxxxxx> wrote:
>> On Fri, Nov 14, 2008 at 11:02 AM, Bob Copeland <me@xxxxxxxxxxxxxxx> wrote:
>>> On Fri, Nov 14, 2008 at 1:17 AM, Luis R. Rodriguez <mcgrof@xxxxxxxxx> wrote:
>>>> If our offsets are the same then its probably on line 791:
>>> [...]
>>>> 790         name = wiphy_dev(local->hw.wiphy)->driver->name;
>>>> 791         local->hw.workqueue = create_freezeable_workqueue(name);
>>>
>>> I agree, having looked at the objdump output.  Hmm, maybe ->driver pointer
>>> is bad even though I can't see that happening.  Dan, can you try adding a
>>> printk before line 790 to see if any of the pointers are null?
>>
>> So I went back and added a few things to the original unpatched code
>> to see what was NULL pointering, just to be sure we were thinking
>> right. Here is the relevant code:
>>        printk(KERN_DEBUG "wiphy_dev() : %p\n", wiphy_dev(local->hw.wiphy));
>>        printk(KERN_DEBUG "driver      : %p\n",
>>                        wiphy_dev(local->hw.wiphy)->driver);
>>        printk(KERN_DEBUG "driver->name: %p\n",
>>                        wiphy_dev(local->hw.wiphy)->driver->name);
>>        name = wiphy_dev(local->hw.wiphy)->driver->name;
>>        local->hw.workqueue = create_freezeable_workqueue(name);
>>
>> And the dmesg output:
>> ath5k_pci xxx: registered as ''
>> wiphy_dev() : b730b408
>> driver      : 00000001
>> BUG: unable to handle kernel NULL pointer dereference at 00000001
>>
>> So we bugged out on trying to print driver->name, which is the same
>> problem we would have hit in the 'name =' line.
>
> I should clarify here- the real bug was when trying to access
> '->driver', as we got the 00000001 poison pointer returned (this is a
> poison value, right?).

Not sure why it's 00000001, nor do I know if it's a poison value.

One thing I am fairly positive about is why this was wrong all along: we
were dereferencing the device's ->driver pointer to get driver->name, but
the device won't get its ->driver pointer assigned until *after* a
successful probe. Let's review the PCI probe:

/**
 * __pci_device_probe()
 * @drv: driver to call to check if it wants the PCI device
 * @pci_dev: PCI device being probed
 *
 * returns 0 on success, else error.
 * side-effect: pci_dev->driver is set to drv when drv claims pci_dev.
 */
static int
__pci_device_probe(struct pci_driver *drv, struct pci_dev *pci_dev)
{
	const struct pci_device_id *id;
	int error = 0;

	if (!pci_dev->driver && drv->probe) {
		error = -ENODEV;

		id = pci_match_device(drv, pci_dev);
		if (id)
			error = pci_call_probe(drv, pci_dev, id);
		if (error >= 0) {
			pci_dev->driver = drv;
			error = 0;
		}
	}
	return error;
}

So unless probe was successful (pci_call_probe(), which calls
drv->probe()), we don't update the pci_dev->driver pointer.

> The above sequence of events was what took place when trying to load
> the module on startup. To see if other things had an effect, I
> disabled module autoloading during the boot sequence and got slightly
> different results but it looks to be the same type of problem:
>
> registered as ''
> wiphy_dev: b730d740
> driver: 7fffffff
> driver->name: ffffffff
> BUG: unable to handle kernel paging request at ffffffff
>
> One more note- booting with the 2.6.27.6 shipped wireless modules
> (mac80211 and ath5k) has always been working fine. It is only when I
> try to run compat-wireless on top of this kernel that we are seeing
> issues.

This is interesting, but then again so is the fact that it was working
*all along* for other devices, as it shouldn't have.

> Theoretically that means this should be bisectable if we
> really can't figure it out, but I'm not sure how practical that is.

Yeah, don't bother; the issue in this e-mail was fixed. Another issue has
come up, though, so that is separate.
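
For anyone following along, here is a rough userspace sketch of the
ordering described above. It is not kernel code: every fake_* name is made
up for illustration, the device match step is left out, and it assumes the
unbound ->driver pointer starts out as NULL (in the traces quoted above it
was 00000001 and 7fffffff, so a NULL check alone would not have saved us).
It only shows that anything running inside probe() sees the device before
->driver is assigned.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev;

/* Hypothetical stand-in for struct pci_driver. */
struct fake_driver {
	const char *name;
	int (*probe)(struct fake_dev *dev);
};

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	const struct fake_driver *driver;	/* assumed NULL until probe succeeds */
	const char *name_seen_in_probe;		/* what probe() managed to look up */
};

static const char fake_unbound[] = "(unbound)";

/* Look up the driver name, falling back when the device is not bound yet. */
static const char *fake_driver_name(const struct fake_dev *dev)
{
	if (dev->driver && dev->driver->name)
		return dev->driver->name;
	return fake_unbound;
}

/* A probe routine that tries to use dev->driver->name, as the old code did. */
static int fake_probe(struct fake_dev *dev)
{
	dev->name_seen_in_probe = fake_driver_name(dev);
	return 0;
}

/* Same ordering as __pci_device_probe(): ->driver is set only after probe. */
static int fake_device_probe(const struct fake_driver *drv, struct fake_dev *dev)
{
	int error = 0;

	assert(drv && dev);
	if (!dev->driver && drv->probe) {
		error = drv->probe(dev);
		if (error >= 0) {
			dev->driver = drv;
			error = 0;
		}
	}
	return error;
}
```

Calling fake_device_probe() on a fresh device leaves name_seen_in_probe
pointing at the fallback string, while the same lookup done after the call
returns the driver's real name.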

  Luis