On Tue, 24 Feb 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:

> On Mon, Feb 23, 2004 at 09:43:22PM -0500, Pavel Roskin wrote:
> > Hello!
> >
> > Linux 2.6.3-bk5 (and perhaps older versions too) accesses uninitialized
> > memory if register_netdev() fails in the dev->init call.  I could
> > reproduce the problem in the dummy driver.
>
> It's not register_netdev(); it's broken cleanup code in the driver.

Agreed.

> Fix in case of dummy.c is trivial -
> diff -urN RC3-bk1/drivers/net/dummy.c RC3-bk1-current/drivers/net/dummy.c
> --- RC3-bk1/drivers/net/dummy.c	Wed Feb 18 13:40:43 2004
> +++ RC3-bk1-current/drivers/net/dummy.c	Mon Feb 23 21:56:46 2004
> @@ -124,7 +124,7 @@
>  	dummies = kmalloc(numdummies * sizeof(void *), GFP_KERNEL);
>  	if (!dummies)
>  		return -ENOMEM;
> -	for (i = 0; i < numdummies && !err; i++)
> +	for (i = 0; !err && i < numdummies && !err; i++)
>  		err = dummy_init_one(i);
>  	if (err) {

Add "i--;" here.  The device that failed doesn't need another
free_netdev().

>  		while (--i >= 0)
>
> Now, which driver have you actually seen it in?

orinoco_plx.  I've put the current snapshot here:
http://www.red-bean.com/~proski/tmp/orinoco-oops.tar.gz

orinoco_init() in orinoco.c has been modified to fail always.  I could try
to reduce the driver to another dummy that can be run on any system, but
it will take time.

# modprobe orinoco_plx
orinoco.c 0.14alpha2HEAD (David Gibson <hermes@gibson.dropbear.id.au>, Pavel Roskin <proski@gnu.org>, et al)
orinoco_plx.c 0.14alpha2HEAD (Daniel Barlow <dan@telent.net>, David Gibson <hermes@gibson.dropbear.id.au>)
orinoco_plx: CIS: 01:03:00:00:FF:17:04:67:5A:08:FF:1D:05:01:67:5A:
orinoco_plx: Local Interrupt already enabled
Detected Orinoco/Prism2 PLX device at 0000:01:00.0 irq:12, io addr:0xc400
orinoco_plx: init_one(), FAIL!
orinoco_plx: probe of 0000:01:00.0 failed with error -16
Unable to handle kernel paging request at virtual address 6b6b6b77
 printing eip:
c02d0d88
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<c02d0d88>]    Not tainted
EFLAGS: 00010202
EIP is at rtnetlink_fill_ifinfo+0x278/0x430
eax: 6b6b6b6b   ebx: cf03e0c0   ecx: 00000640   edx: cd4e3920
esi: 00000000   edi: cd4e38a8   ebp: c12b9ed8   esp: c12b9eb4
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 3, threadinfo=c12b8000 task=c12bcc80)
Stack: c12b9ec8 00000640 00000f40 cd4e3000 cf05fde4 6b6b6b6b cf05fde4 00000000
       00000010 c12b9efc c02d11b6 00000000 00000000 00000000 cf03e0c0 cf03e0c0
       c12b9f20 c12b9f20 c12b9f34 c02d1a67 00000001 00000003 cf121340 5de4c35d
Call Trace:
 [<c02d11b6>] rtmsg_ifinfo+0x46/0xb0
 [<c02d1a67>] linkwatch_run_queue+0x147/0x1f0
 [<c02d1b52>] linkwatch_event+0x42/0x70
 [<c01363b4>] worker_thread+0x1f4/0x3e0
 [<c0119f67>] recalc_task_prio+0x97/0x1c0
 [<c02d1b10>] linkwatch_event+0x0/0x70
 [<c011b6e0>] default_wake_function+0x0/0x10
 [<c011b6e0>] default_wake_function+0x0/0x10
 [<c013b105>] kthread+0x95/0xa0
 [<c01361c0>] worker_thread+0x0/0x3e0
 [<c013b070>] kthread+0x0/0xa0
 [<c0107019>] kernel_thread_helper+0x5/0xc

Code: 8b 50 0c b9 ff ff ff ff 31 c0 83 c2 08 89 d7 f2 ae f7 d1 49

Known facts:

orinoco_plx_init_one() exits before the oops.

rtnetlink_fill_ifinfo() crashes here:

	if (dev->qdisc_sleeping)
		RTA_PUT(skb, IFLA_QDISC,
			strlen(dev->qdisc_sleeping->ops->id) + 1,
			dev->qdisc_sleeping->ops->id);

dev->qdisc_sleeping is 0x6b6b6b6b, which indicates freed memory (I have
slab debugging enabled).

Reordering statements in orinoco_plx_init_one() after "fail:" may prevent
the oops, but only the first time.  If the module is unloaded and loaded
again, it crashes.

Commenting out free_orinocodev() (wrapper around free_netdev()) fixes the
oops, but I think it could leave some allocated memory.  That's the likely
workaround if we fail to fix the problem.
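
To illustrate why the dummy.c cleanup needs the extra "i--;", here is a
small userspace sketch of the loop quoted earlier in this message.  It is
not kernel code: fake_init_one(), fake_free_one(), run() and the allocated[]
table are hypothetical stand-ins, and the sketch assumes (as my remark about
free_netdev() above implies) that the init function releases its own device
when it fails.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEVS 8

/* Hypothetical stand-in for the driver's device table: true = allocated. */
static bool allocated[MAX_DEVS];

/* Number of times cleanup tried to free a slot that was not allocated. */
static int double_frees;

/*
 * Stand-in for dummy_init_one().  Fails at index fail_at and, by
 * assumption, frees its own device on the error path before returning.
 */
static int fake_init_one(int i, int fail_at)
{
	allocated[i] = true;
	if (i == fail_at) {
		allocated[i] = false;	/* init cleaned up after itself */
		return -16;		/* same error value as in the log above */
	}
	return 0;
}

/* Stand-in for free_netdev(): records a double free instead of crashing. */
static void fake_free_one(int i)
{
	if (!allocated[i])
		double_frees++;
	allocated[i] = false;
}

/*
 * Run the init loop and the error cleanup for n devices.
 * skip_failed selects whether the extra "i--" is applied.
 * Returns the number of double frees observed.
 */
static int run(int n, int fail_at, bool skip_failed)
{
	int i, err = 0;

	assert(n <= MAX_DEVS);
	double_frees = 0;
	for (i = 0; i < n; i++)
		allocated[i] = false;

	/* The for-loop increment runs once more after the failing call. */
	for (i = 0; i < n && !err; i++)
		err = fake_init_one(i, fail_at);

	if (err) {
		if (skip_failed)
			i--;	/* step back over the index that failed */
		while (--i >= 0)
			fake_free_one(i);
	}
	return double_frees;
}
```

With four devices and a failure at index 2, run(4, 2, false) reports one
double free, while run(4, 2, true) reports none.  The reason is that "i" is
already one past the failed index when the loop exits, so the first "--i" in
the cleanup lands on the failed device unless it is skipped.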
-- 
Regards,
Pavel Roskin