On Wed, 10 Dec 2003, Kumba wrote:
> Geert Uytterhoeven wrote:
> > This `tulip halting', is it `transmit timed out', followed by the chip being
> > thrown in 10-base2 mode and not recovering until ifconfig down/up?
> >
> > I see that one on my PPC box, and I do have a fix. It's not perfect, but the
> > box now recovers within 3 minutes, instead of needing manual intervention.
>
> To be honest, I'm not sure what's actually occurring. At first I thought
> it was simply halting, but it does not appear to halt completely. Data
> will still trickle in *very* slowly. If ping wouldn't time out after a
> few seconds, I would bet the box would respond after about 3 minutes.
> Restarting the config does reset it back.

That's different from what I'm seeing. My box doesn't respond at all.

> Now that you mention mode switching, however, it may fit in with some data
> I gleaned using mii-diag that I spoke of in #mipslinux a while back.
> When the tulip driver was working fine, mii-diag reported this:
>
> MII PHY #1 transceiver registers:
>   1000 782d 7810 0003 01e1 45e1 0001 0000
>   0000 0000 0000 0000 0000 0000 0000 0000
>   0000 0000 4000 0000 3ffb 0010 0000 0002
>   0001 0000 0000 0000 0000 0000 0000 0000
>
>
> Notice the setting of the 21st register (3rd row, 5th value). When the
> tulip driver started acting up, that value changed to this:
>
> MII PHY #1 transceiver registers:
>   1000 782d 7810 0003 01e1 45e1 0001 0000
>   0000 0000 0000 0000 0000 0000 0000 0000
>   0000 0000 4000 0000 38c8 0010 0000 0002
>   0001 0000 0000 0000 0000 0000 0000 0000
>
> I didn't do very detailed searching for the meaning of the registers,
> and never found out what the 21st register's specific purpose was, but
> is this the mode switching you're mentioning perhaps?

I don't know what these registers mean, but tulip_select_media() doesn't seem
to affect the 21st register directly. Perhaps as a hardware side effect?
> If so, I'll give your patch a run, see if it works and if the recovery
> time can be shortened, or help to isolate the problem so it can be nailed.

Here it is. I've been using it on 2.4.21 for more than 6 months. I upgraded to
2.4.23 9 days ago, and so far I haven't seen any of the printk()s, though.

Without the patch, the driver switches from 10-baseT to 10-base2
unconditionally if the problem happens. With the patch, the switch is performed
only if there's no 10-baseT link beat, and the driver recovers after a few
minutes. This may still cause an annoying hiccup, but the network (incl. open
TCP connections) recovers.

I have a 21041, using 10-baseT on a 10 Mbit hub. What Tulip does the Cube have?

--- linux-2.4.23/drivers/net/tulip/tulip_core.c.orig	Fri Nov 28 21:04:35 2003
+++ linux-2.4.23/drivers/net/tulip/tulip_core.c	Sun Nov 30 11:37:45 2003
@@ -580,7 +580,15 @@
 				} else
 					dev->if_port = 0;
 			else
+			{
+printk("tulip: old driver would switch to 10base2, ");
+			    if (dev->if_port != 0 || (csr12 & 0x0004) != 0) {
+printk("and we do\n");
 				dev->if_port = 1;
+			    } else {
+printk("but we don't\n");
+			    }
+			}
 			tulip_select_media(dev, 0);
 		}
 	} else if (tp->chip_id == DC21140 || tp->chip_id == DC21142

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds