> From: Robert Heller <heller@xxxxxxxxxxxx>
> Organization: Deepwoods Software
> Reply-To: CentOS mailing list <centos@xxxxxxxxxx>
> Date: Tue, 19 May 2009 09:46:15 -0400
> To: CentOS mailing list <centos@xxxxxxxxxx>
> Cc: <centos@xxxxxxxxxx>
> Subject: Re: Weird CentOS 5.3 problem
>
> At Tue, 19 May 2009 09:04:43 -0400 CentOS mailing list <centos@xxxxxxxxxx>
> wrote:
>
>> I reimaged a compute node on our cluster with the latest 5.3 updates (we
>> were previously running 5.2), but we kept the kernel at 2.6.18-92.1.10.el5
>> until I can find time to rebuild some of our kernel modules. After the
>> image install finishes and the system reboots, the eth0 ethernet interface
>> disappears. If I do an ifconfig -a, I see what should be eth0, but it's
>> listed as __tmp2081258173.
>>
>> [root@node0770 ~]# ifconfig -a
>> __tmp2081258173 Link encap:Ethernet  HWaddr 00:1E:68:86:67:04
>>           BROADCAST MULTICAST  MTU:1500  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>>           Interrupt:66
>>
>> The dmesg output isn't very helpful:
>>
>> [root@node0770 ~]# dmesg|grep eth0
>> eth0: forcedeth.c: subsystem: 0108e:534b bound to 0000:00:08.0
>>
>> If I remove our lustre modules that were built for the 2.6.18-92.1.10.el5
>> kernel and reboot, the eth0 interface reappears. Another piece to this
>> puzzle is that this problem only seems to happen on our Sun X2200's. Our
>> Dell 1950's work just fine after putting on the 5.3 updates. Anyone know
>> what could cause this behavior?
>
> Check /etc/modprobe.conf (and
> /etc/sysconfig/network-scripts/ifcfg-eth0) -- if you are doing a
> disk-to-disk backup type of install, the alias for eth0 is very likely
> wrong (and the HW address in /etc/sysconfig/network-scripts/ifcfg-eth0
> is also wrong).
> You may have to manually update these two files on the
> 'new' machine, since it likely has a different NIC, requiring a
> different driver. It will also have a different MAC (HW) address as
> well. In the old days, kudzu would detect this and pop up during the
> boot process.
>
> What does lspci display?
>

Here is our modprobe.conf. The last two lines are the ones we add for lustre:

alias eth0 tg3
alias eth1 tg3
alias eth2 forcedeth
alias eth3 forcedeth
alias scsi_hostadapter sata_nv
options lnet networks="tcp0(eth0)"
options ksocklnd enable_irq_affinity=0

The /etc/sysconfig/network-scripts/ifcfg-eth0 file has the correct settings
for this host. We actually generate this file during the post-install.
Here's what it looks like:

DEVICE=eth0
BOOTPROTO=none
STARTMODE=onboot
ONBOOT=yes
USERCTL=no
TYPE=Ethernet
IPV6INIT=no
IPADDR=192.168.3.91
BROADCAST=192.168.255.255
NETMASK=255.255.0.0
GATEWAY=192.168.100.1

Here's the lspci output:

00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a3)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:06.0 PCI bridge: nVidia Corporation MCP55 PCI bridge (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:09.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:0a.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0f.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Link Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Miscellaneous Control
00:19.4 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Link Control
01:05.0 VGA compatible controller: ASPEED Technology, Inc. AST2000
02:00.0 Ethernet controller: MYRICOM Inc. Myri-10G Dual-Protocol NIC
05:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev b5)
06:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)
06:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)

We tried upgrading to the latest tg3 ethernet driver, but the symptoms did
not change.

-Randy
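
One quick cross-check on the files above is to compare the driver that each
alias line in modprobe.conf names for an interface against the driver that
dmesg reports claiming that same interface name. The Python sketch below is
only an illustration under stated assumptions: the function names
(parse_aliases, parse_dmesg_drivers, find_mismatches) are hypothetical, and
the dmesg pattern is inferred from the single forcedeth line quoted earlier,
so other drivers may log in a different format. It does not establish why the
interface ends up as __tmp2081258173; it only flags where the two sources
disagree.

```python
import re

# Sample inputs copied from the message above.
MODPROBE_CONF = """\
alias eth0 tg3
alias eth1 tg3
alias eth2 forcedeth
alias eth3 forcedeth
alias scsi_hostadapter sata_nv
options lnet networks="tcp0(eth0)"
options ksocklnd enable_irq_affinity=0
"""

DMESG = "eth0: forcedeth.c: subsystem: 0108e:534b bound to 0000:00:08.0\n"

# Assumed log shape: "<iface>: <driver>.c: ..." (taken from the one line above).
DMESG_RE = re.compile(r"^(eth\d+): (\w+)\.c:")


def parse_aliases(modprobe_conf):
    """Return {interface: driver} for every 'alias ethN <driver>' line."""
    aliases = {}
    for line in modprobe_conf.splitlines():
        parts = line.split()
        if (len(parts) == 3 and parts[0] == "alias"
                and re.fullmatch(r"eth\d+", parts[1])):
            aliases[parts[1]] = parts[2]
    return aliases


def parse_dmesg_drivers(dmesg):
    """Return {interface: driver} for dmesg lines matching DMESG_RE."""
    drivers = {}
    for line in dmesg.splitlines():
        match = DMESG_RE.match(line.strip())
        if match:
            drivers[match.group(1)] = match.group(2)
    return drivers


def find_mismatches(aliases, drivers):
    """Return {interface: (aliased_driver, driver_seen_in_dmesg)} where they differ."""
    return {
        iface: (aliases[iface], drivers[iface])
        for iface in aliases
        if iface in drivers and aliases[iface] != drivers[iface]
    }


if __name__ == "__main__":
    mismatches = find_mismatches(parse_aliases(MODPROBE_CONF),
                                 parse_dmesg_drivers(DMESG))
    for iface, (wanted, seen) in sorted(mismatches.items()):
        print(f"{iface}: modprobe.conf says {wanted}, dmesg shows {seen}")
```

With the data quoted in this thread, the sketch reports eth0, since
modprobe.conf aliases eth0 to tg3 while the dmesg line shows forcedeth
registering an eth0 at 0000:00:08.0. Whether that disagreement is related to
the rename problem is not something the script can tell.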