Re: 2.6.19 tg3 Broadcom 5704 problems/questions

Hi Brecht,

On Wed, 28 Mar 2007, Brecht Vermeulen wrote:

we are running multiple systems with the same motherboard and NICs and get
the same problems under heavy load, e.g. with rsync and network block
device. I have already debugged the hell out of it with all the options of
the NICs (offloading on/off, ASF, jumbo frames and normal frames, ...), 32
bit/64 bit, but could always get network lockups, sometimes only after 4
hours of heavy load. I also got MCE errors sometimes, but machines
without those errors got the lockups too.

Sorry to hear you're in the same boat.

We're still having the problem, but haven't had a chance recently to take another look. I'm hoping to before the end of this week.

Were you able to notice any difference between having ASF enabled vs. ASF disabled? We noticed that the driver could reset the adapter with ASF disabled (I don't know how consistently this could happen), but seemed NOT to be able to reset it with ASF enabled.

We're also having trouble with MCEs on other systems (bad memory), after which our compute nodes (also H8SSL-i's) start spraying invalid crap onto the network (after a crash, attach another system with a crossover cable and watch from that machine: the byte counters increase, the packet counters do not).
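For anyone who wants to repeat that check from the machine on the other end of the crossover cable, here is a rough sketch of it. It assumes the observing box exposes per-interface counters under /sys/class/net/<iface>/statistics; the interface name, the 5-second interval, and all function names are made up for illustration and are not from the setup described above.

```python
import time
from pathlib import Path

# Receive-side counters the kernel exposes per interface (assumed sysfs layout).
COUNTERS = ("rx_bytes", "rx_packets")


def read_counters(iface, root="/sys/class/net"):
    """Return the current rx_bytes / rx_packets values for one interface."""
    stats_dir = Path(root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text().strip()) for name in COUNTERS}


def bytes_without_packets(before, after):
    """True if bytes went up between two samples while packets did not."""
    return (after["rx_bytes"] > before["rx_bytes"]
            and after["rx_packets"] == before["rx_packets"])


def watch(iface, interval=5.0, root="/sys/class/net"):
    """Sample the counters twice, `interval` seconds apart, and compare."""
    before = read_counters(iface, root)
    time.sleep(interval)
    after = read_counters(iface, root)
    return bytes_without_packets(before, after)
```

Calling watch("eth0") on the observing machine returns True when the byte counter moved but the packet counter stayed put, which is the symptom described above. It says nothing about why the crashed node is transmitting.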

So, I guess there is something wrong with that motherboard (not sure if
it's only the NICs, only the motherboard, or the combination of both).

I'll bring you into a conversation I'm having with someone from SuperMicro in another email thread.

For one of our production servers, we've put a 32 bit intel nic in a PCI
slot and it is stable now (although 1Gb/s is out of sight :-( ).

We're trying to avoid having to do this.

I'll send the other email shortly.

Thanks!
Paul



Paul Armor wrote:
Hi,

On Tue, 13 Mar 2007, Neil Horman wrote:
I'll summarize what our problems and configs are.

Problems      - lockups on ethernet controllers under heavy NFS loads
                (sometimes driver can/will reset, sometimes not)
                systems completely lock up
Hardware      - Supermicro H8SSL-i with onboard Broadcom 5704's (both clients
                and servers)
Server config - 2.6.19 kernel (thus tg3 ver 3.69)
                nfs-utils-1.0.7-13 FC4
                NIC running at 4500 MTU
What on earth is that?  I assume you are configured for jumbo frames
through your whole network, but why not bump your mtu all the way up
to 9000 then?

Yes, we're configured to allow up to 9000 MTU, but we're using 4500 as
that was the best compromise between switch topology (don't ask), CPU
overhead with the tg3 driver (in 2.6.11, at least), and throughput
(using a variety of canned benchmarky things).

Does the problem persist if you only use a 1500 byte MTU?

Don't know, we're theoretically in production mode (when the machines
are all up at the same time).

Failure caused by users building software in automounted FS's.
Can you get a sysrq-t when the system locks up?

Will try the next time it craps out, and I can still get console access.
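If the keyboard combination isn't reachable but a root shell still is, the same task dump can be requested by writing "t" to /proc/sysrq-trigger. A minimal sketch follows; the function name and the overridable path argument are only there for illustration, and it assumes the box is still alive enough to run a process at all.

```python
from pathlib import Path


def request_task_dump(trigger="/proc/sysrq-trigger"):
    """Ask the kernel to log the state of all tasks (the sysrq-t action).

    Needs root. The dump goes to the kernel log, not to this process.
    """
    Path(trigger).write_text("t")
```

The output lands in the kernel log, so it has to be collected from the console or dmesg afterwards.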

Thanks,
Paul

-
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ UWM-LSC Group Systems Administrator        parmor@xxxxxxxxxxxxxxxxxxxx +
+ Physics 462                                                            +
+ U. of W. - Milwaukee                                                   +
+ PO Box 413                                                414-229-2677 +
+ Milwaukee, WI 53201                                   fax 414-229-5589 +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
