Re: serial port issues on IBM xseries with FC4 and High Availability heartbeat

On Fri, 2006-05-26 at 09:27 -0400, Randy Grimshaw wrote:
> 
> I am trying to run a linux high availability cluster (failover pair)
> using serial as one of the heartbeats.
> 
> Due to numerous serial over-runs the systems are actually crashing
> periodically.
> 
> This is a very frustrating development for a system intended to provide
> HA. (certainly not ha ha ha).
> 
> I have updated to the latest bios.
> I have checked RTS DTS XON XOFF etc.
> This is happening with the stock and custom kernels.
> This is happening on three pairs of servers.
> The serial ports are detected as:
>        Serial: 8250/16550 driver $Revision: 1.90 $ 32 ports, IRQ
> sharing enabled
>        serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> 
> 
> Any advice would be greatly appreciated.

The most common cause of overruns is running too high a baud rate.
Remember, a 16550A has only a 16-byte receive FIFO.  At 38,400 baud
(8N1, 10 bits per character) a new character arrives about every 260
microseconds, so an empty FIFO fills in roughly 4.2 milliseconds.  At
9600 baud a character arrives a tiny bit over every 1 millisecond, and
the FIFO takes about 16.7 milliseconds to fill.  Flow control exists to
prevent these overflows.
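
To put numbers on the timing, here is a rough Python sketch.  It assumes
8N1 framing (10 bits on the wire per character) and a 16-byte FIFO that
starts empty; the function names are made up for illustration.

```python
def byte_time_us(baud, bits_per_char=10):
    """Microseconds one character occupies on the wire (8N1 = 10 bits)."""
    return bits_per_char * 1_000_000 / baud


def fifo_fill_time_ms(baud, fifo_bytes=16, bits_per_char=10):
    """Milliseconds for back-to-back characters to fill an empty FIFO."""
    return fifo_bytes * byte_time_us(baud, bits_per_char) / 1000


if __name__ == "__main__":
    # Compare a few common rates; lower rates leave more time per interrupt.
    for baud in (9600, 19200, 38400, 115200):
        print(f"{baud:>6} baud: {byte_time_us(baud):7.1f} us/char, "
              f"{fifo_fill_time_ms(baud):6.2f} ms to fill 16 bytes")
```

The fill time is the worst-case window the kernel has to service the
UART before data is lost, so halving the baud rate doubles that window.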

If the machine is busy and there is no flow control, the interrupt from
the chip may not be serviced in time, and you'll lose data because the
buffer has filled.  Dropping the baud rate should help, and make sure
you use hardware (RTS/CTS) flow control.  Remember that software
(XON/XOFF) flow control requires the CPU to watch the buffer and send an
XOFF when it gets full.  You're already overrunning the buffer, so
software flow control won't help.
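
For reference, this is roughly what "RTS/CTS on, XON/XOFF off" looks
like at the termios level, sketched in Python.  The helper names, the
/dev/ttyS0 path and the 9600 default are assumptions for illustration.
The heartbeat software may well set its own line parameters when it
opens the port, so treat this as a picture of which flags are involved,
not as the fix itself.

```python
import os
import termios


def with_rtscts(attrs, speed=termios.B9600):
    """Return a copy of a termios attribute list with hardware (RTS/CTS)
    flow control enabled and software (XON/XOFF) flow control disabled."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    iflag &= ~(termios.IXON | termios.IXOFF)   # no software flow control
    cflag |= termios.CRTSCTS                   # hardware flow control
    return [iflag, oflag, cflag, lflag, speed, speed, list(cc)]


def configure_port(path="/dev/ttyS0", speed=termios.B9600):
    """Apply the settings to a real port (path is an assumption)."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        new_attrs = with_rtscts(termios.tcgetattr(fd), speed)
        termios.tcsetattr(fd, termios.TCSANOW, new_attrs)
    finally:
        os.close(fd)
```

Both ends of the link, and the cable between them, have to carry the
RTS and CTS lines for this to do anything.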

Heartbeat links between the nodes of a cluster are NOT the place to
scrimp and save money!  NICs are relatively cheap, after all.  They have
much bigger buffers, and they use DMA to transfer data to memory instead
of moving it one byte at a time over the I/O ports.  Frankly, NICs are
far more reliable, especially for something this critical.

----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens@xxxxxxxxxxxxxxx -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-         The world is coming to an end ... SAVE YOUR FILES!!!       -
----------------------------------------------------------------------
