How to really avoid serial buffer overruns for legacy ports?

Hi everyone,

I'm facing massive problems with serial port buffer overruns on the
following hardware/software combination (and on at least one other
system with slightly different hardware):

Kernel-2.6.23
(Setserial-2.17)

Serial ports:
4x RS-232C NS16C550 (16550A) serial ports with a 16-byte FIFO

Chipsets:
Intel 855GME
Intel 6300ESB I/O Hub
Winbond W83627THF LPC Bus I/O Controller

The Intel 6300ESB I/O Hub handles the first two ports; the other two
are handled by the Winbond W83627THF.

IRQ assignment (via BIOS and setserial) is 4, 3, 10 and 11 (in order,
not shared).

I see overruns on every port (explained later), but one thing struck
me when I took a closer look at the documentation of this Intel I/O
hub: its serial ports are not 100% 16550-compatible. For example, the
IER has two additional bits, 4 and 5, which are always zero in the
register definition of a standard 16550. Intel describes the behavior
as follows, quote: "Receiver time out interrupt may be configured to
be separated from the receive data available interrupt (using the
bit5: COMP) to avoid interrupt controller and DMA controller serving
the receive FIFO at the same time.". Can anyone tell me whether this
behavior causes general problems with the standard Linux driver? As
far as I have seen it shouldn't, because the driver just checks bit 0
of the IIR to see whether any interrupt is pending at all, no matter
which one (data available or timeout). By the way, this is just
additional information; in general all ports are somehow 'working'
without any problems visible from the outside.
Here is the Intel 6300ESB I/O Hub datasheet, if someone is interested
(pages 671/672):
http://www.intel.com/design/intarch/datashts/30064103.pdf
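
For reference, the relevant check in drivers/serial/8250.c looks
roughly like this (a simplified sketch of the interrupt loop, from
memory, not a verbatim copy of the source):

- drivers/serial/8250.c, serial8250_interrupt() (sketch)
        /* Bit 0 of the IIR means "no interrupt pending"; the driver only
         * cares whether *some* interrupt is pending, not whether it is
         * "received data available" or "character timeout". */
        iir = serial_in(up, UART_IIR);
        if (!(iir & UART_IIR_NO_INT))
                serial8250_handle_port(up);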

The real problems:

I need stable serial port communication with external components
working at 115200 baud, so simply decreasing the baud rate is not an
option here. The whole system also already suffers from quite a lot
of interrupts being raised every second by other devices.

I created a test scenario for my system using a null-modem cable, a
second machine acting as the sender, and a simple asynchronous
application which creates, sends/receives and checks predefined 1 KB
packets (a primitive check via an XOR checksum at the end). With the
original configuration the application cannot even receive 100
packets before hitting the first checksum error (always at 115200
baud, of course). I could verify that it really is an overrun by
adding a simple printk statement that fires whenever the overrun flag
is set, for every serial8250 interrupt that comes in.
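
To give an idea, that debug check looks roughly like this (a sketch
placed in receive_chars() in drivers/serial/8250.c; the exact
variable names may differ in your tree):

- drivers/serial/8250.c, receive_chars() (sketch)
        /* UART_LSR_OE is the overrun error bit in the line status register;
         * if it is set, at least one received byte has already been lost. */
        if (lsr & UART_LSR_OE)
                printk(KERN_DEBUG "ttyS%d: FIFO overrun (LSR=0x%02x)\n",
                       up->port.line, lsr);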

Then I began to read up a bit on legacy serial ports and their
built-in problems. I read previous discussions about overruns (e.g.
http://lkml.org/lkml/2006/8/16/73) and tried almost everything that
can be influenced via parameters.

- Changing the FIFO trigger level (UART_FCR_R_TRIG_10) => no solution
I also think it's a bad idea, because as mentioned above the system
already has enough interrupts to handle. Even if it brought a slight
statistical improvement, it would be far from a solution. (The sketch
below shows where the trigger level comes from.)
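
For reference, the default trigger level comes from the per-UART
config table in drivers/serial/8250.c; changing it looks roughly like
this (a sketch, field names as in the 2.6.2x serial8250_config table,
please double-check against your tree):

- drivers/serial/8250.c, uart_config[] (sketch)
        [PORT_16550A] = {
                .name           = "16550A",
                .fifo_size      = 16,
                .tx_loadsz      = 16,
                /* UART_FCR_R_TRIG_10 triggers at 8 bytes; a lower level such
                 * as UART_FCR_R_TRIG_01 (4 bytes) leaves more headroom in the
                 * FIFO but also fires more interrupts per second. */
                .fcr            = UART_FCR_ENABLE_FIFO | UART_FCR_R_TRIG_01,
                .flags          = UART_CAP_FIFO,
        },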

- Changing the timer frequency + kernel preemption model => no solution

Current configuration is:
UART_FCR_R_TRIG_10
CONFIG_PREEMPT_VOLUNTARY
CONFIG_HZ_300

- Using hardware (CRTSCTS) or software (XON/XOFF) handshaking => no solution
Sadly this makes no difference, because (at least for these chipsets)
everything is controlled by the software driver; the handshaking is
not handled automatically in hardware, which is what would really
prevent buffer overruns, right?
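
For completeness, enabling the handshaking from userspace is just the
usual termios setup, roughly like this (a sketch, not my exact test
code):

- test application (userspace sketch)
        #include <termios.h>

        struct termios tio;

        tcgetattr(fd, &tio);
        cfsetispeed(&tio, B115200);
        cfsetospeed(&tio, B115200);
        tio.c_cflag |= CRTSCTS;              /* RTS/CTS hardware handshaking */
        /* or: tio.c_iflag |= IXON | IXOFF;     XON/XOFF software handshaking */
        tcsetattr(fd, TCSANOW, &tio);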

- Creating an atomic interrupt handler in drivers/serial/8250.c
Before calling request_irq() in serial_link_irq_chain() I do:
irq_flags |= IRQF_DISABLED;
This helped quite a lot (at least subjectively :)), but it is not a
final solution either.
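
The change itself is tiny; in serial_link_irq_chain() it looks
roughly like this (sketch, surrounding code from memory):

- drivers/serial/8250.c, serial_link_irq_chain() (sketch)
        /* Run the 8250 handler with local interrupts disabled so it cannot
         * be interrupted by other devices' interrupt handlers. */
        irq_flags |= IRQF_DISABLED;
        ret = request_irq(up->port.irq, serial8250_interrupt,
                          irq_flags, "serial", i);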

In my opinion it now all comes down to decreasing the serial IRQ
latency, which matters so much because of this ridiculously small
16-byte FIFO. I tried to find a way to improve the general IRQ
handling of all serial ports, either by implementing priority
handling for IRQs or by changing the related APIC vectors (I'm using
the I/O APIC as PIC), both without much success so far. Is anyone an
expert on these issues?

Here I found some (outdated) patches that add simple IRQ priorities:
http://users.informatik.uni-halle.de/~ladischc/linux_interrupt_priorities.html
They helped me implement this little helper function for printing the
current interrupt priorities at boot time (on one line):

- arch/i386/kernel/io_apic.c
static inline void print_ioapic_priorities(void)
{
        int level, offset, i;

        printk(KERN_INFO "I/O-APIC interrupt priorities:");
        /* Walk the device vector range in 16-vector priority levels and
         * print which IRQ (if any) is assigned to each vector. */
        for (level = FIRST_DEVICE_VECTOR & 0xf0;
             level < FIRST_SYSTEM_VECTOR;
             level += 0x10) {
                for (offset = 0xf; offset >= 0; --offset) {
                        for (i = 0; i < NR_IRQS; ++i)
                                if (irq_vector[i] == level + offset)
                                        printk(" %d", i);
                }
        }
        printk("\n");
}

Output for my machine: 1 0 4 3 6 5 8 7 10 9 12 11 14 13 20 15
I don't believe this code still works correctly (the system has IRQs
above 20??), but according to this output at least the first two
serial ports (IRQs 4 and 3) would already have a very good priority.

Next steps could be:
- removing every unnecessary piece of interrupt handling code to improve performance
- trying realtime approaches, e.g. the -rt patches for the kernel

Can someone help me with the issues described above?
Does anyone have similar problems?
Has anyone fixed problems like these by decreasing IRQ latency?
Does anyone know whether the -rt patches make much of a difference?

By the way, I also tried a USB-to-serial converter (Prolific
chipset), which solved every overrun problem. I looked at the code
and found various 1 KB buffers used in the implementation. Can
someone confirm that such "converters" usually work with much bigger
hardware buffers compared to the 16-byte FIFO stuff? USB itself also
makes a big difference here, I think.

Any other information, advice or experience would be helpful and welcome.

Thanks in advance,
Andreas