On 02/10/2015 07:04 AM, Nicolas Schichan wrote: > On 02/10/2015 12:34 AM, Peter Hurley wrote: >> Hi Nicolas, >> >> Thanks for the report. >> > [...] >>> When a caracter is received on the UART while the kernel is printing >>> the boot messages, as soon as the kernel configures the UART for >>> receiving (after root filesystem mount), it gets stuck printing the >>> following message repeatedly: >>> >>> serial8250: too much work for irq29 >>> >>> Once stuck, the reception of another character allows the boot process >>> to finish. >>> >>> From what I can gather, when we hit that, the UART_IIR_NO_INT is 0 (so the >>> interrupt is raised), but the UART_LSR_DR bit is 0 as well so the UART_RX >>> register is never read to clear the interrupt. >> >> The "too much work" message means serial8250_handle_irq() is returning 0, >> ie., not handled. Which in turn means IIR indicates no interrupt is pending >> (UART_IIR_NO_INT == 1). >> >> Can you log the register values for LSR and IIR at both patch locations >> in serial8250_do_startup()? >> >> (I can get you a debug patch, if necessary. Let me know) > > Hi Peter, > > Thanks for your reply. > > Here is what I have when the issue is triggered: > > [ 12.154877] lsr 0x60 / iir 0x01 > [ 12.158071] lsr 0x60 / iir 0x01 > [ 12.161438] serial8250: too much work for irq29 > [ 12.165982] lsr 0x60 / iir 0x0c > [ 12.169354] serial8250: too much work for irq29 > [ 12.173900] lsr 0x60 / iir 0x0c > (previous two messages are repeated and printk_ratelimited()) Thanks for this information; I see I was wrong about the cause of message. I think what happens during startup is that on this silicon clearing the rx fifo (by serial8250_clear_fifos()) clears data ready but not the rx timeout condition which causes a spurious rx interrupt when interrupts are enabled. So caught between two broken UARTs: one that underflows its rx fifo because of unsolicited rx reads and the other that generates spurious interrupt without unsolicited rx reads. > When the issue is not triggered: > > [ 10.784871] lsr 0x60 / iir 0x01 > [ 10.788066] lsr 0x60 / iir 0x01 > [ 10.794734] VFS: Mounted root (nfs filesystem) readonly on device 0:13. > [ 10.801654] devtmpfs: mounted > [ 10.805169] Freeing unused kernel memory: 184K (807be000 - 807ec000) > (userland takes over after that) > > I have also displayed the IIR and LSR registers when the "too much fork for > IRQ" condition is triggered. > > In the serial8250_do_startup(), before the interrupt are unmasked at the end, > the IIR looks sane and UART_IIR_NO_INT bit is set. When stuck > serial8250_interrupt(), UART_IIR_NO_INT is cleared and the interrupt ID is set > to 0xc which is not handled by the kernel at this time (the Kirkwood datasheet > indicates that it is some kind of timeout condition from what I can gather). Yes, IIR == UART_IIR_RX_TIMEOUT is to used indicate that data is in the rx fifo but has not reached the rx trigger level yet. ATM, I'm not exactly sure if there is a safe way to clear the spurious interrupt from the interrupt handler. I'm fairly certain the only way to clear the rx timeout interrupt is to read the rx fifo, but I think this would race with actual data arrival. IOW, there might not be a way to determine if the data read is spurious or not. Regards, Peter Hurley > Here is the corresponding debug patch, for reference (against a 3.19 kernel > this time): > > diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_ > index 11c6685..ed93741 100644 > --- a/drivers/tty/serial/8250/8250_core.c > +++ b/drivers/tty/serial/8250/8250_core.c > @@ -1661,6 +1661,12 @@ static int exar_handle_irq(struct uart_port *port) > return ret; > } > > +static inline void show_lsr_iir(struct uart_port *port) > +{ > + printk("lsr 0x%02x / iir 0x%02x\n", serial_port_in(port, UART_LSR), > + serial_port_in(port, UART_IIR)); > +} > + > /* > * This is the serial driver's interrupt routine. > * > @@ -1705,6 +1711,8 @@ static irqreturn_t serial8250_interrupt(int irq, void *dev > /* If we hit this, we're dead. */ > printk_ratelimited(KERN_ERR > "serial8250: too much work for irq%d\n", irq); > + if (printk_ratelimit()) > + show_lsr_iir(port); > break; > } > } while (l != end); > @@ -2107,6 +2115,7 @@ int serial8250_do_startup(struct uart_port *port) > /* > * Clear the interrupt registers. > */ > + show_lsr_iir(port); > if (serial_port_in(port, UART_LSR) & UART_LSR_DR) > serial_port_in(port, UART_RX); > serial_port_in(port, UART_IIR); > @@ -2269,6 +2278,7 @@ dont_test_tx_en: > * saved flags to avoid getting false values from polling > * routines or the previous session. > */ > + show_lsr_iir(port); > if (serial_port_in(port, UART_LSR) & UART_LSR_DR) > serial_port_in(port, UART_RX); > serial_port_in(port, UART_IIR); > > Regards, > -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html