Re: Serial console and interrupts latency.

Russell King - ARM Linux admin <linux@xxxxxxxxxxxxxxx>:

>
> On Fri, Mar 27, 2020 at 04:58:33PM +0300, Sergey Organov wrote:
> > Jiri Slaby <jslaby@xxxxxxx> writes:
> >
> > > On 24. 03. 20, 10:04, Sergey Organov wrote:
> > >> Hello,
> > >>
> > >> [Extended CC list to try to get some attention]
> > >>
> > >> I was investigating random serial overruns on my embedded board and
> > >> figured it strongly correlates with serial output (to another serial
> > >> port) from kernel printk() calls, that forced me to dig into the kernel
> > >> sources, and now I'm very confused.
> > >>
> > >> I'm reading drivers/tty/serial/8250/8250_port.c, and
> > >> serial8250_console_write() function in particular (being on tty-next
> > >> branch).
> > >>
> > >> What I see is that it locks interrupts
> > >>
> > >> 3141:              spin_lock_irqsave(&port->lock, flags);
> > >>
> > >> and then calls wait_for_xmitr() both indirectly here:
> > >>
> > >> 3159:      uart_console_write(port, s, count, serial8250_console_putchar);
> > >>
> > >> and then directly as well:
> > >>
> > >> 3165:      wait_for_xmitr(up, BOTH_EMPTY);
> > >>
> > >> before re-enabling interrupts at:
> > >>
> > >> 3179:              spin_unlock_irqrestore(&port->lock, flags);
> > >>
> > >> Now, wait_for_xmitr(), even according to comments, could busy-wait for
> > >> up to 10+1000 milliseconds, and in this case this huge delay will happen
> > >> at interrupts disabled?
> > >>
> > >> Does it mean any serial console output out of printk() could cause 10
> > >> milliseconds or even 1 second interrupts latency? Somehow I can't
> > >> believe it.
> > >>
> > >> What do I miss?
> > >
> > > 1 second _timeout_ is for flow-control-enabled consoles.
> >
> > Yeah, sure. So it does mean interrupts could be disabled for up to 1
> > second, on already up and running system. Too bad.
> >
> > Actually, I use 8250 only as a reference implementation, my actual
> > chip is handled by imx.c, and the latter even has no timeouts on this
> > path, so apparently may block (the entire kernel) indefinitely!
> >
> > > 10 ms is _timeout_ for a character. With slow 9600 baud console, sending
> > > one character takes 0.8 ms. With 115200, it is 70 us.
> >
> > 70us of disabled interrupts is a huge number, and for FIFO-enabled chips
> > the estimate should be multiplied by FIFO size (say, x16) that brings us
> > close to 1ms even on 115200, right?
> >
> > Anyway, it must cause receiving overruns on another port running at
> > higher or the same baud rate and no DMA, sooner or later, as it does
> > for me.
>
> So, don't use serial console then, it's unsuitable for your use case.

Well, I'd rather fix it, as serial console is otherwise very suitable
for my needs.
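
For reference, a rough sketch of the arithmetic behind the figures quoted
in this thread. The function name and the bits_per_char parameter are my
own, for illustration only. Note that the quoted 70 us and 0.8 ms
correspond to counting 8 bits per character; an 8N1 frame on the wire is
10 bits including start and stop, which makes every figure 25% larger.

```c
#include <stdint.h>

/*
 * Time on the wire for n_chars characters, in microseconds (rounded down).
 * bits_per_char is an assumption of this sketch: 8 reproduces the figures
 * quoted in the thread, 10 accounts for start and stop bits of 8N1.
 */
static uint64_t uart_tx_time_us(uint32_t baud, uint32_t bits_per_char,
                                uint32_t n_chars)
{
    return ((uint64_t)n_chars * bits_per_char * 1000000u) / baud;
}
```

With 8 bits per character this gives 69 us for one character and about
1.1 ms for a 16-byte FIFO at 115200; with 10 bits it is 86 us and about
1.4 ms.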

If nobody else is interested, I'll simply disable the lock you've added
for non-SMP builds in my kernel version, rather than trying to fix the
issue in general.

That said, finding a generic solution would be an interesting quest.

>
> > > If you send one line (80 chars), it is really 66 and 5.5 ms, respectively.
> > >
> > > So yes, serial consoles can slow down the boot and add latency. Use
> > > faster speeds or faster devices for consoles, if you mind. And do not
> > > enable flow control. Serial is serial.
> >
> > I don't care about slowing-down boot. I care about huge interrupt
> > latency on up and running system, causing loss of characters  (overruns)
> > on other serial ports.
> >
> > To be sure, it is this code that works on already running system as
> > well, not only on boot-time, right? Or is my system somehow
> > misconfigured?
> >
> > I'm confused as this seems to be a major issue and nobody but me seems
> > to care or to suffer from it, and I can't figure why.
> >
> > For reference, I figured this locking was introduced by:
> >
> > commit d8a5a8d7cc32e4474326e0ecc1b959063490efc9
> > Author: Russell King <rmk@xxxxxxxxxxxxxxxxxxxxxxx>
> > Date:   Tue May 2 16:04:29 2006 +0100
> >
> >     [SERIAL] 8250: add locking to console write function
> >
> >     x86 SMP breaks as a result of the previous change, we have no real
> >     option other than to add locking to the 8250 console write function.
> >     If an oops is in progress, try to acquire the lock.  If we fail to
> >     do so, continue anyway.
>
> Correct, and what I said back then still applies - and more.

What bothers me is the "we have no real option..." part of this, as that
rarely turns out to be the case.

>
> > It seems like I need to, and yeah, it'd be a somewhat tough task indeed,
> > but then there is one simple question: why isn't console output handled
> > through usual buffer/ISR paths?
>
> The "usual" paths may not be active, and, in the case of an oops, we
> want to see the output, which we wouldn't be able to if the oops
> occurred in interrupt context.

The oops part is already special-cased and could be left special-cased, one
way or another. What is important is not to keep interrupts disabled for long
during normal system operation.
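
To illustrate what "special-cased" means here, a minimal userspace model
of the locking policy described in the commit message quoted above: block
on the lock normally, only try it during an oops, and write either way.
Everything here (port_lock, console_write_model, the atomic_flag standing
in for the spinlock) is a hypothetical stand-in, not the 8250 code, and
disabling interrupts is not modelled at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for port->lock; not the kernel's spinlock_t. */
static atomic_flag port_lock = ATOMIC_FLAG_INIT;
static int writes_done;   /* counts how many times output was emitted */

static bool try_lock(void) { return !atomic_flag_test_and_set(&port_lock); }
static void lock(void)     { while (atomic_flag_test_and_set(&port_lock)) { } }
static void unlock(void)   { atomic_flag_clear(&port_lock); }

/*
 * Returns true if the output was emitted with the lock held, false if
 * the oops path went ahead without it.
 */
static bool console_write_model(bool oops_in_progress)
{
    bool locked = true;

    if (oops_in_progress)
        locked = try_lock();   /* never spin while the system is dying */
    else
        lock();                /* normal operation: wait for the lock */

    writes_done++;             /* stands in for the actual character output */

    if (locked)
        unlock();              /* only release what we actually acquired */
    return locked;
}
```

The point of the model is that only the oops branch is allowed to skip
the lock; the normal branch is the one that holds it for the whole write.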

If it could be achieved at all, it should probably be implemented at the
upper level, to save low-level drivers from these complexities.
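
A minimal sketch of what such an upper-level scheme could look like,
assuming a single producer and a single consumer: printk-like code only
queues bytes, and the driver's TX-ready handler pulls out as many as the
hardware has room for. The names (log_ring, log_ring_put, log_ring_drain)
and the size are hypothetical, and real code would need memory barriers
or locking between contexts, plus the synchronous oops path.

```c
#include <stddef.h>

#define LOG_RING_SIZE 256u   /* hypothetical size; must be a power of two */

struct log_ring {
    char data[LOG_RING_SIZE];
    unsigned head;           /* total bytes ever queued */
    unsigned tail;           /* total bytes ever drained */
};

/* Producer side: queue what fits and return at once, never wait for hardware. */
static size_t log_ring_put(struct log_ring *r, const char *s, size_t n)
{
    size_t i = 0;

    while (i < n && r->head - r->tail < LOG_RING_SIZE)
        r->data[r->head++ % LOG_RING_SIZE] = s[i++];
    return i;                /* bytes accepted; the rest would be dropped */
}

/*
 * Consumer side: called when the transmitter has room for up to `room`
 * bytes (e.g. the FIFO depth). Copies them to `out`, which stands in for
 * the hardware transmit register, and returns the number of bytes copied.
 */
static size_t log_ring_drain(struct log_ring *r, char *out, size_t room)
{
    size_t i = 0;

    while (i < room && r->tail != r->head)
        out[i++] = r->data[r->tail++ % LOG_RING_SIZE];
    return i;
}
```

In this arrangement the time spent per call is bounded by the FIFO depth
rather than by the length of the message or the baud rate.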

Thanks,
-- Sergey


