Hi Peter and Prarit, On 2015/07/08 23:00, Prarit Bhargava wrote: > > > On 07/08/2015 09:51 AM, Peter Hurley wrote: >> Hi Prarit, >> >> On 07/08/2015 08:53 AM, Prarit Bhargava wrote: >>> >>> >>> On 07/08/2015 07:55 AM, Peter Hurley wrote: >>>> Hi Taichi, >>>> >>>> On 06/05/2015 06:03 AM, Taichi Kageyama wrote: >>>>> This patch provides a new parameter as a workaround of the following >>>>> problem. It allows us to skip autoconfig_irq() and to use a well-known irq >>>>> number for a console even if CONFIG_SERIAL_8250_DETECT_IRQ is defined. >>>>> >>>>> There're cases where autoconfig_irq() fails during boot. >>>>> In these cases, the console doesn't work in interrupt mode, >>>>> the mode cannot be changed anymore, and "input overrun" >>>>> (which can make operation mistakes) happens easily. >>>>> This problem happens with high rate every boot once it occurs >>>>> because the boot sequence is always almost same. >>>>> >>>>> autoconfig_irq() assumes that a CPU can handle an interrupt from a serial >>>>> during the waiting time, but there're some cases where the CPU cannot >>>>> handle the interrupt for longer than the time. It completely depends on >>>>> how other functions work on the CPU. Ideally, autoconfig_irq() should be >>>>> fixed >>>> Thank you for your comments. >>>> It completely depends on how long some other driver has interrupts disabled, Agree. >>>> which is a problem that needs fixed _in that driver_. autoconfig_irq() does not >>>> need fixing. Peter, ideally, you're right. However, we cannot assume how long other drivers disable interrupts. That's why I introduced this workaround. In my opinion, a console is important and always should be available even if other drivers have a bad behavior. >>>> >>>> A typical cause of this behavior is printk spew from overly-instrumented >>>> debugging of some driver (trace_printk is better suited to heavy instrumentation). >>>> >>> >>> Peter, I understand what you're saying, however the problem is that this is the >>> serial driver which is one of, if not _the_ main debug output mechanism in the >>> kernel. I tried fixing the one of them, 8250 driver. "[PATCH 1/2] serial: 8250: Fix autoconfig_irq() to avoid race conditions". https://lkml.org/lkml/2015/6/5/218 The problem can happen when the drivers show just info level message during boot. It depends on the timing completely. >>> >>> I'm not sure I agree with doing this patch, but perhaps an easier approach would >>> be to add a debug kernel parameter like "serial.force_irq=4" to force the irq to >>> a previously known good value? That way the issue of the broken driver can be >>> debugged. Prarit, I think it's good if the serial id can be also specified. I thought adding irq option to "console" kernel parameter before, but it was not good idea because the impact was not small and passing irq# was not required in most cases. - PnP serial device can get valid irq# - Most on-board serial for console use legacy fixed irq#, like irq3, irq4. So, I designed this patch like below. - It allows console serial to skip autoconfig_irq. Actually I assumes console serial uses a typical irq# in old_serial_port[], but it's the same case where CONFIG_SERIAL_8250_DETECT_IRQ is not defined. - It allows non-console serial to kick autoconfig_irq. setserial command can try it again later if it fails on boot time. >> >> For debugging purposes, can you use a kernel built with >> CONFIG_SERIAL_8250_DETECT_IRQ=n? I know "CONFIG_SERIAL_8250_DETECT_IRQ=n" is one of solutions because autoconfig_irq() is not used on boot time in this case. However, I know some Linux distributions actually use this config and I assume someone may require it. >> >> What is the platform and device type? >> > > Taichi? I saw the original problem on x86-64 platforms with RHEL6.6. The serial was SOL(serial over LAN), not registered as PnP device. Thanks, Taichi > > P. > >> Regards, >> Peter Hurley >> >> >>> P. >>> >>>> Regards, >>>> Peter Hurley >>>> >>>> >>>>> to control these cases but as long as auto_irq algorithm is used, >>>>> it's hard to know which CPU can handle an interrupt from a serial and >>>>> to assure the interrupt of the CPU is enabled during auto_irq. >>>>> Meanwhile for legacy consoles, they actually don't require auto_irq >>>>> because they basically use well-known irq number. >>>>> For non-console serials, this workaround is not required >>>>> because setserial command can kick autoconfig_irq() again for them. >>>>> >>>>> Signed-off-by: Taichi Kageyama <t-kageyama@xxxxxxxxxxxxx> >>>>> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> >>>>> --- >>>>> drivers/tty/serial/8250/8250_core.c | 17 +++++++++++++++++ >>>>> 1 files changed, 17 insertions(+), 0 deletions(-) >>>>> >>>>> diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c >>>>> index 6bf31f2..60fda28 100644 >>>>> --- a/drivers/tty/serial/8250/8250_core.c >>>>> +++ b/drivers/tty/serial/8250/8250_core.c >>>>> @@ -65,6 +65,8 @@ static int serial_index(struct uart_port *port) >>>>> >>>>> static unsigned int skip_txen_test; /* force skip of txen test at init time */ >>>>> >>>>> +static unsigned int skip_cons_autoirq; /* force skip autoirq for console */ >>>>> + >>>>> /* >>>>> * Debugging. >>>>> */ >>>>> @@ -3336,6 +3338,9 @@ serial8250_register_ports(struct uart_driver *drv, struct device *dev) >>>>> if (skip_txen_test) >>>>> up->port.flags |= UPF_NO_TXEN_TEST; >>>>> >>>>> + if (uart_console(&up->port) && skip_cons_autoirq) >>>>> + up->port.flags &= ~UPF_AUTO_IRQ; >>>>> + >>>>> uart_add_one_port(drv, &up->port); >>>>> } >>>>> } >>>>> @@ -3875,6 +3880,9 @@ int serial8250_register_8250_port(struct uart_8250_port *up) >>>>> if (skip_txen_test) >>>>> uart->port.flags |= UPF_NO_TXEN_TEST; >>>>> >>>>> + if (uart_console(&uart->port) && skip_cons_autoirq) >>>>> + uart->port.flags &= ~UPF_AUTO_IRQ; >>>>> + >>>>> if (up->port.flags & UPF_FIXED_TYPE) >>>>> uart->port.type = up->port.type; >>>>> >>>>> @@ -3936,6 +3944,10 @@ void serial8250_unregister_port(int line) >>>>> uart->port.flags &= ~UPF_BOOT_AUTOCONF; >>>>> if (skip_txen_test) >>>>> uart->port.flags |= UPF_NO_TXEN_TEST; >>>>> + >>>>> + if (uart_console(&uart->port) && skip_cons_autoirq) >>>>> + uart->port.flags &= ~UPF_AUTO_IRQ; >>>>> + >>>>> uart->port.type = PORT_UNKNOWN; >>>>> uart->port.dev = &serial8250_isa_devs->dev; >>>>> uart->capabilities = 0; >>>>> @@ -4044,6 +4056,9 @@ MODULE_PARM_DESC(nr_uarts, "Maximum number of UARTs supported. (1-" __MODULE_STR >>>>> module_param(skip_txen_test, uint, 0644); >>>>> MODULE_PARM_DESC(skip_txen_test, "Skip checking for the TXEN bug at init time"); >>>>> >>>>> +module_param(skip_cons_autoirq, uint, 0644); >>>>> +MODULE_PARM_DESC(skip_cons_autoirq, "Skip auto irq for console during boot"); >>>>> + >>>>> #ifdef CONFIG_SERIAL_8250_RSA >>>>> module_param_array(probe_rsa, ulong, &probe_rsa_count, 0444); >>>>> MODULE_PARM_DESC(probe_rsa, "Probe I/O ports for RSA"); >>>>> @@ -4070,6 +4085,8 @@ static void __used s8250_options(void) >>>>> module_param_cb(share_irqs, ¶m_ops_uint, &share_irqs, 0644); >>>>> module_param_cb(nr_uarts, ¶m_ops_uint, &nr_uarts, 0644); >>>>> module_param_cb(skip_txen_test, ¶m_ops_uint, &skip_txen_test, 0644); >>>>> + module_param_cb(skip_cons_autoirq, ¶m_ops_uint, >>>>> + &skip_cons_autoirq, 0644); >>>>> #ifdef CONFIG_SERIAL_8250_RSA >>>>> __module_param_call(MODULE_PARAM_PREFIX, probe_rsa, >>>>> ¶m_array_ops, .arr = &__param_arr_probe_rsa, >>>>> >>>> >> ��.n��������+%������w��{.n�����{��ǫ����{ay�ʇڙ���f���h������_�(�階�ݢj"��������G����?���&��