Re: Kernel 6.5 ttyS1 hang with qemu (was Re: [OE-core] Summary of the remaining 6.5 kernel serial issue (and 6.5 summary)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 2023-10-15 at 17:31 +0200, Greg Kroah-Hartman wrote:
> On Sun, Oct 15, 2023 at 01:29:48PM +0100, Richard Purdie wrote:
> > On Sat, 2023-10-14 at 12:13 +0100, Richard Purdie via
> > lists.openembedded.org wrote:
> > > On Sat, 2023-10-14 at 10:41 +0100, Richard Purdie wrote:
> > > > Brief summary:
> > > > 
> > > > We're seeing an issue on x86_64 with 6.5.X where data appears to be
> > > > left in the transmission buffer and not sent to the port on the second
> > > > serial port (ttyS1) until we trigger it with intervention.
> > > > 
> > > > Paul Gortmaker did some painful bisection over a few days down to:
> > > > 
> > > > serial: core: Start managing serial controllers to enable runtime PM
> > > > https://lore.kernel.org/linux-serial/1431f5b4-fb39-483b-9314-ed2b7c118c11@xxxxxxxxx/T/#t
> > > 
> > > Having poked around a bit and knowing nothing about any of this, should
> > > this bit of new code added in the above commit to __uart_start() in
> > > serial_core.c:
> > > 
> > > 	/*
> > > 	 * Start TX if enabled, and kick runtime PM. If the device is not
> > > 	 * enabled, serial_port_runtime_resume() calls start_tx()
> > > again
> > > 	 * after enabling the device.
> > > 	 */
> > > 	if (pm_runtime_active(&port_dev->dev))
> > > 		port->ops->start_tx(port);
> > > 
> > > 
> > > actually be something like:
> > > 
> > > 
> > > 	if (pm_runtime_active(&port_dev->dev) || !pm_runtime_enabled(&port_dev->dev))
> > > 		port->ops->start_tx(port);
> > > 
> > > 
> > > since there are uarts that don't enable runtime PM?
> > > 
> > > I notice that 16550A I'm using doesn't set UART_CAP_RPM and since we
> > > have data left in the xmit buffer (I managed to confirm that), it is as
> > > if during init, there is a race between the serial probing and the
> > > getty putting data in the buffer? If it weren't statrted, that would
> > > explain things...
> > 
> > The above change didn't work but what does appear to be making a
> > difference is making this code call start_tx unconditionally which is
> > what it did prior to the patch. That does cause a "wake" when there
> > might not be any data but the code handles that gracefully.
> > 
> > I therefore suspect this is the place the issue is, the question is
> > what the right conditions for calling start_tx are?
> > 
> > I'll keep going with testing of that as the intermittent nature does
> > make this hard to know if any change helps or not.
> 
> Can you try the patch below?  I just sent it to Linus and it's from Tony
> to resolve some other pm issues with the serial port code.

Thanks for the pointer to this. I've put it through some testing and
had one failure so far so I suspect this isn't enough unfortunately.

FWIW I was looping the testing on the complete removal of the
conditions and didn't see any failures with that.

Cheers,

Richard

> 
> thanks,
> 
> greg k-h
> 
> From 81a61051e0ce5fd7e09225c0d5985da08c7954a7 Mon Sep 17 00:00:00 2001
> From: Tony Lindgren <tony@xxxxxxxxxxx>
> Date: Thu, 5 Oct 2023 10:56:42 +0300
> Subject: [PATCH 3/4] serial: core: Fix checks for tx runtime PM state
> 
> Maximilian reported that surface_serial_hub serdev tx does not work during
> system suspend. During system suspend, runtime PM gets disabled in
> __device_suspend_late(), and tx is unable to wake-up the serial core port
> device that we use to check if tx is safe to start. Johan summarized the
> regression noting that serdev tx no longer always works as earlier when the
> serdev device is runtime PM active.
> 
> The serdev device and the serial core controller devices are siblings of
> the serial port hardware device. The runtime PM usage count from serdev
> device does not propagate to the serial core device siblings, it only
> propagates to the parent.
> 
> In addition to the tx issue for suspend, testing for the serial core port
> device can cause an unnecessary delay in enabling tx while waiting for the
> serial core port device to wake-up. The serial core port device wake-up is
> only needed to flush pending tx when the serial port hardware device was
> in runtime PM suspended state.
> 
> To fix the regression, we need to check the runtime PM state of the parent
> serial port hardware device for tx instead of the serial core port device.
> 
> As the serial port device drivers may or may not implement runtime PM, we
> need to also add a check for pm_runtime_enabled().
> 
> Reported-by: Maximilian Luz <luzmaximilian@xxxxxxxxx>
> Cc: stable <stable@xxxxxxxxxx>
> Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM")
> Signed-off-by: Tony Lindgren <tony@xxxxxxxxxxx>
> Tested-by: Maximilian Luz <luzmaximilian@xxxxxxxxx>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> Link: https://lore.kernel.org/r/20231005075644.25936-1-tony@xxxxxxxxxxx
> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> ---
>  drivers/tty/serial/serial_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index ca26a8aef2cb..d5ba6e90bd95 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -156,7 +156,7 @@ static void __uart_start(struct uart_state *state)
>  	 * enabled, serial_port_runtime_resume() calls start_tx() again
>  	 * after enabling the device.
>  	 */
> -	if (pm_runtime_active(&port_dev->dev))
> +	if (!pm_runtime_enabled(port->dev) || pm_runtime_active(port->dev))
>  		port->ops->start_tx(port);
>  	pm_runtime_mark_last_busy(&port_dev->dev);
>  	pm_runtime_put_autosuspend(&port_dev->dev);





[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux