Re: [PATCH 1/3] serial: qcom-geni: fix hard lockup on buffer flush

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 25, 2024 at 04:40:36PM +0200, Johan Hovold wrote:
> On Mon, Jun 24, 2024 at 10:39:07AM -0700, Doug Anderson wrote:
> > On Mon, Jun 24, 2024 at 6:31 AM Johan Hovold <johan+linaro@xxxxxxxxxx> wrote:
> > >
> > > The Qualcomm GENI serial driver does not handle buffer flushing and used
> > > to print garbage characters when the circular buffer was cleared. Since
> > > commit 1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo") this
> > > instead results in a lockup due to qcom_geni_serial_send_chunk_fifo()
> > > spinning indefinitely in the interrupt handler.
> > >
> > > This is easily triggered by interrupting a command such as dmesg in a
> > > serial console but can also happen when stopping a serial getty on
> > > reboot.
> > >
> > > Fix the immediate issue by printing NUL characters until the current TX
> > > command has been completed.

> > I don't love this, though it's better than a hard lockup. I will note
> > that it doesn't exactly restore the old behavior which would have
> > (most likely) continued to output data that had previously been in the
> > FIFO but that had been cancelled.
> 
> Ah, yes, you're right. I went back and compared with 6.9 and the effect
> was indeed (often) that the machine felt sluggish when you hit ctrl-c to
> interrupt something like dmesg and the driver would continue to print up
> to 4k characters after that (e.g. 350 ms at 115200).
> 
> The idea here was to fix the lockup regression separately and then have
> the third patch address the buffer flush failure, which could also be
> backported without depending on the kfifo conversion.
> 
> But running with this series since yesterday, I realise there are still
> some unresolved interaction with the console code, which can now trigger
> a soft (instead of hard) lockup on reboot...

I've reworked my series to avoid the remaining lockup, which was due to
v1 not handling some cases where cancelling a command left stale data in
the fifo.

I've also reordered the patches to avoid printing NUL characters as an
intermediate fix.

Johan




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux