Re: [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting skakit@xxxxxxxxxxxxxx (2020-02-14 05:17:01)
> On 2020-02-13 04:19, Stephen Boyd wrote:
> >     driver_probe_device+0x70/0x140
> >     __driver_attach_async_helper+0x7c/0xa8
> >     async_run_entry_fn+0x60/0x178
> >     process_one_work+0x33c/0x640
> >     worker_thread+0x2a0/0x470
> >     kthread+0x128/0x138
> >     ret_from_fork+0x10/0x18
> >    Code: 1aca096a 911e0129 b940012b 7100054a (b800450b)
> I think the most probable explanation of the crash is, set_termios call 
> is starting RX engine and RX engine is sampling some garbage data from 
> line, and by the time startup is called, we have few data to read.
> How frequently you are able to see this crash? because internally we are 
> unable to reproduce it.

How is set_termios involved? Is that starting the RX side before
uart_startup() is called? Sorry I haven't looked into the code flow very
deeply here.

It seems to happen when the bluetooth driver probes so maybe constantly
adding and removing the hci_uart module will cause this to happen for
you? I also run the kernel with many debug options enabled, so maybe try
enabling all the debug stuff? I see it randomly right now so I'm not
sure.

> > 
> > 
> > This seems to be the problematic line. We didn't call handle_rx() from
> > the stop_rx() path before. And this qcom_geni_serial_stop_rx() function
> > is called from qcom_geni_serial_startup(), but most importantly, we 
> > call
> > into this function from startup before we allocate memory for the
> > port->rx_fifo member (see the devm_kcalloc() later in
> > qcom_geni_serial_port_setup() and see how it's after we stop rx).
> > 
> > Why do we need to flush the rx buffer by reading it into the software
> > buffer? Can't we simply ack any outstanding RX interrupts in the
> > hardware when we're stopping receive?
> We can't simply ack RX_LAST interrupt, there is a sticky bit that get 
> set on HW level(not exposed to SW) with RX last interrupt. The only way 
> to clear it is flush out RX FIFO HW buffer. The sticky bit can create 
> problem for future transfers if remained uncleared.
> How about we allocate buffer to port->rx_fifo in probe itself?

Ok. If we have to read the rx fifo to flush the buffer then perhaps
write another function that qcom_geni_serial_stop_rx() can use to
indicate that it wants to throw away whatever is in the rx fifo? Or
adjust handle_rx() to check for a NULL fifo pointer and throw it away in
that case? When we're setting up this device I don't understand why we
would want to read anything out of the rx fifo that was there before the
driver started.




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux