On Thu, 21 Dec 2023 10:56:39 -0500
Hugo Villeneuve <hugo@xxxxxxxxxxx> wrote:

> On Wed, 20 Dec 2023 17:40:42 +0200
> Andy Shevchenko <andriy.shevchenko@xxxxxxxxx> wrote:
>
> > On Tue, Dec 19, 2023 at 12:18:46PM -0500, Hugo Villeneuve wrote:
> > > From: Hugo Villeneuve <hvilleneuve@xxxxxxxxxxxx>
> > >
> > > If an error occurs during probing, the sc16is7xx_lines bitfield may be left
> > > in a state that doesn't represent the correct state of lines allocation.
> > >
> > > For example, in a system with two SC16 devices, if an error occurs only
> > > during probing of channel (port) B of the second device, sc16is7xx_lines
> > > final state will be 00001011b instead of the expected 00000011b.
> > >
> > > This is caused in part because of the "i--" in the for/loop located in
> > > the out_ports: error path.
> > >
> > > Fix this by checking the return value of uart_add_one_port() and set line
> > > allocation bit only if this was successful. This allows the refactor of
> > > the obfuscated for(i--...) loop in the error path, and properly call
> > > uart_remove_one_port() only when needed, and properly unset line allocation
> > > bits.
> > >
> > > Also use same mechanism in remove() when calling uart_remove_one_port().
> >
> > Yes, this seems to be the correct one to fix the problem described in
> > the patch 1. I dunno why the patch 1 even exists.
>
> Hi,
> this will indeed fix the problem described in patch 1.
>
> However, if I remove patch 1, and I simulate the same probe error as
> described in patch 1, now we get stuck forever when trying to
> remove the driver. This is something that I observed before and
> that patch 1 also corrected.
>
> The problem is caused in sc16is7xx_remove() when calling this function
>
>     kthread_flush_worker(&s->kworker);
>
> I am not sure how best to handle that without patch 1.

Also, if we manage to get past kthread_flush_worker() and
kthread_stop() (commented out for testing purposes), we get another bug:

# rmmod sc16is7xx
...
crystal-duart-24m already disabled
WARNING: CPU: 2 PID: 340 at drivers/clk/clk.c:1090 clk_core_disable+0x1b0/0x1e0
...
Call trace:
 clk_core_disable+0x1b0/0x1e0
 clk_disable+0x38/0x60
 sc16is7xx_remove+0x1e4/0x240 [sc16is7xx]

This one is caused by calling clk_disable_unprepare(). But
clk_disable_unprepare() has already been called in probe error handling
code. Patch 1 also fixed this...

Hugo Villeneuve
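
For anyone following along, the line allocation problem from the quoted
commit message can be modelled in userspace. The sketch below is not the
driver code: `lines` stands in for sc16is7xx_lines, and MAX_LINES,
MAX_PORTS, find_free_line(), probe_old() and probe_fixed() are hypothetical
names made up for illustration, with `fail_at` simulating a failure while
registering a given port. It only assumes what the commit message says: the
old error path unwinds with "i--", the new one sets the bit only after a
successful registration and clears only bits it actually set.

```c
#include <assert.h>

#define MAX_LINES 8	/* hypothetical limit, not the driver's value */
#define MAX_PORTS 2	/* two channels (A and B) per device */

unsigned long lines;	/* stand-in for the sc16is7xx_lines bitfield */

/* Return the first free line number, or MAX_LINES if none is left. */
static int find_free_line(void)
{
	for (int i = 0; i < MAX_LINES; i++)
		if (!(lines & (1UL << i)))
			return i;
	return MAX_LINES;
}

/* Old scheme: reserve the line first, unwind with "i--" on error. */
static int probe_old(int nr_ports, int fail_at)
{
	int line[MAX_PORTS];
	int i;

	assert(nr_ports <= MAX_PORTS);
	for (i = 0; i < nr_ports; i++) {
		line[i] = find_free_line();
		if (line[i] >= MAX_LINES)
			goto out_ports;
		lines |= 1UL << line[i];	/* bit set before registration */
		if (i == fail_at)		/* simulated registration error */
			goto out_ports;
	}
	return 0;

out_ports:
	/* "i--" skips port i, so its already-set bit is never cleared. */
	for (i--; i >= 0; i--)
		lines &= ~(1UL << line[i]);
	return -1;
}

/* New scheme: set the bit only on success, clear only what was set. */
static int probe_fixed(int nr_ports, int fail_at)
{
	int line[MAX_PORTS];
	int i;

	assert(nr_ports <= MAX_PORTS);
	for (i = 0; i < nr_ports; i++)
		line[i] = -1;			/* nothing registered yet */

	for (i = 0; i < nr_ports; i++) {
		int candidate = find_free_line();

		if (candidate >= MAX_LINES || i == fail_at)
			goto out_ports;
		line[i] = candidate;
		lines |= 1UL << candidate;	/* mark only after success */
	}
	return 0;

out_ports:
	for (i = 0; i < nr_ports; i++)
		if (line[i] >= 0)
			lines &= ~(1UL << line[i]);
	return -1;
}
```

Starting from lines = 0, probing a first device successfully and then a
second device with fail_at = 1 (port B) leaves bit 3 set in this model with
probe_old(), and leaves only the first device's two bits set with
probe_fixed(). This says nothing about the kthread_flush_worker() hang or
the double clk_disable_unprepare() discussed above.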