Hello All, On Fri, Jun 28, 2019 at 01:51:25PM +0200, Geert Uytterhoeven wrote: > Hi Eugeniu, > > On Wed, Jun 26, 2019 at 7:34 PM Eugeniu Rosca <roscaeugeniu@xxxxxxxxx> wrote: > > On Mon, Jun 24, 2019 at 02:35:38PM +0200, Geert Uytterhoeven wrote: > > > This patch series attempts to fix the issues Eugeniu Rosca reported > > > seeing, where .flush_buffer() interfered with transmit DMA operation[*]. > > > > > > There's a third patch "dmaengine: rcar-dmac: Reject zero-length slave > > > DMA requests", which is related to the issue, but further independent, > > > hence submitted separately. > > > > > > Eugeniu: does this fix the issues you were seeing? > > > > Many thanks for both sh-sci and the rcar-dmac patches. > > The fixes are very much appreciated. > > > > > Geert Uytterhoeven (2): > > > serial: sh-sci: Fix TX DMA buffer flushing and workqueue races > > > serial: sh-sci: Terminate TX DMA during buffer flushing > > > > > > drivers/tty/serial/sh-sci.c | 33 ++++++++++++++++++++++++--------- > > > 1 file changed, 24 insertions(+), 9 deletions(-) > > > > I reserved some time to get a feeling about how the patches behave on > > a real system (H3-ES2.0-ULCB-KF-M06), so here come my observations. > > Thanks for your extensive testing! > > > First of all, the issue I have originally reported in [0] is only > > reproducible in absence of [4]. So, one of my questions would be how > > do you yourself see the relationship between [1-3] and [4]? > > I consider them independent. > Just applying [4] would fix the issue for the console only, while the > race condition can still be triggered on other serial ports. > > > That said, all my testing assumes: > > - Vanilla tip v5.2-rc6-15-g249155c20f9b with [4] reverted. > > - DEBUG is undefined in {sh-sci.c,rcar-dmac.c}, since I've noticed > > new issues arising in the debug build, which are unrelated to [0]. > > > > Below is the summary of my findings: > > > > Version IS [0] Is console Error message when > > (vanilla+X) reproduced? usable after [0] [0] is reproduced > > is reproduced? > > ------------------------------------------------------------ > > -[4] Yes No [5] > > -[4]+[1] Yes No - > > -[4]+[2] Yes Yes [5] > > -[4]+[3] Yes Yes [6] > > -[4]+[1]+[2] No - - > > -[4]+[1]+[2]+[3] No - - > > pure vanilla No - - > > > > This looks a little too verbose, but I thought it might be interesting. > > Thanks, it's very helpful to provide these results. > > > The story which I see is that [1] does not fix [0] alone, but it seems > > to depend on [2]. Furthermore, if cherry picked alone, [1] makes the > > matters somewhat worse in the sense that it hides the error [5]. > > OK. > > > My only question is whether [1-3] are supposed to replace [4] or they > > are supposed to happily coexist. Since I don't see [0] being reproduced > > They are meant to coexist. > > > with [1-3], I personally prefer to re-enable DMA on SCIF (when the > > latter is used as console) so that more features and code paths are > > exercised to increase test coverage. > > If a serial port is used as a console, the port is used for both DMA > (normal use) and PIO (serial console output). The latter can have a > negative impact on the former, aggravating existing bugs, or triggering > more races, even in the hardware. So I think it's better to be more > cautious and keep DMA disabled for the console. Agreed. Just a note for the record that [4] was the easiest way to resolve the reported problem [0] but an alternative solution would be to implement DMA support for ttySC console ports which will be non-trivial to implement and test due to the potential for deadlocks in console write critical paths where various locks are held with interrupts disabled. I see only one tty serial driver which implements console DMA support, drivers/tty/serial/mpsc.c, and perhaps there is a good reason why there are no other examples? > > [0] https://lore.kernel.org/lkml/20190504004258.23574-3-erosca@xxxxxxxxxxxxxx/ > > [1] https://patchwork.kernel.org/patch/11012983/ > > ("serial: sh-sci: Fix TX DMA buffer flushing and workqueue races") > > [2] https://patchwork.kernel.org/patch/11012987/ > > ("serial: sh-sci: Terminate TX DMA during buffer flushing") > > [3] https://patchwork.kernel.org/patch/11012991/ > > ("dmaengine: rcar-dmac: Reject zero-length slave DMA requests") > > [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=099506cbbc79c0 > > ("serial: sh-sci: disable DMA for uart_console") > > > > [5] rcar-dmac e7300000.dma-controller: Channel Address Error > > [6] rcar-dmac e7300000.dma-controller: rcar_dmac_prep_slave_sg: bad parameter: len=1, id=19 > > sh-sci e6e88000.serial: Failed preparing Tx DMA descriptor > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like that. > -- Linus Torvalds -- Regards, George