28.01.2020 17:51, Dmitry Osipenko пишет: > 28.01.2020 17:02, Jon Hunter пишет: >> >> On 16/01/2020 20:10, Dmitry Osipenko wrote: >>> 15.01.2020 12:00, Jon Hunter пишет: >>>> >>>> On 14/01/2020 20:33, Dmitry Osipenko wrote: >>>>> 14.01.2020 18:09, Jon Hunter пишет: >>>>>> >>>>>> On 12/01/2020 17:29, Dmitry Osipenko wrote: >>>>>>> I was doing some experiments with I2C and noticed that Tegra APB DMA >>>>>>> driver crashes sometime after I2C DMA transfer termination. The crash >>>>>>> happens because tegra_dma_terminate_all() bails out immediately if pending >>>>>>> list is empty, thus it doesn't release the half-completed descriptors >>>>>>> which are getting re-used before ISR tasklet kicks-in. >>>>>> >>>>>> Can you elaborate a bit more on how these are getting re-used? What is >>>>>> the sequence of events which results in the panic? I believe that this >>>>>> was also reported in the past [0] and so I don't doubt there is an issue >>>>>> here, but would like to completely understand this. >>>>>> >>>>>> Thanks! >>>>>> Jon >>>>>> >>>>>> [0] https://lore.kernel.org/patchwork/patch/675349/ >>>>>> >>>>> >>>>> In my case it happens in the touchscreen driver during of the >>>>> touchscreen's interrupt handling (in a threaded IRQ handler) + CPU is >>>>> under load and there is other interrupts activity. So what happens here >>>>> is that the TS driver issues one I2C transfer, which fails with >>>>> (apparently bogus) timeout (because DMA descriptor is completed and >>>>> removed from the pending list, but tasklet not executed yet), and then >>>>> TS immediately issues another I2C transfer that re-uses the >>>>> yet-incompleted descriptor. That's my understanding. >>>> >>>> OK, but what is the exact sequence that it allowing it to re-use the >>>> incompleted descriptor? >>> >>> TDMA driver DMA Client >>> >>> 1. >>> dmaengine_prep() >>> >>> 2. >>> tegra_dma_desc_get() >>> dma_desc = kzalloc() >>> ... >>> tegra_dma_prep_slave_sg() >>> INIT_LIST_HEAD(&dma_desc->tx_list); >>> INIT_LIST_HEAD(&dma_desc->cb_node); >>> list_add_tail(sgreq->node, >>> dma_desc->tx_list) >>> >>> 3. >>> dma_async_issue_pending() >>> >>> 4. >>> tegra_dma_tx_submit() >>> list_splice_tail_init(dma_desc->tx_list, >>> tdc->pending_sg_req) >>> >>> 5. >>> tegra_dma_isr() >>> ... >>> handle_once_dma_done() >>> ... >>> sgreq = list_first_entry(tdc->pending_sg_req) >>> list_del(sgreq->node); >>> ... >>> list_add_tail(dma_desc->cb_node, >>> tdc->cb_desc); >>> list_add_tail(dma_desc->node, >>> tdc->free_dma_desc); >> >> Isn't this the problem here, that we have placed this on the free list >> before we are actually done? >> >> It seems to me that there could still be a potential race condition >> between the ISR and the tasklet running. > > Yes, this should be addressed by the patch #3 "dmaengine: tegra-apb: > Prevent race conditions of tasklet vs free list". correction (to avoid confusion): it's actually patch #5, my bad