Petr, On 03/18/2015 11:33 PM, Petr Kulhavy wrote: > Hi Peter, > > Yes, we do not use DT. > Our board design is very close to the da850evm reference board. So if you have > a chance of getting hold of it you could try on that one. I have been trying to reproduce this on my OMAP-L138-EVM (da850evm) but was not able. There is a big difference in your an my setup: the MMC on my EVM is connected to MMC/SD0 interface while in your setup it seams to be connected to MMC/SD1. The DMA requests for MMC/SD0 is eDMA CC0 16/17, while the MMC/SD1 is eDMA CC1 28/29. So they are handled by different channel controllers. I would not rule out that the support for the second CC has issues. The information from you logs just points to this direction also: You are writing to the MMC, so CC1 ch29 is active most of the time. In your log the channel number sometimes 65565 ((1 << 16) | 29) sometimes it is 29. In my case it is 17 every time. I think there is something funny regarding to how these channels on CC1 are working and this might be causing the leak for you. Unfortunately I do not have HW where I could use any channel on CC1 so I can not debug this further, but I'll look at the code to see if anything obvious stands out. And something does stand out: arch/arm/common/edma.c: dma_irq_handler() It calls the callback with a wrong interrupt number since it does not take into account the different CC. Can you try something similar to see if it helps: diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c index 6c49887d326e..e1d413c61e9e 100644 --- a/arch/arm/common/edma.c +++ b/arch/arm/common/edma.c @@ -405,7 +405,8 @@ static irqreturn_t dma_irq_handler(int irq, void *data) BIT(slot)); if (edma_cc[ctlr]->intr_data[channel].callback) edma_cc[ctlr]->intr_data[channel].callback( - channel, EDMA_DMA_COMPLETE, + EDMA_CTLR_CHAN(ctlr, channel), + EDMA_DMA_COMPLETE, edma_cc[ctlr]->intr_data[channel].data); } } while (sh_ipr); -- Péter > The "wrong" channel number justmeans that it's on the second controller > (pdev->id is 1), the debug is just printing the compound number > channel+controller. > > I believe that the problem is in edma_terminate_all(), the active edesc is not > freed by the vchan_dma_desc_free_list() according to the below debug trace. > The first alloc/free pair is correct, the other alloc has no free - see the > "Terminate all" line. I also enabled debug in common/edma.c and virt-dma.c. > Note that the "Terminate all" message is printed at the end of > edma_terminate_all() function just before return. Printing it earlier was > disturbing my test. > > So the critical sequence is: > - submit transaction > - issue pending > - IRQ handler (int 3 ???) - doesn't seem to be relevant > - terminate all transfers -> dma stop - this doesn't free > - IRQ handler (int 29) -> callback - probably harmless > > The issue is that edma_execute() calls vchan_next_desc() which removes the > request from vc->dest_issued list > and while the transaction is being processed it's not on any list and > terminate_all_transfers calling vchan_get_all_descriptors doesn't find it and > therefore doesn't free it. > > Is that a plausible explanation? > > Cheers > Petr > > > > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > ALLOC edesc: 0xc34d5b80 (channel: 65565, sg_len: 9) > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > vchan c781bc70: txd c34d5b80[1c91]: submitted > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > first transfer starting on channel 65565 > Mar 18 20:38:09 barix user.debug kernel: EDMA: ER0 00000000 > Mar 18 20:38:09 barix user.debug kernel: EDMA: EER0 20000000 > Mar 18 20:38:09 barix user.debug kernel: edma edma: dma_irq_handler > Mar 18 20:38:09 barix user.debug kernel: edma edma: IPR0 00000008 > Mar 18 20:38:09 barix user.debug kernel: edma edma: dma_irq_handler > Mar 18 20:38:09 barix user.debug kernel: edma edma: IPR0 20000000 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Edma callback: edesc 0xc34d5b80 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Pausing channel 29 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Transfer complete, stopping channel 29 > Mar 18 20:38:09 barix user.debug kernel: EDMA: EER0 00000000 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Setting edesc 0xc34d5b80 to NULL > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > txd c34d5b80: freeing > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > FREE edesc: 0xc34d5b80 (channel: 65565) > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > ALLOC edesc: 0xc2f53900 (channel: 65565, sg_len: 15) > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > vchan c781bc70: txd c2f53900[1c92]: submitted > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > first transfer starting on channel 65565 > Mar 18 20:38:09 barix user.debug kernel: EDMA: ER0 00000000 > Mar 18 20:38:09 barix user.debug kernel: EDMA: EER0 20000000 > Mar 18 20:38:09 barix user.debug kernel: edma edma: dma_irq_handler > Mar 18 20:38:09 barix user.debug kernel: edma edma: IPR0 00000008 > Mar 18 20:38:09 barix user.debug kernel: EDMA: EER0 00000000 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Terminate all, setting edesc 0xc2f53900 to NULL > Mar 18 20:38:09 barix user.debug kernel: edma edma: dma_irq_handler > Mar 18 20:38:09 barix user.debug kernel: edma edma: IPR0 20000000 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Edma callback: edesc 0x (null) > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > Pausing channel 29 > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > ALLOC edesc: 0xc2f52140 (channel: 65565, sg_len: 16) > Mar 18 20:38:09 barix user.debug kernel: edma-dma-engine edma-dma-engine.1: > vchan c781bc70: txd c2f52140[1c93]: submitted > > > > On 18.03.2015 14:56, Peter Ujfalusi wrote: >> Petr, >> >> On 03/17/2015 09:02 PM, Petr Kulhavy wrote: >>> Hi Peter, >>> >>> thanks a lot for the details. >>> I believe it's not an Ethernet issue, it's really related to the SD card. If >>> we use the USB storage instead of the SD card on our device we don't see the >>> leaks. >> I believe you are not booting with DT. The two eDMA is only supported when >> booting in legacy mode. >> >>> I enabled the dynamic debug and added a debug message for the kzalloc() in >>> edma_prep_slave_sg() and for the kfree() in the edma_desc_free() both to print >>> the pointer address. And it gives an interesting result, see below. >>> >>> You can see that after every alloc (i.e.edma_prep_slave_sg()) edma_execute() >>> is called ("file transfer starting..."), however not all of them end with >>> "Transfer complete". And exactly those are also not freed. >> I did the same on am335x-evmsk and I don't see the same issue (linux-next). It >> does not mean that we do not have the issue you describe, but somehow it is >> not that easy to reproduce. I will try my OMAP-L138 board, which is closer to >> AM1808 than AM335x. >> >>> Unfortunately I do not know how exactly the edma mechanism works with all the >>> callbacks, states, etc. >>> But does it make any sense for you? Can you help me to debug more? >> Not sure at the moment, but I would try to print also in the error cases in >> the callback and track the edesc pointer in every prints to see where the code >> goes. I would enable the prints in the edma_execute as well to see what is >> going on there. >> >> For reference I have these prints: >> [ 924.127638] ALLOC edesc: 0xcd948c40 (channel: 25, sg_len: 1, rounds: 1) >> [ 924.134598] first transfer starting on channel 25 (edesc: 0xcd948c40) >> [ 924.145505] Transfer complete, stopping channel: 25 (edesc: 0xcd948c40) >> [ 924.152561] FREE edesc: 0xcd948c40 (channel: 25) >> [ 924.159223] ALLOC edesc: 0xc916f800 (channel: 25, sg_len: 30, rounds: 2) >> [ 924.166611] first transfer starting on channel 25 (edesc: 0xc916f800) >> [ 924.180541] Intermediate transfer complete on channel: 25 (edesc: >> 0xc916f800) >> [ 924.188054] chan: 25: completed 30 elements, resuming (edesc: 0xc916f800) >> [ 924.195208] Error occurred on channel: 25 (edesc: 0xc916f800), TRIGGERING >> [ 924.204117] Transfer complete, stopping channel: 25 (edesc: 0xc916f800) >> [ 924.211158] FREE edesc: 0xc916f800 (channel: 25) >> [ 924.218407] ALLOC edesc: 0xc6cbce00 (channel: 25, sg_len: 8, rounds: 1) >> [ 924.225396] first transfer starting on channel 25 (edesc: 0xc6cbce00) >> [ 924.236530] Transfer complete, stopping channel: 25 (edesc: 0xc6cbce00) >> [ 924.243506] FREE edesc: 0xc6cbce00 (channel: 25) >> [ 924.248817] ALLOC edesc: 0xcaa5ec00 (channel: 25, sg_len: 12, rounds: 1) >> [ 924.256068] first transfer starting on channel 25 (edesc: 0xcaa5ec00) >> [ 924.268277] Transfer complete, stopping channel: 25 (edesc: 0xcaa5ec00) >> [ 924.275265] FREE edesc: 0xcaa5ec00 (channel: 25) >> >> >> >>> Thanks >>> Petr >>> >>> ALLOC edesc c65d5c80 >>> first transfer starting on channel 65565 >> The channel number does not seams right here. >> >>> ALLOC edesc c5b69640 >> Would be nice to see the channel as well here. But sure the FREE for c65d5c80 >> is missing. Strange, I don't see how this happens. >> I have enabled PREEMPT and still can not reproduce - this was my first idea. >> >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5b69640 >>> ALLOC edesc c58ec580 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c58ec580 >>> ALLOC edesc c5103d80 >>> first transfer starting on channel 65565 >>> ALLOC edesc c61e78c0 >>> first transfer starting on channel 65565 >>> ALLOC edesc c65d6f80 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c65d6f80 >>> ALLOC edesc c5b698c0 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5b698c0 >>> ALLOC edesc c52244c0 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c52244c0 >>> ALLOC edesc c52244c0 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c52244c0 >>> ALLOC edesc c52244c0 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c52244c0 >>> ALLOC edesc c52244c0 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c52244c0 >>> ALLOC edesc c58ec580 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5b698c0 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5103480 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5103480 >>> ALLOC edesc c5b69640 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5b69640 >>> ALLOC edesc c61e62c0 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c61e62c0 >>> ALLOC edesc c5227440 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5227440 >>> ALLOC edesc c5b69640 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5b69b40 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5233000 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5233dc0 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5233140 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5233140 >>> ALLOC edesc c5233140 >>> first transfer starting on channel 65565 >>> ALLOC edesc c5233280 >>> first transfer starting on channel 65565 >>> Transfer complete, stopping channel 29 >>> FREE edesc c5233280 >>> > -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html