Hi Vinod,

On Mon, Mar 16, 2015 at 11:01 PM, Rameshwar Sahu <rsahu@xxxxxxx> wrote:
> Hi Vinod,
>
> On Mon, Mar 16, 2015 at 9:56 PM, Vinod Koul <vinod.koul@xxxxxxxxx> wrote:
>> On Mon, Mar 16, 2015 at 05:24:34PM +0530, Rameshwar Sahu wrote:
>>> >> >> +static void xgene_dma_free_desc_list_reverse(struct xgene_dma_chan *chan,
>>> >> >> +				struct list_head *list)
>>> >> > do we really care about free order?
>>> >>
>>> >> Yes, it starts deallocation of descriptors from the tail.
>>> > and why by tail is not clear.
>>> We can free the allocated descriptors in forward order from the head or
>>> in reverse order; I just followed the fsldma.c driver here.
>>> Does this make sense ??
>> No, you have two APIs to free list. Why do you need two?
>
> Yes, basically we have two APIs to free the list.
> xgene_dma_free_desc_list_reverse is called if an allocation from the
> DMA pool fails in the prep routines.
> E.g. in a prep routine we have some descriptors allocated and still
> need more descriptors to complete the DMA request when a failure happens,
> so we need to free all the allocated descriptors.
>
>>
>>>
>>>
>>> >
>>> >> > where are you mapping dma buffers?
>>> >>
>>> >> I didn't get you here. Can you please explain what you mean?
>>> >> As per my understanding the client should map the dma buffer and give
>>> >> the physical address and size to these prep callback routines.
>>> > not for memcpy, that is true for slave transfers
>>> >
>>> > For memcpy the idea is that drivers will do buffer mapping
>>>
>>> I am still not clear here why memcpy would do buffer mapping. I looked
>>> at other drivers and also async_memcpy.c; they map it and pass the
>>> mapped physical dma address to the driver.
>>>
>>> By buffer mapping you mean dma_map_xxx here?? Am I correct?
>> Yes
>
> I am confused here, I don't see any driver doing dma buffer mapping in
> prep_dma_memcpy.
> Can you please clarify whether the driver does this on behalf of the
> client, with an example if possible, so that I can proceed further?

Any comment here ??
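
For the error path discussed above (some descriptors are already allocated in a prep routine when a later allocation from the DMA pool fails), here is a minimal user-space sketch in plain C. All names (demo_desc, demo_pool, demo_alloc, demo_prep, demo_free_chain_reverse) are hypothetical stand-ins and not the xgene-dma code; malloc() plus a capacity counter stands in for the DMA pool, and a simple prev pointer stands in for struct list_head.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a driver descriptor (not the xgene-dma struct). */
struct demo_desc {
	struct demo_desc *prev;	/* previous descriptor in the chain */
	int id;
};

/* Hypothetical stand-in for a DMA pool with a limited number of entries. */
struct demo_pool {
	int capacity;	/* how many descriptors may be live at once */
	int live;	/* descriptors currently allocated */
};

/* Returns NULL when the pool is exhausted, like a failed pool allocation. */
static struct demo_desc *demo_alloc(struct demo_pool *pool, int id)
{
	struct demo_desc *d;

	if (pool->live >= pool->capacity)
		return NULL;
	d = malloc(sizeof(*d));
	if (!d)
		return NULL;
	d->prev = NULL;
	d->id = id;
	pool->live++;
	return d;
}

/* Free the chain starting at the tail and walking back towards the head. */
static void demo_free_chain_reverse(struct demo_pool *pool,
				    struct demo_desc *tail)
{
	while (tail) {
		struct demo_desc *prev = tail->prev;

		free(tail);
		pool->live--;
		tail = prev;
	}
}

/*
 * Build a chain of 'count' descriptors. If an allocation fails part way,
 * everything allocated so far is released. Returns the tail, or NULL.
 */
static struct demo_desc *demo_prep(struct demo_pool *pool, int count)
{
	struct demo_desc *tail = NULL;

	for (int i = 0; i < count; i++) {
		struct demo_desc *d = demo_alloc(pool, i);

		if (!d) {
			demo_free_chain_reverse(pool, tail);
			return NULL;
		}
		d->prev = tail;
		tail = d;
	}
	return tail;
}
```

In this sketch, a request for 5 descriptors against a pool of 3 returns NULL and leaves pool.live at 0. Nothing in the sketch depends on the free order; the tail is simply the pointer the prep loop already has in hand when the failure happens.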
>>
>>>
>>> >
>>> >> > why are you calling this here, status check shouldn't do this...
>>> >>
>>> >> Okay, I will remove it.
>>> >>
>>> >>
>>> >> >> +	spin_unlock_bh(&chan->lock);
>>> >> >> +	return DMA_IN_PROGRESS;
>>> >> > residue here is size of transaction.
>>> >>
>>> >> We can't calculate the residue size here. We don't have any controller
>>> >> register which tells us the remaining transaction size.
>>> > Okay if you can't calculate residue why do we have this fn?
>>>
>>> So basically the case here for me is that the completion order of dma
>>> descriptors submitted to hw is not the same as the order of submission.
>>> The scenario comes up with multiple threads running: e.g. let's assume
>>> we have submitted two descriptors, the first has cookie 1001 and the
>>> second has 1002. Now 1002 completes first, so last_completed_cookie is
>>> updated to 1002, but dma_tx_status has not yet been checked. Then the
>>> first cookie completes and updates last_completed_cookie to 1001. Now
>>> the second transaction checks tx_status and gets DMA_IN_PROGRESS,
>>> because last_completed_cookie (1001) is less than the second
>>> transaction's cookie (1002).
>>>
>>> Due to this issue I am looking for that transaction in the pending list
>>> and running list; if it is not there, it means we are done.
>>>
>>> Does this make sense??
>> That only convinces me that there is something not so correct.
>>
>> To help me understand pls let me know if below is fine:
>> - for a physical channel, do you submit multiple transactions?
>
> Yes
>
>> - if yes, how does DMA deal with multiple transactions, how does it schedule
>> them?
>
> So, basically we submit multiple descriptors to a dma physical channel,
> and the dma engine executes them one by one and gives us a completion
> callback. So in this way we expect callbacks in the same order as
> submission, and that is what happens, no issue.
>
> But the problem is with supporting p+q offload: here P functionality is
> supported in dma physical channel 0 and Q functionality is supported in
> dma physical channel 1.
> So for PQ we need to submit two descriptors, one to channel 0 and the
> second to channel 1. In this case we can't rely on the completion order,
> because channel 0 can finish P before Q or vice versa, and we need to
> wait for both to complete before calling the client callback() and
> completing the cookie.
> Second thing: we submit memcpy and sg on the same channel, and they can
> complete earlier even if they were submitted after PQ.

Our SoC DMA engine hw design idea was to get more throughput by running
the two channels concurrently and calculating P and Q together. But today
we hit a scenario where running P and Q on different channels causes the
DMA engine to hang (some hw bug). So now I am going to support P and Q
generation in the same channel, so the cookie status scenario mentioned
above will never occur.

I will send you the patch for review.

Thanks,
>
>>
>> --
>> ~Vinod
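
The cookie scenario quoted above (1002 completing before 1001) can be reproduced in a few lines of user-space C. This is only a sketch under stated assumptions: demo_chan, demo_submit, demo_complete and demo_tx_status are hypothetical names, the status check is a simplified comparison that ignores cookie wraparound, and none of it is the dmaengine implementation.

```c
/* Hypothetical status values, mirroring the two outcomes discussed above. */
enum demo_status { DEMO_COMPLETE, DEMO_IN_PROGRESS };

/* Hypothetical per-channel bookkeeping. */
struct demo_chan {
	int last_used;		/* most recently issued cookie */
	int last_completed;	/* cookie recorded by the latest completion */
};

/* Issue the next cookie for a newly submitted descriptor. */
static int demo_submit(struct demo_chan *c)
{
	return ++c->last_used;
}

/*
 * Naive completion: overwrite the marker with whichever cookie just
 * finished, regardless of whether a later cookie already completed.
 */
static void demo_complete(struct demo_chan *c, int cookie)
{
	c->last_completed = cookie;
}

/*
 * Simplified status check (assumption: no wraparound handling):
 * a cookie counts as done once the marker has reached it.
 */
static enum demo_status demo_tx_status(const struct demo_chan *c, int cookie)
{
	return cookie <= c->last_completed ? DEMO_COMPLETE : DEMO_IN_PROGRESS;
}
```

With a channel starting at cookie 1000, two submissions return 1001 and 1002. Completing 1002 and then 1001 leaves the marker at 1001, so demo_tx_status() for 1002 reports DEMO_IN_PROGRESS although that descriptor has finished. Completing in submission order does not show this, which is why keeping P and Q on the same channel avoids the case.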