Re: [PATCH 02/13] dmaengine: edma: Optimize memcpy operation

On Wed, Oct 14, 2015 at 06:02:18PM +0300, Peter Ujfalusi wrote:
> On 10/14/2015 05:41 PM, Vinod Koul wrote:
> > On Wed, Oct 14, 2015 at 04:12:13PM +0300, Peter Ujfalusi wrote:
> >> @@ -1320,41 +1317,92 @@ static struct dma_async_tx_descriptor *edma_prep_dma_memcpy(
> >>  	struct dma_chan *chan, dma_addr_t dest, dma_addr_t src,
> >>  	size_t len, unsigned long tx_flags)
> >>  {
> >> -	int ret;
> >> +	int ret, nslots;
> >>  	struct edma_desc *edesc;
> >>  	struct device *dev = chan->device->dev;
> >>  	struct edma_chan *echan = to_edma_chan(chan);
> >> -	unsigned int width;
> >> +	unsigned int width, pset_len;
> >>  
> >>  	if (unlikely(!echan || !len))
> >>  		return NULL;
> >>  
> >> -	edesc = kzalloc(sizeof(*edesc) + sizeof(edesc->pset[0]), GFP_ATOMIC);
> >> +	if (len < SZ_64K) {
> >> +		/*
> >> +		 * Transfer size less than 64K can be handled with one paRAM
> >> +		 * slot. ACNT = length
> >> +		 */
> >> +		width = len;
> >> +		pset_len = len;
> >> +		nslots = 1;
> >> +	} else {
> >> +		/*
> >> +		 * Transfer size bigger than 64K will be handled with maximum of
> >> +		 * two paRAM slots.
> >> +		 * slot1: ACNT = 32767, length1: (length / 32767)
> >> +		 * slot2: the remaining amount of data.
> >> +		 */
> >> +		width = SZ_32K - 1;
> >> +		pset_len = rounddown(len, width);
> >> +		/* One slot is enough for lengths multiple of (SZ_32K -1) */
> > 
> > Hmm so does this mean if I have 140K transfer, it will do two 64K for 1st
> > slot and 12K in second slot ?
> 
> Not exactly. If the size is less than 64K it can be done with one 'burst' but
> if it is bigger we need to have two sets of transfer:
> 1. 32K blocks
> 2. the remaining data
> 
> so in case of 140K:
> 4 x 32K followed by 12K

Okay, this part wasn't very clear to me. Can you please add a comment
explaining this bit?
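
For reference, here is how I read the split, written as a standalone C sketch.
This is not the driver code: struct memcpy_split and split_memcpy() are
hypothetical names used only for illustration, and the sketch assumes the block
size is SZ_32K - 1 and that a second slot is needed only when there is a
remainder, as the comments in the patch suggest.

```c
#include <stddef.h>

#define SZ_32K 0x8000u
#define SZ_64K 0x10000u

/* Hypothetical helper type, not part of the edma driver. */
struct memcpy_split {
	int nslots;		/* paRAM slots needed: 1 or 2 */
	unsigned int width;	/* ACNT used by the first slot */
	size_t nblocks;		/* number of 'width'-sized blocks in slot 1 */
	size_t pset_len;	/* bytes covered by the first slot */
	size_t remainder;	/* bytes left for the second slot, if any */
};

/* Assumption: mirrors the len < SZ_64K / else branches of the patch. */
static struct memcpy_split split_memcpy(size_t len)
{
	struct memcpy_split s;

	if (len < SZ_64K) {
		/* Whole transfer fits in ACNT: one slot, one burst. */
		s.width = (unsigned int)len;
		s.nblocks = 1;
		s.pset_len = len;
		s.remainder = 0;
		s.nslots = 1;
	} else {
		/* Slot 1 moves whole (32K - 1)-byte blocks. */
		s.width = SZ_32K - 1;
		s.nblocks = len / s.width;
		s.pset_len = s.nblocks * s.width;	/* rounddown(len, width) */
		/* Slot 2 moves whatever is left over. */
		s.remainder = len - s.pset_len;
		s.nslots = s.remainder ? 2 : 1;
	}
	return s;
}
```

With len = 140 * 1024 (143360 bytes) this gives four blocks of 32767 bytes
(131068 bytes) in the first slot and 12292 bytes in the second, which matches
the "4 x 32K followed by 12K" description above.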

> 
> > 
> > Is there a limit on 'blocks' of 64K we can do here?
> 
> 32767 32K blocks is the limit.
> 
> The 64K burst is only possible if the whole transfer is less than 64K.
> With the ACNT counter we can transfer 64K - 1 bytes, but if this is not enough
> we need to use the BCNT counter and for that to work the distance between
> the start of 'slot n' and the start of 'slot n+1' needs to be less than 32K,
> this is the reason why we have 32K 'blocks' to transfer first followed by the
> remaining.

Okay, IIUC we have the option of a single burst using one slot if it's less
than 64K; otherwise we split into 32K chunks with 2 slots. Or would it be N
slots in that case?

Really need more documentation here :)
-- 
~Vinod