Re: [PATCH 3/3] spi: bcm2835: add module parameter to configure minimum length for dma

On Sun, Mar 24, 2019 at 09:58:27AM +0100, kernel@xxxxxxxxxxxxxxxx wrote:
> On 22.03.2019, at 13:36, Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > On Sun, Feb 24, 2019 at 04:23:11PM +0000, kernel@xxxxxxxxxxxxxxxx wrote:
> > > +/* define dma min number of bytes to use in dma mode with value validation */
> > > +static int dma_min_bytes_limit_set(const char *val,
> > > +				   const struct kernel_param *kp)
> > > +{
> > > +	unsigned int v;
> > > +
> > > +	if (kstrtouint(val, 10, &v))
> > > +		return -EINVAL;
> > > +	/* value needs to be a multiple of 4 */
> > > +	if (v % 4) {
> > > +		pr_err("dma_min_bytes_limit needs to be a multiple of 4\n");
> > > +		return -EINVAL;
> > > +	}
> > 
> > Transfers don't need to be a multiple of 4 to be eligible for DMA,
> > so this check can be dropped.
> 
> I definitely did not want to write a custom module argument parser
> but if i remember correctly there is one limitation on the transmission path
> where you would hit some inefficiencies in the DMA code when you run
> transfers that are not a multiple of 4 - especially for short transfers.

No, the *length* of a transfer in DMA mode doesn't need to be a multiple
of 4.  You just write the length to the DLEN register and the chip counts
that down to zero while clocking out bytes.  Once it reaches zero, it
stops clocking out bytes.  Because the FIFO is accessed with 32-bit width
in DMA mode, you'll leave a few extra bytes behind in the TX FIFO if DLEN
is not a multiple of 4, so you have to clear the TX FIFO before the next
transfer is commenced.  But the driver does all that.
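
To make the leftover-bytes arithmetic concrete, here is a minimal sketch.
It assumes the DMA engine pushes whole 32-bit words until DLEN is covered,
i.e. ceil(DLEN / 4) words.  The helper name is hypothetical and is not
part of spi-bcm2835.c:

```c
#include <stddef.h>

/*
 * Hypothetical helper for illustration only.
 *
 * Assumption: DMA fills the TX FIFO in 32-bit words, so it writes
 * ceil(dlen / 4) words, while the chip clocks out exactly dlen bytes.
 * The difference is what stays behind in the TX FIFO and has to be
 * cleared before the next transfer.
 */
static size_t tx_fifo_leftover_bytes(size_t dlen)
{
	size_t words = (dlen + 3) / 4;	/* 32-bit FIFO writes, rounded up */

	return words * 4 - dlen;	/* bytes pushed but never clocked out */
}
```

Under that assumption, a 5-byte transfer leaves 3 stale bytes in the FIFO
and an 8-byte transfer leaves none.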

The inefficiency you're referring to only occurs if you have a transfer
that spans multiple non-contiguous pages and in the first page, it starts
at an offset that's not a multiple of 4.  In that case we transfer the
first few bytes of the first page via programmed I/O such that the offset
in the first page becomes a multiple of 4 and then we can switch to
transferring by DMA.  We handle that just fine since commit 3bd7f6589f67.
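
The split described above can be sketched as follows.  This only
illustrates the arithmetic; the struct and function names are made up
here, and the actual logic in the driver may well be structured
differently:

```c
#include <stddef.h>

/* Hypothetical types and names, for illustration only. */
struct first_page_split {
	size_t pio_bytes;	/* sent via programmed I/O up front */
	size_t dma_bytes;	/* remainder of the chunk, sent via DMA */
};

/*
 * Given the byte offset at which the transfer starts in its first page
 * and the number of bytes in that first chunk, work out how many bytes
 * go out via PIO so that the DMA part begins at a multiple of 4.
 */
static struct first_page_split split_first_page(size_t offset, size_t len)
{
	struct first_page_split s;

	s.pio_bytes = (4 - offset % 4) % 4;	/* 0 if already aligned */
	if (s.pio_bytes > len)
		s.pio_bytes = len;		/* chunk shorter than the gap */
	s.dma_bytes = len - s.pio_bytes;

	return s;
}
```

For example, a chunk that starts at offset 1 with 64 bytes would send
3 bytes by PIO and the remaining 61 by DMA, whereas a chunk starting at
offset 0 needs no PIO prologue at all.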

Note, most clients transfer bytes from a kmalloc'ed allocation and those
are always contiguous in memory, so the above is completely irrelevant
for them.  It's only relevant for vmalloc'ed allocations, which are
probably rare in SPI client drivers.

Thanks,

Lukas


