Re: edma: "3-byte" transfers and masked writes in general

On 10/03/2016 04:29 PM, Matthijs van Duin wrote:
> On 3 October 2016 at 14:03, Peter Ujfalusi <peter.ujfalusi@xxxxxx> wrote:
>> I bet that with off=2, len=2:
>>   00 00 10 11  ff ff ff ff  ff ff ff ff  ff ff ff ff
> 
> That was a pretty safe bet :-)

Hehe, yeah, it was ;)

>> What happens if you do the copy from RAM to RAM? Is the data correct in that
>> case?
> 
> Yes, and in fact I suspect ditto for most peripherals. I tested the
> ethernet dma descriptor memory because I already knew of its ill
> behaviour.

Or not: with most peripherals we use constant addressing on the IP side, and
the IP usually has a register which tells it the data type. This is the case
for McASP at least. With 24-bit data we might have 1 byte of 'garbage'
arriving at McASP, but it is going to ignore it.

What matters for us is that the eDMA itself can read and write memory at any
alignment, and this is what we advertise to clients via the DMAengine.

> Not supporting 16-bit writes even though most fields of the
> dma descriptors are 16-bit. Nicely done.

I'm sure this is not that unique :(

>> I guess if you change the offset in src it is behaving correctly?
> 
> It seems the data is rotated into place whenever it can get away with it:
> 
> 11 12 13 10  ff ff ff ff  ff ff ff ff  ff ff ff ff  (srcoff=1, dstoff=0, len=1..3)
> 11 12 13 14  ff ff ff ff  ff ff ff ff  ff ff ff ff  (srcoff=1, dstoff=0, len=4)
> 12 13 10 11  ff ff ff ff  ff ff ff ff  ff ff ff ff  (srcoff=2, dstoff=0, len=1..2)
> 12 13 14 15  ff ff ff ff  ff ff ff ff  ff ff ff ff  (srcoff=2, dstoff=0, len=3..4)
> 13 10 11 12  ff ff ff ff  ff ff ff ff  ff ff ff ff  (srcoff=3, dstoff=0, len=1)
> 13 14 15 16  ff ff ff ff  ff ff ff ff  ff ff ff ff  (srcoff=3, dstoff=0, len=2..4)

Thanks.
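
To make the pattern in the table above easier to check, here is a toy C model
that reproduces those six rows. It is only a fit to the observed bytes, not a
description of the actual TC or interconnect logic. It assumes the source
buffer holds 10 11 12 13 14 15 16 17, that only the aligned 32-bit source
words covering the requested range are fetched, and that the target ignores
byte enables and stores the whole rotated word. The name model_dst_word is
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (assumption, fitted to the table): predict the first 32-bit
 * word that lands at dstoff=0 for a transfer of 'len' bytes starting at
 * byte offset 'srcoff' in the source buffer.
 */
static void model_dst_word(const uint8_t *src, unsigned srcoff,
                           unsigned len, uint8_t out[4])
{
    assert(srcoff <= 3 && len >= 1 && len <= 4);

    /* Bytes fetched: whole aligned words covering [srcoff, srcoff + len). */
    unsigned fetched = ((srcoff + len + 3) / 4) * 4;   /* 4 or 8 */

    for (unsigned i = 0; i < 4; i++) {
        unsigned pos = srcoff + i;

        /* Past the fetched data: wrap around inside the last fetched
         * word, which is what shows up as a byte rotation. */
        if (pos >= fetched)
            pos -= 4;
        out[i] = src[pos];
    }
}
```

With that source buffer, srcoff=1, len=3 gives 11 12 13 10 and srcoff=1, len=4
gives 11 12 13 14, matching the first two rows.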

>> But based on this nice table the access to the CPPI TX/RX buffer descriptor
>> memory via eDMA is only valid with 32bit aligned 32bit word access.
> 
> Or 64-bit-aligned 64-bit, or 128-bit-aligned multiple-of-128-bit. And
> reads of any size and alignment.

Yep, the transfer needs to be aligned to a multiple of 32 bits in these cases.
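
As a rough sketch, the write cases listed in the quoted paragraph above could
be checked like this before handing a transfer to the DMA. The function name
desc_mem_write_ok is hypothetical, and the rule is just the empirical one from
this thread (32-bit-aligned 32-bit, 64-bit-aligned 64-bit, or 128-bit-aligned
multiple-of-128-bit writes), not anything taken from documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a write of 'len' bytes to address 'dst' falls into one
 * of the cases observed to work for the ethernet DMA descriptor memory.
 * Reads are not restricted, so this only applies to the destination side.
 */
static bool desc_mem_write_ok(uint32_t dst, uint32_t len)
{
    if (len == 4)                       /* 32-bit-aligned 32-bit write */
        return (dst % 4) == 0;
    if (len == 8)                       /* 64-bit-aligned 64-bit write */
        return (dst % 8) == 0;
    if (len != 0 && (len % 16) == 0)    /* 128-bit-aligned, n * 128 bits */
        return (dst % 16) == 0;
    return false;                       /* anything else needs masking */
}
```

A driver for such an IP could use a check of this kind to fall back to CPU
copies for the cases that would otherwise need a masked write.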

> This is generally what I'd expect for 32-bit targets if they don't
> support masked writes. That's why the hwspinlock module surprised me
> with the weird set of accesses it supports. Of course there's probably
> no sane reason why anyone would want to do edma from or to it, but if
> it can happen there it can happen elsewhere.

BTW: what happens if you do an unaligned copy to the ethernet dma descriptor
memory with the CPU? Does it go through fine, or does the same type of
corruption happen?

>> I checked again several eDMA3 documents and there is no word about alignment
>> restriction.
> 
> None of this is documented anywhere. Yay for empirical science :P

In any case I don't see this as an eDMA related issue; it is more of an SoC
internal behaviour/integration issue. If a driver for some IP faces a similar
issue, it is the IP driver's responsibility to use the DMA in a compatible way.

We cannot warn users about every non-32-bit-aligned use, since from the eDMA
point of view any alignment for src, dst or length is correct.

> From eDMA's point of view it's not a restriction since it's sending
> correct requests, technically. And to be fair I'm inclined to agree
> with it and consider the eth dma desc memory's behaviour to be a
> silicon bug. Assigning blame is ultimately not really useful though,
> the end result is still a usage restriction for edma from or to that
> target.
> 
>> What is the error you see in logs?
> 
> Whatever error I deliberately generate to dissect for study. E.g. a
> 3-byte transfer to 0x45000000 logs a failed 4-byte write. A 2-byte
> transfer to 0x4500001f logs a failed 32-byte write to 0x45000010.
> 
>> Can you print out the paRAM used for the memcopy? Probably two examples so I
>> can see what you are changing between different runs.
> 
> Just the simplest possible: src/dst/acnt set as desired, bcnt==1, no
> options set, stride irrelevant. I'm actually submitting it directly to
> the TC and polling for completion but I had already determined that
> there's no difference between transfer requests submitted this way vs
> submitted by the channel controller.

OK. In Linux we do not touch the TC; all setup is done via the CC, so we can
leverage the priorities and don't have the hassle of doing things manually via
the TC(s).
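
For reference, the minimal transfer described in the quoted paragraph above
would look roughly like this when expressed as a PaRAM set for the CC. This is
only a sketch: the struct and function names are made up, the field order
follows the usual eDMA3 PaRAM layout, and the NULL link (0xffff) and ccnt == 1
are my assumptions, since only src/dst/acnt and bcnt == 1 were stated.

```c
#include <assert.h>
#include <stdint.h>

/* One PaRAM set: eight 32-bit words. */
struct param_set {
    uint32_t opt;
    uint32_t src;
    uint32_t a_b_cnt;       /* BCNT in bits 31:16, ACNT in bits 15:0 */
    uint32_t dst;
    uint32_t src_dst_bidx;
    uint32_t link_bcntrld;  /* BCNTRLD in bits 31:16, LINK in bits 15:0 */
    uint32_t src_dst_cidx;
    uint32_t ccnt;
};

/* Hypothetical helper: fill a set for a single 1D copy of 'acnt' bytes. */
static struct param_set minimal_memcpy(uint32_t src, uint32_t dst,
                                       uint16_t acnt)
{
    assert(acnt != 0);

    struct param_set p = {
        .opt          = 0,                  /* no options set */
        .src          = src,
        .a_b_cnt      = (1u << 16) | acnt,  /* bcnt == 1 */
        .dst          = dst,
        .src_dst_bidx = 0,                  /* stride irrelevant, bcnt == 1 */
        .link_bcntrld = 0xffff,             /* assumption: NULL link */
        .src_dst_cidx = 0,
        .ccnt         = 1,                  /* assumption: single frame */
    };
    return p;
}
```

Between runs only src, dst and acnt would change; everything else stays as
above.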

-- 
Péter