Re: [PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

On 24/11/2023 1:45 pm, Jason Gunthorpe wrote:
On Fri, Nov 24, 2023 at 12:58:11PM +0000, Robin Murphy wrote:
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 3b694511b98f..73ab91913790 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -135,6 +135,26 @@ extern void __memset_io(volatile void __iomem *, int, size_t);
   #define memcpy_fromio(a,c,l)	__memcpy_fromio((a),(c),(l))
   #define memcpy_toio(c,a,l)	__memcpy_toio((c),(a),(l))
+static inline void __memcpy_toio_64(volatile void __iomem *to, const void *from)
+{
+	const u64 *from64 = from;
+
+	/*
+	 * Newer ARM cores have sensitive write-combining buffers; it is
+	 * important that the stores be contiguous blocks of store instructions.
+	 * Normal memcpy does not work reliably.
+	 */
+	asm volatile("stp %x0, %x1, [%8, #16 * 0]\n"
+		     "stp %x2, %x3, [%8, #16 * 1]\n"
+		     "stp %x4, %x5, [%8, #16 * 2]\n"
+		     "stp %x6, %x7, [%8, #16 * 3]\n"
+		     :
+		     : "rZ"(from64[0]), "rZ"(from64[1]), "rZ"(from64[2]),
+		       "rZ"(from64[3]), "rZ"(from64[4]), "rZ"(from64[5]),
+		       "rZ"(from64[6]), "rZ"(from64[7]), "r"(to));

Is this correct for big-endian? LDP/STP are kinda tricksy in that regard.

Uh.. I didn't think about it at all..

By no means do I have any skill reading the ARM documents, but I think
it is OK. The STP pseudocode says:

Mem[address, dbytes, AccType_NORMAL] = data1;
Mem[address+dbytes, dbytes, AccType_NORMAL] = data2;

So I understand that as

Mem[%8 + 16 * 0, 8, AccType_NORMAL] = from64[0]
Mem[%8 + 16 * 0 + 8, 8, AccType_NORMAL] = from64[1]
Mem[%8 + 16 * 1, 8, AccType_NORMAL] = from64[2]
Mem[%8 + 16 * 1 + 8, 8, AccType_NORMAL] = from64[3]
..

Which is the same on BE/LE?

But I don't know the pitfall to watch for here. This is memcpy, so we
don't have to swap; the order of the bits in the register doesn't
matter.
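
As a rough illustration of the point above (not part of the patch), the
following plain C sketch copies 64 bytes as eight 64-bit words and checks
the result against a bytewise memcpy(). The names copy64_words() and
copy64_matches_memcpy() are made up for this example, and ordinary C loads
and stores stand in for the STP sequence, so it says nothing about write
combining. Because each word is stored with the same endianness it was
loaded with, the check should hold on both LE and BE hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the four STPs: eight 64-bit loads, each
 * followed by a 64-bit store at byte offsets 0, 8, 16, ... 56.
 */
static void copy64_words(uint64_t *to, const uint64_t *from)
{
	int i;

	assert(to != NULL && from != NULL);
	for (i = 0; i < 8; i++)
		to[i] = from[i];	/* word i lands at byte offset 8 * i */
}

/* Returns 1 if the word-wise copy is byte-identical to memcpy(). */
static int copy64_matches_memcpy(void)
{
	uint64_t src[8], dst[8], ref[8];
	unsigned char *bytes = (unsigned char *)src;
	int i;

	/* Fill the source with a distinct value per byte: 1, 2, ... 64. */
	for (i = 0; i < 64; i++)
		bytes[i] = (unsigned char)(i + 1);

	memset(dst, 0, sizeof(dst));
	copy64_words(dst, src);
	memcpy(ref, src, sizeof(ref));

	return memcmp(dst, ref, sizeof(ref)) == 0;
}
```

Calling copy64_matches_memcpy() from a small test harness is expected to
return 1 regardless of the host's byte order, since no value is ever
interpreted, only moved.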

Indeed, you're right. All the way back to Armv7 LDRD/STRD, I always get caught out by remembering the path which does an endian-dependent swap of the target registers, while forgetting that the swap is there to *counteract* the byteswap in Mem[] itself.

Cheers,
Robin.



