Re: [PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

Jason Gunthorpe <jgg@xxxxxxxxxx> · Fri, 24 Nov 2023 10:20:49 -0400

On Fri, Nov 24, 2023 at 03:10:29PM +0100, Niklas Schnelle wrote:

> What's the reasoning behind not using the existing memcpy_toio()
> here?

Going forward, CPUs are implementing an instruction that does a 64-byte
aligned store; this is a wrapper for exactly that operation.

memcpy_toio() is much more general: it allows unaligned buffers and
lengths that are not multiples of 64. Adapting the general version to
generate the optimized version in the cases where it can is complex and
has a codegen penalty.
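
To make the contrast concrete, here is a rough userspace sketch, not the
patch itself. The names sketch_copy_toio_64() and sketch_copy_toio() are
made up for illustration, and plain volatile stores stand in for real MMIO
accessors. The fixed-size helper has no length or alignment branches, so a
compiler or arch override can map it onto one 64-byte store; the general
copy has to carry head/body/tail handling for every caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the fixed-size helper: always exactly
 * 64 bytes, destination assumed to be 64-byte aligned. An architecture
 * with a 64-byte store instruction could replace the loop with it.
 */
static void sketch_copy_toio_64(volatile void *to, const void *from)
{
	volatile uint64_t *dst = to;
	const uint8_t *src = from;

	/* The contract of this sketch: caller guarantees alignment. */
	assert(((uintptr_t)to & 63) == 0);

	for (int i = 0; i < 8; i++) {
		uint64_t v;

		memcpy(&v, src + 8 * i, sizeof(v)); /* source may be unaligned */
		dst[i] = v;                         /* eight 64-bit stores */
	}
}

/*
 * Hypothetical stand-in for a general copy: any destination alignment,
 * any length. The extra loops are the generality the fixed-size helper
 * does not need.
 */
static void sketch_copy_toio(volatile void *to, const void *from, size_t n)
{
	volatile uint8_t *dst = to;
	const uint8_t *src = from;

	/* Head: byte stores until the destination is 8-byte aligned. */
	while (n && ((uintptr_t)dst & 7)) {
		*dst++ = *src++;
		n--;
	}

	/* Body: 64-bit stores while at least 8 bytes remain. */
	while (n >= 8) {
		uint64_t v;

		memcpy(&v, src, sizeof(v));
		*(volatile uint64_t *)dst = v;
		dst += 8;
		src += 8;
		n -= 8;
	}

	/* Tail: remaining bytes. */
	while (n) {
		*dst++ = *src++;
		n--;
	}
}
```

Both produce the same bytes for an aligned 64-byte copy; the difference is
only in how much the compiler has to prove before it can emit the single
wide store.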

> For s390 the above generic variant would do 8 of our special PCI store
> instructions while memcpy_toio() is defined to zpci_memcpy_toio() which
> can do the same as a single PCI store block instruction. Now of course
> we could provide our own memcpy_toio_64() but that would end up the
> same as just doing memcpy_toio(addr, buffer, 64) here.

This is probably better?

Jason