Re: [PATCH 1/4] Consolidate __memcpy_{to,from}io and __memset_io into a single lib

"Arnd Bergmann" <arnd@xxxxxxxx> · Mon, 09 Sep 2024 10:35:58 +0000

On Fri, Sep 6, 2024, at 11:41, Julian Vetter wrote:
> Various architectures have almost the same implementations for
> __memcpy_{to,from}io and __memset_io functions. So, consolidate them and
> introduce a CONFIG_GENERIC_IO flag to build the given lib/io.c.
>
> Reviewed-by: Yann Sionneau <ysionneau@xxxxxxxxxxxxx>
> Signed-off-by: Julian Vetter <jvetter@xxxxxxxxxxxxx>

Thanks for working on this. The implementation looks correct
to me and we should be able to use this on most architectures,
but since this is a shared file, it would be good to make it
as polished as possible.

>  lib/Kconfig  |   3 ++
>  lib/Makefile |   2 +
>  lib/io.c     | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 106 insertions(+)
>  create mode 100644 lib/io.c

I feel the name is a little too geneal, maybe use io_copy.c and
CONFIG_GENERIC_IO_COPY for the name?

Alternatively, this could be part of lib/iomap_copy.c,
which already has some related helper functions. In that
case, I think we would rely on the per-architecture "#define
__memcpy_fromio __memcpy_fromio" etc macros instead of a
Kconfig symbol.

> +void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
> +{
> +	while (count && !IS_ALIGNED((unsigned long)from, NATIVE_STORE_SIZE)) {
> +		*(u8 *)to = __raw_readb(from);
> +		from++;
> +		to++;
> +		count--;
> +	}

There is a corner case that this bit actually misses (same in the
existing implementation): If the __iomem buffer is aligned, but
the 'to' buffer is not, this may cause an alignment fault on
architectures without native unaligned stores. At the moment
I think both csky and some configurations of loongarch64 can
run into this. I think the best way to deal with this is to
use get_unaligned()/put_unaligned() in place of the pointer
dereference. This has no effect on architectures that have
native unaligned access but otherwise uses byte access on the
kernel buffer, which I think is a good tradeoff.

> +	while (count >= NATIVE_STORE_SIZE) {
> +		*(uintptr_t *)to = __raw_read_native(from);
> +		from += NATIVE_STORE_SIZE;
> +		to += NATIVE_STORE_SIZE;
> +		count -= NATIVE_STORE_SIZE;
> +	}

The use of __raw_read_native() somehow bothers me, it seems
a little more complicated than open-coding the two
possibles paths, or even using the aligned-access
helpers like

      if (IS_ENABLED(CONFIG_64BIT))
             __iowrite64_copy(to, from, count & WORD_MASK)
      else
             __iowrite32_copy(to, from, count & WORD_MASK)

Those helpers do require the kernel buffer to be naturally
aligned though.

> +void __memset_io(volatile void __iomem *dst, int c, size_t count)
> +{
> +	uintptr_t qc = (u8)c;
> +
> +	qc |= qc << 8;
> +	qc |= qc << 16;
> +#if IS_ENABLED(CONFIG_64BIT)
> +	qc |= qc << 32;
> +#endif

If you use IS_ENABLED() here, please do it like

       if (IS_ENABLED(CONFIG_64BIT)
              qc |= qc << 32;

      Arnd